public inbox for libstdc++@gcc.gnu.org
From: Matthias Kretz <m.kretz@gsi.de>
To: Thomas Rodgers <trodgers@redhat.com>,
	libstdc++ <libstdc++@gcc.gnu.org>,
	Gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] std::experimental::simd
Date: Fri, 8 May 2020 21:03:11 +0200
Message-ID: <33105491.xCRyjBS7g1@excalibur>
In-Reply-To: <xkqeo8qyl8y8.fsf@trodgers.remote>

[-- Attachment #1: Type: text/plain, Size: 797 bytes --]

Here's my last update to the std::experimental::simd patch. It's currently 
based on the gcc-10 branch.
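
One rule from the documentation part of the patch, sketched standalone:
memory_alignment<T, U>::value is documented as sizeof(U) * T::size() rounded
up to the next power of two. The sketch below is modeled on the
__next_power_of_2 helper in bits/simd.h. The names next_power_of_2 and
alignment_for are hypothetical and exist only for this illustration; the
float and double results assume the usual 4-byte and 8-byte sizes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical standalone copy of the rounding logic; the patch's own
// helper is __next_power_of_2 in bits/simd.h.
constexpr std::size_t
next_power_of_2(std::size_t x)
{
  // A power of two has exactly one bit set, so x & (x - 1) is zero for it.
  // Otherwise smear the top bit one position down, add one, and retry.
  return (x & (x - 1)) == 0 ? x : next_power_of_2((x | (x >> 1)) + 1);
}

// Hypothetical helper mirroring the documented rule for
// memory_alignment<T, U>::value, given the element count of the simd type.
template <typename U>
constexpr std::size_t
alignment_for(std::size_t simd_size)
{
  return next_power_of_2(sizeof(U) * simd_size);
}

// Compile-time checks (assume sizeof(float) == 4 and sizeof(double) == 8).
static_assert(next_power_of_2(16) == 16);      // already a power of two
static_assert(next_power_of_2(12) == 16);      // rounded up
static_assert(alignment_for<float>(3) == 16);  // 12 bytes -> 16
static_assert(alignment_for<double>(4) == 32); // 32 bytes stay 32
```

With this rule a three-element float simd asks for 16-byte alignment for
vector_aligned loads and stores, even though it only holds 12 bytes.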

Cheers,
  Matthias

-- 
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──────────────────────────────────────────────────────────────────────────

[-- Attachment #2: simd.patch --]
[-- Type: text/x-patch, Size: 1583956 bytes --]

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index 0f03126db1c..c7ac33faaf5 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -2869,6 +2869,17 @@ since C++14 and the implementation is complete.
       <entry>Library Fundamentals 2 TS</entry>
     </row>
 
+    <row>
+      <entry>
+	<link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0214r9.pdf">
+	  P0214R9
+	</link>
+      </entry>
+      <entry>Data-Parallel Types</entry>
+      <entry>Y</entry>
+      <entry>Parallelism 2 TS</entry>
+    </row>
+
   </tbody>
 </tgroup>
 </table>
@@ -3014,6 +3025,185 @@ since C++14 and the implementation is complete.
       If <code>!is_regular_file(p)</code>, an error is reported.
    </para>
 
+   <section xml:id="iso.2017.par2ts" xreflabel="Implementation Specific Behavior of the Parallelism 2 TS"><info><title>Parallelism 2 TS</title></info>
+
+     <para>
+        <emphasis>9.3 [parallel.simd.abi]</emphasis>
+        <code>max_fixed_size&lt;T&gt;</code> is 32, except when targeting
+        AVX512BW and <code>sizeof(T)</code> is 1.
+     </para>
+
+     <para>
+        When targeting 32-bit x86,
+        <classname>simd_abi::compatible&lt;T&gt;</classname> is an alias for
+        <classname>simd_abi::scalar</classname>. When targeting 64-bit x86
+        (including x32), <classname>simd_abi::compatible&lt;T&gt;</classname> is
+        an alias for <classname>simd_abi::_VecBuiltin&lt;16&gt;</classname>,
+        unless <code>T</code> is <code>long double</code>, in which case it is
+        an alias for <classname>simd_abi::scalar</classname>.
+     </para>
+
+     <para>
+        When targeting x86 (both 32-bit and 64-bit),
+        <classname>simd_abi::native&lt;T&gt;</classname> is an alias for one of
+        <classname>simd_abi::_VecBuiltin&lt;16&gt;</classname>,
+        <classname>simd_abi::_VecBuiltin&lt;32&gt;</classname>, or
+        <classname>simd_abi::_VecBltnBtmsk&lt;64&gt;</classname>, depending on
+        the machine options the compiler was invoked with.
+     </para>
+
+     <para>
+        For any other targeted machine,
+        <classname>simd_abi::compatible&lt;T&gt;</classname> and
+        <classname>simd_abi::native&lt;T&gt;</classname> are aliases for
+        <classname>simd_abi::scalar</classname> (subject to change).
+     </para>
+
+     <para>
+        The extended ABI tag types defined in the
+        <code>std::experimental::parallelism_v2::simd_abi</code> namespace are
+        <classname>simd_abi::_VecBuiltin&lt;Bytes&gt;</classname> and
+        <classname>simd_abi::_VecBltnBtmsk&lt;Bytes&gt;</classname>.
+     </para>
+
+     <para>
+        <classname>simd_abi::deduce&lt;T, N, Abis...&gt;::type</classname>,
+        with <code>N &gt; 1</code>, is an alias for an extended ABI tag if a
+        supported extended ABI tag exists. Otherwise it is an alias for
+        <classname>simd_abi::fixed_size&lt;N&gt;</classname>. The
+        <classname>simd_abi::_VecBltnBtmsk</classname> ABI tag is preferred over
+        <classname>simd_abi::_VecBuiltin</classname>.
+     </para>
+
+     <para>
+        <emphasis>9.4 [parallel.simd.traits]</emphasis>
+        <classname>memory_alignment&lt;T, U&gt;::value</classname> is
+        <code>sizeof(U) * T::size()</code> rounded up to the next power-of-two
+        value.
+     </para>
+
+     <para>
+        <emphasis>9.6.1 [parallel.simd.overview]</emphasis>
+        On ARM, <classname>simd&lt;T, _VecBuiltin&lt;Bytes&gt;&gt;</classname>
+        is supported if <code>__ARM_NEON</code> is defined and
+        <code>sizeof(T) &lt;= 4</code>. Additionally,
+        <code>sizeof(T) == 8</code> with integral <code>T</code> is supported if
+        <code>__ARM_ARCH &gt;= 8</code>, and <code>double</code> is supported if
+        <code>__aarch64__</code> is defined.
+        On x86, given an extended ABI tag <code>Abi</code>,
+        <classname>simd&lt;T, Abi&gt;</classname> is supported according to the
+        following table:
+        <table frame="all" xml:id="table.par2ts_simd_support">
+          <title>Support for Extended ABI Tags</title>
+
+          <tgroup cols="4" align="left" colsep="0" rowsep="1">
+          <colspec colname="c1"/>
+          <colspec colname="c2"/>
+          <colspec colname="c3"/>
+          <colspec colname="c4"/>
+            <thead>
+              <row>
+                <entry>ABI tag <code>Abi</code></entry>
+                <entry>value type <code>T</code></entry>
+                <entry>values for <code>Bytes</code></entry>
+                <entry>required machine option</entry>
+              </row>
+            </thead>
+
+            <tbody>
+              <row>
+                <entry morerows="5">
+                  <classname>_VecBuiltin&lt;Bytes&gt;</classname>
+                </entry>
+                <entry morerows="1"><code>float</code></entry>
+                <entry>8, 12, 16</entry>
+                <entry>"-msse"</entry>
+              </row>
+
+              <row>
+                <entry>20, 24, 28, 32</entry>
+                <entry>"-mavx"</entry>
+              </row>
+
+              <row>
+                <entry morerows="1"><code>double</code></entry>
+                <entry>16</entry>
+                <entry>"-msse2"</entry>
+              </row>
+
+              <row>
+                <entry>24, 32</entry>
+                <entry>"-mavx"</entry>
+              </row>
+
+              <row>
+                <entry morerows="1">
+                  integral types other than <code>bool</code>
+                </entry>
+                <entry>
+                  <code>Bytes</code> ≤ 16 and <code>Bytes</code> divisible by
+                  <code>sizeof(T)</code>
+                </entry>
+                <entry>"-msse2"</entry>
+              </row>
+
+              <row>
+                <entry>
+                  16 &lt; <code>Bytes</code> ≤ 32 and <code>Bytes</code>
+                  divisible by <code>sizeof(T)</code>
+                </entry>
+                <entry>"-mavx2"</entry>
+              </row>
+
+              <row>
+                <entry morerows="1">
+                  <classname>_VecBuiltin&lt;Bytes&gt;</classname> and
+                  <classname>_VecBltnBtmsk&lt;Bytes&gt;</classname>
+                </entry>
+                <entry>
+                  vectorizable types with <code>sizeof(T)</code> ≥ 4
+                </entry>
+                <entry morerows="1">
+                  32 &lt; <code>Bytes</code> ≤ 64 and <code>Bytes</code>
+                  divisible by <code>sizeof(T)</code>
+                </entry>
+                <entry>"-mavx512f"</entry>
+              </row>
+
+              <row>
+                <entry>
+                  vectorizable types with <code>sizeof(T)</code> &lt; 4
+                </entry>
+                <entry>"-mavx512bw"</entry>
+              </row>
+
+              <row>
+                <entry morerows="1">
+                  <classname>_VecBltnBtmsk&lt;Bytes&gt;</classname>
+                </entry>
+                <entry>
+                  vectorizable types with <code>sizeof(T)</code> ≥ 4
+                </entry>
+                <entry morerows="1">
+                  <code>Bytes</code> ≤ 32 and <code>Bytes</code> divisible by
+                  <code>sizeof(T)</code>
+                </entry>
+                <entry>"-mavx512vl"</entry>
+              </row>
+
+              <row>
+                <entry>
+                  vectorizable types with <code>sizeof(T)</code> &lt; 4
+                </entry>
+                <entry>"-mavx512bw" and "-mavx512vl"</entry>
+              </row>
+
+            </tbody>
+          </tgroup>
+        </table>
+     </para>
+
+   </section>
 
 </section>
 
diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 80aeb3f8959..d1c870f620c 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -734,6 +734,7 @@ experimental_headers = \
 	${experimental_srcdir}/ratio \
 	${experimental_srcdir}/regex \
 	${experimental_srcdir}/set \
+	${experimental_srcdir}/simd \
 	${experimental_srcdir}/socket \
 	${experimental_srcdir}/source_location \
 	${experimental_srcdir}/string \
@@ -754,6 +755,16 @@ experimental_bits_headers = \
 	${experimental_bits_srcdir}/lfts_config.h \
 	${experimental_bits_srcdir}/net.h \
 	${experimental_bits_srcdir}/shared_ptr.h \
+	${experimental_bits_srcdir}/simd.h \
+	${experimental_bits_srcdir}/simd_builtin.h \
+	${experimental_bits_srcdir}/simd_converter.h \
+	${experimental_bits_srcdir}/simd_detail.h \
+	${experimental_bits_srcdir}/simd_fixed_size.h \
+	${experimental_bits_srcdir}/simd_math.h \
+	${experimental_bits_srcdir}/simd_neon.h \
+	${experimental_bits_srcdir}/simd_scalar.h \
+	${experimental_bits_srcdir}/simd_x86.h \
+	${experimental_bits_srcdir}/simd_x86_conversions.h \
 	${experimental_bits_srcdir}/string_view.tcc \
 	${experimental_bits_filesystem_headers}
 
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index eb437ad8d8d..686331fd15c 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -1079,6 +1079,7 @@ experimental_headers = \
 	${experimental_srcdir}/ratio \
 	${experimental_srcdir}/regex \
 	${experimental_srcdir}/set \
+	${experimental_srcdir}/simd \
 	${experimental_srcdir}/socket \
 	${experimental_srcdir}/source_location \
 	${experimental_srcdir}/string \
@@ -1099,6 +1100,16 @@ experimental_bits_headers = \
 	${experimental_bits_srcdir}/lfts_config.h \
 	${experimental_bits_srcdir}/net.h \
 	${experimental_bits_srcdir}/shared_ptr.h \
+	${experimental_bits_srcdir}/simd.h \
+	${experimental_bits_srcdir}/simd_builtin.h \
+	${experimental_bits_srcdir}/simd_converter.h \
+	${experimental_bits_srcdir}/simd_detail.h \
+	${experimental_bits_srcdir}/simd_fixed_size.h \
+	${experimental_bits_srcdir}/simd_math.h \
+	${experimental_bits_srcdir}/simd_neon.h \
+	${experimental_bits_srcdir}/simd_scalar.h \
+	${experimental_bits_srcdir}/simd_x86.h \
+	${experimental_bits_srcdir}/simd_x86_conversions.h \
 	${experimental_bits_srcdir}/string_view.tcc \
 	${experimental_bits_filesystem_headers}
 
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
new file mode 100644
index 00000000000..298ff5957a1
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -0,0 +1,5031 @@
+// Definition of the public simd interfaces -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_H
+#define _GLIBCXX_EXPERIMENTAL_SIMD_H
+
+#if __cplusplus >= 201703L
+
+#include "simd_detail.h"
+#include <bitset>
+#include <climits>
+#include <cstring>
+#include <functional>
+#include <iosfwd>
+#include <limits>
+#include <utility>
+
+#if _GLIBCXX_SIMD_X86INTRIN
+#include <x86intrin.h>
+#elif _GLIBCXX_SIMD_HAVE_NEON
+#include <arm_neon.h>
+#endif
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+
+#if !_GLIBCXX_SIMD_X86INTRIN
+using __m128  [[__gnu__::__vector_size__(16)]] = float;
+using __m128d [[__gnu__::__vector_size__(16)]] = double;
+using __m128i [[__gnu__::__vector_size__(16)]] = long long;
+using __m256  [[__gnu__::__vector_size__(32)]] = float;
+using __m256d [[__gnu__::__vector_size__(32)]] = double;
+using __m256i [[__gnu__::__vector_size__(32)]] = long long;
+using __m512  [[__gnu__::__vector_size__(64)]] = float;
+using __m512d [[__gnu__::__vector_size__(64)]] = double;
+using __m512i [[__gnu__::__vector_size__(64)]] = long long;
+#endif
+
+// __next_power_of_2{{{
+/**
+ * \internal
+ * Returns the next power of 2 larger than or equal to \p __x.
+ */
+constexpr std::size_t
+__next_power_of_2(std::size_t __x)
+{
+  return (__x & (__x - 1)) == 0 ? __x
+				: __next_power_of_2((__x | (__x >> 1)) + 1);
+}
+
+// }}}
+namespace simd_abi {
+// {{{
+// implementation details:
+struct _Scalar;
+template <int _Np> struct _Fixed;
+
+// There are two major ABIs that appear on different architectures.
+// Both have non-boolean values packed into an N Byte register
+// -> #elements = N / sizeof(T)
+// Masks differ:
+// 1. Use value vector registers for masks (all 0 or all 1)
+// 2. Use bitmasks (mask registers) with one bit per value in the corresponding
+//    value vector
+//
+// Both can be partially used, masking off the rest when doing horizontal
+// operations or operations that can trap (e.g. FP_INVALID or integer division
+// by 0). This is encoded as the number of used bytes.
+template <int _UsedBytes> struct _VecBuiltin;
+template <int _UsedBytes> struct _VecBltnBtmsk;
+
+template <typename _Tp, int _Np> using _VecN = _VecBuiltin<sizeof(_Tp) * _Np>;
+
+template <int _UsedBytes = 16> using _Sse = _VecBuiltin<_UsedBytes>;
+template <int _UsedBytes = 32> using _Avx = _VecBuiltin<_UsedBytes>;
+template <int _UsedBytes = 64> using _Avx512 = _VecBltnBtmsk<_UsedBytes>;
+template <int _UsedBytes = 16> using _Neon = _VecBuiltin<_UsedBytes>;
+
+// implementation-defined:
+using __sse = _Sse<>;
+using __avx = _Avx<>;
+using __avx512 = _Avx512<>;
+using __neon = _Neon<>;
+
+using __neon128 = _Neon<16>;
+using __neon64 = _Neon<8>;
+
+// standard:
+template <typename _Tp, size_t _Np, typename...> struct deduce;
+template <int _Np> using fixed_size = _Fixed<_Np>;
+using scalar = _Scalar;
+// }}}
+} // namespace simd_abi
+// forward declarations is_simd(_mask), simd(_mask), simd_size {{{
+template <typename _Tp> struct is_simd;
+template <typename _Tp> struct is_simd_mask;
+template <typename _Tp, typename _Abi> class simd;
+template <typename _Tp, typename _Abi> class simd_mask;
+template <typename _Tp, typename _Abi> struct simd_size;
+// }}}
+// load/store flags {{{
+struct element_aligned_tag
+{
+};
+struct vector_aligned_tag
+{
+};
+template <size_t _Np> struct overaligned_tag
+{
+  static constexpr size_t _S_alignment = _Np;
+};
+inline constexpr element_aligned_tag element_aligned = {};
+inline constexpr vector_aligned_tag vector_aligned = {};
+template <size_t _Np> inline constexpr overaligned_tag<_Np> overaligned = {};
+// }}}
+
+// vvv ---- type traits ---- vvv
+// integer type aliases{{{
+using _UChar = unsigned char;
+using _SChar = signed char;
+using _UShort = unsigned short;
+using _UInt = unsigned int;
+using _ULong = unsigned long;
+using _ULLong = unsigned long long;
+using _LLong = long long;
+//}}}
+// __identity/__id{{{
+template <typename _Tp> struct __identity
+{
+  using type = _Tp;
+};
+template <typename _Tp> using __id = typename __identity<_Tp>::type;
+
+// }}}
+// __first_of_pack{{{
+template <typename _T0, typename...> struct __first_of_pack
+{
+  using type = _T0;
+};
+template <typename... _Ts>
+using __first_of_pack_t = typename __first_of_pack<_Ts...>::type;
+
+//}}}
+// __value_type_or_identity_t {{{
+template <typename _Tp>
+typename _Tp::value_type
+__value_type_or_identity_impl(int);
+template <typename _Tp>
+_Tp
+__value_type_or_identity_impl(float);
+template <typename _Tp>
+using __value_type_or_identity_t
+  = decltype(__value_type_or_identity_impl<_Tp>(int()));
+
+// }}}
+// __is_vectorizable {{{
+template <typename _Tp>
+struct __is_vectorizable : public std::is_arithmetic<_Tp>
+{
+};
+template <> struct __is_vectorizable<bool> : public false_type
+{
+};
+template <typename _Tp>
+inline constexpr bool __is_vectorizable_v = __is_vectorizable<_Tp>::value;
+// Deduces to a vectorizable type
+template <typename _Tp, typename = enable_if_t<__is_vectorizable_v<_Tp>>>
+using _Vectorizable = _Tp;
+
+// }}}
+// _LoadStorePtr / __is_possible_loadstore_conversion {{{
+template <typename _Ptr, typename _ValueType>
+struct __is_possible_loadstore_conversion
+  : conjunction<__is_vectorizable<_Ptr>, __is_vectorizable<_ValueType>>
+{
+};
+template <> struct __is_possible_loadstore_conversion<bool, bool> : true_type
+{
+};
+// Deduces to a type allowed for load/store with the given value type.
+template <typename _Ptr, typename _ValueType,
+	  typename = enable_if_t<
+	    __is_possible_loadstore_conversion<_Ptr, _ValueType>::value>>
+using _LoadStorePtr = _Ptr;
+
+// }}}
+// _SizeConstant{{{
+template <size_t _X> using _SizeConstant = integral_constant<size_t, _X>;
+// }}}
+// __is_bitmask{{{
+template <typename _Tp, typename = std::void_t<>>
+struct __is_bitmask : false_type
+{
+};
+template <typename _Tp>
+inline constexpr bool __is_bitmask_v = __is_bitmask<_Tp>::value;
+
+// the __mmaskXX case:
+template <typename _Tp>
+struct __is_bitmask<_Tp, std::void_t<decltype(std::declval<unsigned&>()
+					      = std::declval<_Tp>() & 1u)>>
+  : true_type
+{
+};
+
+// }}}
+// __int_for_sizeof{{{
+template <size_t> struct __int_for_sizeof;
+template <> struct __int_for_sizeof<1>
+{
+  using type = signed char;
+  static_assert(sizeof(type) == 1);
+};
+template <> struct __int_for_sizeof<2>
+{
+  using type = signed short;
+  static_assert(sizeof(type) == 2);
+};
+template <> struct __int_for_sizeof<4>
+{
+  using type = signed int;
+  static_assert(sizeof(type) == 4);
+};
+template <> struct __int_for_sizeof<8>
+{
+  using type = signed long long;
+  static_assert(sizeof(type) == 8);
+};
+#ifdef __SIZEOF_INT128__
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wpedantic"
+template <> struct __int_for_sizeof<16>
+{
+  using type = __int128;
+  static_assert(sizeof(type) == 16);
+};
+#pragma GCC diagnostic pop
+#endif // __SIZEOF_INT128__
+template <typename _Tp>
+using __int_for_sizeof_t = typename __int_for_sizeof<sizeof(_Tp)>::type;
+template <size_t _Np>
+using __int_with_sizeof_t = typename __int_for_sizeof<_Np>::type;
+
+// }}}
+// __is_fixed_size_abi{{{
+template <typename _Tp> struct __is_fixed_size_abi : false_type
+{
+};
+template <int _Np>
+struct __is_fixed_size_abi<simd_abi::fixed_size<_Np>> : true_type
+{
+};
+
+template <typename _Tp>
+inline constexpr bool __is_fixed_size_abi_v = __is_fixed_size_abi<_Tp>::value;
+
+// }}}
+// constexpr feature detection{{{
+constexpr inline bool __have_mmx = _GLIBCXX_SIMD_HAVE_MMX;
+constexpr inline bool __have_sse = _GLIBCXX_SIMD_HAVE_SSE;
+constexpr inline bool __have_sse2 = _GLIBCXX_SIMD_HAVE_SSE2;
+constexpr inline bool __have_sse3 = _GLIBCXX_SIMD_HAVE_SSE3;
+constexpr inline bool __have_ssse3 = _GLIBCXX_SIMD_HAVE_SSSE3;
+constexpr inline bool __have_sse4_1 = _GLIBCXX_SIMD_HAVE_SSE4_1;
+constexpr inline bool __have_sse4_2 = _GLIBCXX_SIMD_HAVE_SSE4_2;
+constexpr inline bool __have_xop = _GLIBCXX_SIMD_HAVE_XOP;
+constexpr inline bool __have_avx = _GLIBCXX_SIMD_HAVE_AVX;
+constexpr inline bool __have_avx2 = _GLIBCXX_SIMD_HAVE_AVX2;
+constexpr inline bool __have_bmi = _GLIBCXX_SIMD_HAVE_BMI1;
+constexpr inline bool __have_bmi2 = _GLIBCXX_SIMD_HAVE_BMI2;
+constexpr inline bool __have_lzcnt = _GLIBCXX_SIMD_HAVE_LZCNT;
+constexpr inline bool __have_sse4a = _GLIBCXX_SIMD_HAVE_SSE4A;
+constexpr inline bool __have_fma = _GLIBCXX_SIMD_HAVE_FMA;
+constexpr inline bool __have_fma4 = _GLIBCXX_SIMD_HAVE_FMA4;
+constexpr inline bool __have_f16c = _GLIBCXX_SIMD_HAVE_F16C;
+constexpr inline bool __have_popcnt = _GLIBCXX_SIMD_HAVE_POPCNT;
+constexpr inline bool __have_avx512f = _GLIBCXX_SIMD_HAVE_AVX512F;
+constexpr inline bool __have_avx512dq = _GLIBCXX_SIMD_HAVE_AVX512DQ;
+constexpr inline bool __have_avx512vl = _GLIBCXX_SIMD_HAVE_AVX512VL;
+constexpr inline bool __have_avx512bw = _GLIBCXX_SIMD_HAVE_AVX512BW;
+constexpr inline bool __have_avx512dq_vl = __have_avx512dq && __have_avx512vl;
+constexpr inline bool __have_avx512bw_vl = __have_avx512bw && __have_avx512vl;
+
+constexpr inline bool __have_neon = _GLIBCXX_SIMD_HAVE_NEON;
+constexpr inline bool __have_neon_a32 = _GLIBCXX_SIMD_HAVE_NEON_A32;
+constexpr inline bool __have_neon_a64 = _GLIBCXX_SIMD_HAVE_NEON_A64;
+
+#ifdef __POWER9_VECTOR__
+constexpr inline bool __have_power9vec = true;
+#else
+constexpr inline bool __have_power9vec = false;
+#endif
+#if defined __POWER8_VECTOR__
+constexpr inline bool __have_power8vec = true;
+#else
+constexpr inline bool __have_power8vec = __have_power9vec;
+#endif
+#if defined __VSX__
+constexpr inline bool __have_power_vsx = true;
+#else
+constexpr inline bool __have_power_vsx = __have_power8vec;
+#endif
+#if defined __ALTIVEC__
+constexpr inline bool __have_power_vmx = true;
+#else
+constexpr inline bool __have_power_vmx = __have_power_vsx;
+#endif
+
+// }}}
+// __is_scalar_abi {{{
+template <typename _Abi>
+constexpr bool
+__is_scalar_abi()
+{
+  return std::is_same_v<simd_abi::scalar, _Abi>;
+}
+
+// }}}
+// __abi_bytes_v {{{
+template <template <int> class _Abi, int _Bytes>
+constexpr int
+__abi_bytes_impl(_Abi<_Bytes>*)
+{
+  return _Bytes;
+}
+template <typename _Tp>
+constexpr int
+__abi_bytes_impl(_Tp*)
+{
+  return -1;
+}
+template <typename _Abi>
+inline constexpr int __abi_bytes_v
+  = __abi_bytes_impl(static_cast<_Abi*>(nullptr));
+
+// }}}
+// __is_builtin_bitmask_abi {{{
+template <typename _Abi>
+constexpr bool
+__is_builtin_bitmask_abi()
+{
+  return std::is_same_v<simd_abi::_VecBltnBtmsk<__abi_bytes_v<_Abi>>, _Abi>;
+}
+
+// }}}
+// __is_sse_abi {{{
+template <typename _Abi>
+constexpr bool
+__is_sse_abi()
+{
+  constexpr auto _Bytes = __abi_bytes_v<_Abi>;
+  return _Bytes <= 16 && std::is_same_v<simd_abi::_VecBuiltin<_Bytes>, _Abi>;
+}
+
+// }}}
+// __is_avx_abi {{{
+template <typename _Abi>
+constexpr bool
+__is_avx_abi()
+{
+  constexpr auto _Bytes = __abi_bytes_v<_Abi>;
+  return _Bytes > 16 && _Bytes <= 32
+	 && std::is_same_v<simd_abi::_VecBuiltin<_Bytes>, _Abi>;
+}
+
+// }}}
+// __is_avx512_abi {{{
+template <typename _Abi>
+constexpr bool
+__is_avx512_abi()
+{
+  constexpr auto _Bytes = __abi_bytes_v<_Abi>;
+  return _Bytes <= 64 && std::is_same_v<simd_abi::_Avx512<_Bytes>, _Abi>;
+}
+
+// }}}
+// __is_neon_abi {{{
+template <typename _Abi>
+constexpr bool
+__is_neon_abi()
+{
+  constexpr auto _Bytes = __abi_bytes_v<_Abi>;
+  return _Bytes <= 16 && std::is_same_v<simd_abi::_VecBuiltin<_Bytes>, _Abi>;
+}
+
+// }}}
+// __make_dependent_t {{{
+template <typename, typename _Up> struct __make_dependent
+{
+  using type = _Up;
+};
+template <typename _Tp, typename _Up>
+using __make_dependent_t = typename __make_dependent<_Tp, _Up>::type;
+
+// }}}
+// ^^^ ---- type traits ---- ^^^
+
+// __assert_unreachable{{{
+template <typename _Tp> struct __assert_unreachable
+{
+  static_assert(!std::is_same_v<_Tp, _Tp>, "this should be unreachable");
+};
+
+// }}}
+// __size_or_zero_v {{{
+template <typename _Tp, typename _Ap, size_t _Np = simd_size<_Tp, _Ap>::value>
+constexpr size_t
+__size_or_zero_dispatch(int)
+{
+  return _Np;
+}
+template <typename _Tp, typename _Ap>
+constexpr size_t
+__size_or_zero_dispatch(float)
+{
+  return 0;
+}
+template <typename _Tp, typename _Ap>
+inline constexpr size_t __size_or_zero_v = __size_or_zero_dispatch<_Tp, _Ap>(0);
+
+// }}}
+// __bit_cast {{{
+template <typename _To, typename _From>
+_GLIBCXX_SIMD_INTRINSIC _To
+__bit_cast(const _From __x)
+{
+  static_assert(sizeof(_To) == sizeof(_From));
+  _To __r;
+  __builtin_memcpy(reinterpret_cast<char*>(&__r),
+		   reinterpret_cast<const char*>(&__x), sizeof(_To));
+  return __r;
+}
+
+// }}}
+// __div_roundup {{{
+inline constexpr std::size_t
+__div_roundup(std::size_t __a, std::size_t __b)
+{
+  return (__a + __b - 1) / __b;
+}
+
+// }}}
+// _ExactBool{{{
+class _ExactBool
+{
+  const bool _M_data;
+
+public:
+  _GLIBCXX_SIMD_INTRINSIC constexpr _ExactBool(bool __b) : _M_data(__b) {}
+  _ExactBool(int) = delete;
+  _GLIBCXX_SIMD_INTRINSIC constexpr operator bool() const { return _M_data; }
+};
+
+// }}}
+// __execute_n_times{{{
+template <typename _Fp, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__execute_on_index_sequence(_Fp&& __f, std::index_sequence<_I...>)
+{
+  [[maybe_unused]] auto&& __x = {(__f(_SizeConstant<_I>()), 0)...};
+}
+
+template <typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__execute_on_index_sequence(_Fp&&, std::index_sequence<>)
+{}
+
+template <size_t _Np, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__execute_n_times(_Fp&& __f)
+{
+  __execute_on_index_sequence(static_cast<_Fp&&>(__f),
+			      std::make_index_sequence<_Np>{});
+}
+
+// }}}
+// __generate_from_n_evaluations{{{
+template <typename _R, typename _Fp, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _R
+__execute_on_index_sequence_with_return(_Fp&& __f, std::index_sequence<_I...>)
+{
+  return _R{__f(_SizeConstant<_I>())...};
+}
+
+template <size_t _Np, typename _R, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr _R
+__generate_from_n_evaluations(_Fp&& __f)
+{
+  return __execute_on_index_sequence_with_return<_R>(
+    static_cast<_Fp&&>(__f), std::make_index_sequence<_Np>{});
+}
+
+// }}}
+// __call_with_n_evaluations{{{
+template <size_t... _I, typename _F0, typename _FArgs>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__call_with_n_evaluations(std::index_sequence<_I...>, _F0&& __f0,
+			  _FArgs&& __fargs)
+{
+  return __f0(__fargs(_SizeConstant<_I>())...);
+}
+
+template <size_t _Np, typename _F0, typename _FArgs>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__call_with_n_evaluations(_F0&& __f0, _FArgs&& __fargs)
+{
+  return __call_with_n_evaluations(std::make_index_sequence<_Np>{},
+				   static_cast<_F0&&>(__f0),
+				   static_cast<_FArgs&&>(__fargs));
+}
+
+// }}}
+// __call_with_subscripts{{{
+template <size_t _First = 0, size_t... _It, typename _Tp, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__call_with_subscripts(_Tp&& __x, index_sequence<_It...>, _Fp&& __fun)
+{
+  return __fun(__x[_First + _It]...);
+}
+
+template <size_t _Np, size_t _First = 0, typename _Tp, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__call_with_subscripts(_Tp&& __x, _Fp&& __fun)
+{
+  return __call_with_subscripts<_First>(static_cast<_Tp&&>(__x),
+					std::make_index_sequence<_Np>(),
+					static_cast<_Fp&&>(__fun));
+}
+
+// }}}
+// __may_alias{{{
+/**\internal
+ * Helper __may_alias<_Tp> that turns _Tp into the type to be used for an
+ * aliasing pointer. This adds the __may_alias attribute to _Tp (with compilers
+ * that support it).
+ */
+template <typename _Tp> using __may_alias [[__gnu__::__may_alias__]] = _Tp;
+
+// }}}
+// _UnsupportedBase {{{
+// simd and simd_mask base for unsupported <_Tp, _Abi>
+struct _UnsupportedBase
+{
+  _UnsupportedBase() = delete;
+  _UnsupportedBase(const _UnsupportedBase&) = delete;
+  _UnsupportedBase& operator=(const _UnsupportedBase&) = delete;
+  ~_UnsupportedBase() = delete;
+};
+
+// }}}
+// _InvalidTraits {{{
+/**
+ * \internal
+ * Defines the implementation of a given <_Tp, _Abi>.
+ *
+ * Implementations must ensure that only valid <_Tp, _Abi> instantiations are
+ * possible. Static assertions in the type definition do not suffice. It is
+ * important that SFINAE works.
+ */
+struct _InvalidTraits
+{
+  using _IsValid = false_type;
+  using _SimdBase = _UnsupportedBase;
+  using _MaskBase = _UnsupportedBase;
+
+  static constexpr size_t _S_simd_align = 1;
+  struct _SimdImpl;
+  struct _SimdMember
+  {
+  };
+  struct _SimdCastType;
+
+  static constexpr size_t _S_mask_align = 1;
+  struct _MaskImpl;
+  struct _MaskMember
+  {
+  };
+  struct _MaskCastType;
+};
+// }}}
+// _SimdTraits {{{
+template <typename _Tp, typename _Abi, typename = std::void_t<>>
+struct _SimdTraits : _InvalidTraits
+{
+};
+
+// }}}
+// __private_init, __bitset_init{{{
+/**
+ * \internal
+ * Tag used for private init constructor of simd and simd_mask
+ */
+inline constexpr struct _PrivateInit
+{
+} __private_init = {};
+inline constexpr struct _BitsetInit
+{
+} __bitset_init = {};
+
+// }}}
+// __is_narrowing_conversion<_From, _To>{{{
+template <typename _From, typename _To, bool = std::is_arithmetic<_From>::value,
+	  bool = std::is_arithmetic<_To>::value>
+struct __is_narrowing_conversion;
+
+// ignore "warning C4018: '<': signed/unsigned mismatch" in the following trait.
+// The implicit conversions will do the right thing here.
+template <typename _From, typename _To>
+struct __is_narrowing_conversion<_From, _To, true, true>
+  : public __bool_constant<(
+      std::numeric_limits<_From>::digits > std::numeric_limits<_To>::digits
+      || std::numeric_limits<_From>::max() > std::numeric_limits<_To>::max()
+      || std::numeric_limits<_From>::lowest()
+	   < std::numeric_limits<_To>::lowest()
+      || (std::is_signed<_From>::value && std::is_unsigned<_To>::value))>
+{
+};
+
+template <typename _Tp>
+struct __is_narrowing_conversion<bool, _Tp, true, true> : public true_type
+{
+};
+template <>
+struct __is_narrowing_conversion<bool, bool, true, true> : public false_type
+{
+};
+template <typename _Tp>
+struct __is_narrowing_conversion<_Tp, _Tp, true, true> : public false_type
+{
+};
+
+template <typename _From, typename _To>
+struct __is_narrowing_conversion<_From, _To, false, true>
+  : public negation<std::is_convertible<_From, _To>>
+{
+};
+
+// }}}
+// __converts_to_higher_integer_rank{{{
+template <typename _From, typename _To, bool = (sizeof(_From) < sizeof(_To))>
+struct __converts_to_higher_integer_rank : public true_type
+{
+};
+// this may fail for char -> short if sizeof(char) == sizeof(short)
+template <typename _From, typename _To>
+struct __converts_to_higher_integer_rank<_From, _To, false>
+  : public std::is_same<decltype(std::declval<_From>() + std::declval<_To>()),
+			_To>
+{
+};
+
+// }}}
+// __is_aligned(_v){{{
+template <typename _Flag, size_t _Alignment> struct __is_aligned;
+template <size_t _Alignment>
+struct __is_aligned<vector_aligned_tag, _Alignment> : public true_type
+{
+};
+template <size_t _Alignment>
+struct __is_aligned<element_aligned_tag, _Alignment> : public false_type
+{
+};
+template <size_t _GivenAlignment, size_t _Alignment>
+struct __is_aligned<overaligned_tag<_GivenAlignment>, _Alignment>
+  : public std::integral_constant<bool, (_GivenAlignment % _Alignment == 0)>
+{
+};
+template <typename _Flag, size_t _Alignment>
+inline constexpr bool __is_aligned_v = __is_aligned<_Flag, _Alignment>::value;
+
+// }}}
+// __data(simd/simd_mask) {{{
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr const auto&
+__data(const simd<_Tp, _Ap>& __x);
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto&
+__data(simd<_Tp, _Ap>& __x);
+
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr const auto&
+__data(const simd_mask<_Tp, _Ap>& __x);
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto&
+__data(simd_mask<_Tp, _Ap>& __x);
+
+// }}}
+// _SimdConverter {{{
+template <typename _FromT, typename _FromA, typename _ToT, typename _ToA,
+	  typename = void>
+struct _SimdConverter;
+
+template <typename _Tp, typename _Ap>
+struct _SimdConverter<_Tp, _Ap, _Tp, _Ap, void>
+{
+  template <typename _Up>
+  _GLIBCXX_SIMD_INTRINSIC const _Up& operator()(const _Up& __x)
+  {
+    return __x;
+  }
+};
+
+// }}}
+// __to_value_type_or_member_type {{{
+template <typename _V>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__to_value_type_or_member_type(const _V& __x) -> decltype(__data(__x))
+{
+  return __data(__x);
+}
+
+template <typename _V>
+_GLIBCXX_SIMD_INTRINSIC constexpr const typename _V::value_type&
+__to_value_type_or_member_type(const typename _V::value_type& __x)
+{
+  return __x;
+}
+
+// }}}
+// __bool_storage_member_type{{{
+template <size_t _Size> struct __bool_storage_member_type;
+
+template <size_t _Size>
+using __bool_storage_member_type_t =
+  typename __bool_storage_member_type<_Size>::type;
+
+// }}}
+// _SimdTuple {{{
+// why not std::tuple?
+// 1. std::tuple gives no guarantee about the storage order, but this
+//    implementation requires storage equivalent to
+//    std::array<_Tp, _Np>
+// 2. direct access to the element type (first template argument)
+// 3. enforces equal element type, only different _Abi types are allowed
+template <typename _Tp, typename... _Abis> struct _SimdTuple;
+
+//}}}
+// __fixed_size_storage_t {{{
+template <typename _Tp, int _Np> struct __fixed_size_storage;
+
+template <typename _Tp, int _Np>
+using __fixed_size_storage_t = typename __fixed_size_storage<_Tp, _Np>::type;
+
+// }}}
+// _SimdWrapper fwd decl{{{
+template <typename _Tp, size_t _Size, typename = std::void_t<>>
+struct _SimdWrapper;
+
+template <typename _Tp>
+using _SimdWrapper8 = _SimdWrapper<_Tp, 8 / sizeof(_Tp)>;
+template <typename _Tp>
+using _SimdWrapper16 = _SimdWrapper<_Tp, 16 / sizeof(_Tp)>;
+template <typename _Tp>
+using _SimdWrapper32 = _SimdWrapper<_Tp, 32 / sizeof(_Tp)>;
+template <typename _Tp>
+using _SimdWrapper64 = _SimdWrapper<_Tp, 64 / sizeof(_Tp)>;
+
+// }}}
+// __is_simd_wrapper {{{
+template <typename _Tp> struct __is_simd_wrapper : false_type
+{
+};
+template <typename _Tp, size_t _Np>
+struct __is_simd_wrapper<_SimdWrapper<_Tp, _Np>> : true_type
+{
+};
+template <typename _Tp>
+inline constexpr bool __is_simd_wrapper_v = __is_simd_wrapper<_Tp>::value;
+
+// }}}
+// _BitOps {{{
+struct _BitOps
+{
+  // __popcount {{{
+  static constexpr _UInt __popcount(_UInt __x)
+  {
+    return __builtin_popcount(__x);
+  }
+  static constexpr _ULong __popcount(_ULong __x)
+  {
+    return __builtin_popcountl(__x);
+  }
+  static constexpr _ULLong __popcount(_ULLong __x)
+  {
+    return __builtin_popcountll(__x);
+  }
+
+  // }}}
+  // __ctz/__clz {{{
+  static constexpr _UInt __ctz(_UInt __x) { return __builtin_ctz(__x); }
+  static constexpr _ULong __ctz(_ULong __x) { return __builtin_ctzl(__x); }
+  static constexpr _ULLong __ctz(_ULLong __x) { return __builtin_ctzll(__x); }
+  static constexpr _UInt __clz(_UInt __x) { return __builtin_clz(__x); }
+  static constexpr _ULong __clz(_ULong __x) { return __builtin_clzl(__x); }
+  static constexpr _ULLong __clz(_ULLong __x) { return __builtin_clzll(__x); }
+
+  // }}}
+  // __bit_iteration {{{
+  template <typename _Tp, typename _Fp>
+  static void __bit_iteration(_Tp __mask, _Fp&& __f)
+  {
+    static_assert(sizeof(_ULLong) >= sizeof(_Tp));
+    std::conditional_t<sizeof(_Tp) <= sizeof(_UInt), _UInt, _ULLong> __k;
+    if constexpr (std::is_convertible_v<_Tp, decltype(__k)>)
+      __k = __mask;
+    else
+      __k = __mask.to_ullong();
+    switch (__popcount(__k))
+      {
+      default:
+	do
+	  {
+	    __f(__ctz(__k));
+	    __k &= (__k - 1);
+	  }
+	while (__k);
+	break;
+      /*case 3:
+	  __f(__ctz(__k));
+	  __k &= (__k - 1);
+	  [[fallthrough]];*/
+      case 2:
+	__f(__ctz(__k));
+	[[fallthrough]];
+      case 1:
+	__f(__popcount(~decltype(__k)()) - 1 - __clz(__k));
+	[[fallthrough]];
+      case 0:
+	break;
+      }
+  }
+
+  //}}}
+  // __firstbit{{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST static auto __firstbit(_Tp __bits)
+  {
+    static_assert(std::is_integral_v<_Tp>,
+		  "__firstbit requires an integral argument");
+    if constexpr (sizeof(_Tp) <= sizeof(int))
+      return __builtin_ctz(__bits);
+    else if constexpr (alignof(_ULLong) == 8)
+      return __builtin_ctzll(__bits);
+    else
+      {
+	_UInt __lo = __bits;
+	return __lo == 0 ? 32 + __builtin_ctz(__bits >> 32)
+			 : __builtin_ctz(__lo);
+      }
+  }
+
+  // }}}
+  // __lastbit{{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST static auto __lastbit(_Tp __bits)
+  {
+    static_assert(std::is_integral_v<_Tp>,
+		  "__lastbit requires an integral argument");
+    if constexpr (sizeof(_Tp) <= sizeof(int))
+      return 31 - __builtin_clz(__bits);
+    else if constexpr (alignof(_ULLong) == 8)
+      return 63 - __builtin_clzll(__bits);
+    else
+      {
+	_UInt __lo = __bits;
+	_UInt __hi = __bits >> 32u;
+	return __hi == 0 ? 31 - __builtin_clz(__lo) : 63 - __builtin_clz(__hi);
+      }
+  }
+
+  // }}}
+};
+
+//}}}
+// __increment, __decrement {{{
+template <typename _Tp = void> struct __increment
+{
+  constexpr _Tp operator()(_Tp __a) const { return ++__a; }
+};
+template <> struct __increment<void>
+{
+  template <typename _Tp> constexpr _Tp operator()(_Tp __a) const
+  {
+    return ++__a;
+  }
+};
+template <typename _Tp = void> struct __decrement
+{
+  constexpr _Tp operator()(_Tp __a) const { return --__a; }
+};
+template <> struct __decrement<void>
+{
+  template <typename _Tp> constexpr _Tp operator()(_Tp __a) const
+  {
+    return --__a;
+  }
+};
+
+// }}}
+// _ValuePreserving(OrInt) {{{
+template <typename _From, typename _To,
+	  typename = enable_if_t<negation<
+	    __is_narrowing_conversion<__remove_cvref_t<_From>, _To>>::value>>
+using _ValuePreserving = _From;
+
+template <typename _From, typename _To,
+	  typename _DecayedFrom = __remove_cvref_t<_From>,
+	  typename = enable_if_t<conjunction<
+	    is_convertible<_From, _To>,
+	    disjunction<
+	      is_same<_DecayedFrom, _To>, is_same<_DecayedFrom, int>,
+	      conjunction<is_same<_DecayedFrom, _UInt>, is_unsigned<_To>>,
+	      negation<__is_narrowing_conversion<_DecayedFrom, _To>>>>::value>>
+using _ValuePreservingOrInt = _From;
+
+// }}}
+// __intrinsic_type {{{
+template <typename _Tp, size_t _Bytes, typename = std::void_t<>>
+struct __intrinsic_type;
+template <typename _Tp, size_t _Size>
+using __intrinsic_type_t =
+  typename __intrinsic_type<_Tp, _Size * sizeof(_Tp)>::type;
+template <typename _Tp>
+using __intrinsic_type2_t = typename __intrinsic_type<_Tp, 2>::type;
+template <typename _Tp>
+using __intrinsic_type4_t = typename __intrinsic_type<_Tp, 4>::type;
+template <typename _Tp>
+using __intrinsic_type8_t = typename __intrinsic_type<_Tp, 8>::type;
+template <typename _Tp>
+using __intrinsic_type16_t = typename __intrinsic_type<_Tp, 16>::type;
+template <typename _Tp>
+using __intrinsic_type32_t = typename __intrinsic_type<_Tp, 32>::type;
+template <typename _Tp>
+using __intrinsic_type64_t = typename __intrinsic_type<_Tp, 64>::type;
+template <typename _Tp>
+using __intrinsic_type128_t = typename __intrinsic_type<_Tp, 128>::type;
+
+// }}}
+// _BitMask {{{
+template <size_t _Np, bool _Sanitized = false> struct _BitMask;
+
+template <size_t _Np, bool _Sanitized>
+struct __is_bitmask<_BitMask<_Np, _Sanitized>, void> : true_type
+{
+};
+
+template <size_t _Np> using _SanitizedBitMask = _BitMask<_Np, true>;
+
+template <size_t _Np, bool _Sanitized> struct _BitMask
+{
+  static_assert(_Np > 0);
+  static constexpr size_t _NBytes = __div_roundup(_Np, CHAR_BIT);
+  using _Tp = conditional_t<_Np == 1, bool,
+			    make_unsigned_t<__int_with_sizeof_t<std::min(
+			      sizeof(_ULLong), __next_power_of_2(_NBytes))>>>;
+  static constexpr int _S_array_size = __div_roundup(_NBytes, sizeof(_Tp));
+  _Tp _M_bits[_S_array_size];
+  static constexpr int _S_unused_bits
+    = _Np == 1 ? 0 : _S_array_size * sizeof(_Tp) * CHAR_BIT - _Np;
+  static constexpr _Tp _S_bitmask = +_Tp(~_Tp()) >> _S_unused_bits;
+
+  constexpr _BitMask() noexcept = default;
+  constexpr _BitMask(unsigned long long __x) noexcept
+    : _M_bits{static_cast<_Tp>(__x)}
+  {}
+  _BitMask(std::bitset<_Np> __x) noexcept : _BitMask(__x.to_ullong()) {}
+
+  constexpr _BitMask(const _BitMask&) noexcept = default;
+
+  template <bool _RhsSanitized, typename = enable_if_t<_RhsSanitized == false
+						       && _Sanitized == true>>
+  constexpr _BitMask(const _BitMask<_Np, _RhsSanitized>& __rhs) noexcept
+    : _BitMask(__rhs._M_sanitized())
+  {}
+
+  constexpr operator _SimdWrapper<bool, _Np>() const noexcept
+  {
+    static_assert(_S_array_size == 1);
+    return _M_bits[0];
+  }
+
+  // precondition: is sanitized
+  constexpr _Tp _M_to_bits() const noexcept
+  {
+    static_assert(_S_array_size == 1);
+    return _M_bits[0];
+  }
+  // precondition: is sanitized
+  constexpr unsigned long long to_ullong() const noexcept
+  {
+    static_assert(_S_array_size == 1);
+    return _M_bits[0];
+  }
+  // precondition: is sanitized
+  constexpr unsigned long to_ulong() const noexcept
+  {
+    static_assert(_S_array_size == 1);
+    return _M_bits[0];
+  }
+  constexpr std::bitset<_Np> _M_to_bitset() const noexcept
+  {
+    static_assert(_S_array_size == 1);
+    return _M_bits[0];
+  }
+
+  constexpr decltype(auto) _M_sanitized() const noexcept
+  {
+    if constexpr (_Sanitized)
+      return *this;
+    else if constexpr (_Np == 1)
+      return _SanitizedBitMask<_Np>(_M_bits[0]);
+    else
+      {
+	_SanitizedBitMask<_Np> __r = {};
+	for (int __i = 0; __i < _S_array_size; ++__i)
+	  __r._M_bits[__i] = _M_bits[__i];
+	if constexpr (_S_unused_bits > 0)
+	  __r._M_bits[_S_array_size - 1] &= _S_bitmask;
+	return __r;
+      }
+  }
+
+  template <size_t _Mp, bool _LSanitized>
+  constexpr _BitMask<_Np + _Mp, _Sanitized>
+  _M_prepend(_BitMask<_Mp, _LSanitized> __lsb) const noexcept
+  {
+    constexpr size_t _RN = _Np + _Mp;
+    using _Rp = _BitMask<_RN, _Sanitized>;
+    if constexpr (_Rp::_S_array_size == 1)
+      {
+	_Rp __r{{_M_bits[0]}};
+	__r._M_bits[0] <<= _Mp;
+	__r._M_bits[0] |= __lsb._M_sanitized()._M_bits[0];
+	return __r;
+      }
+    else
+      __assert_unreachable<_Rp>();
+  }
+
+  // Return a new _BitMask with size _NewSize while dropping _DropLsb least
+  // significant bits. If the operation implicitly produces a sanitized bitmask,
+  // the result type will have _Sanitized set.
+  template <size_t _DropLsb, size_t _NewSize = _Np - _DropLsb>
+  constexpr auto _M_extract() const noexcept
+  {
+    static_assert(_Np > _DropLsb);
+    static_assert(_DropLsb + _NewSize <= sizeof(_ULLong) * CHAR_BIT,
+		  "not implemented for bitmasks larger than one ullong");
+    if constexpr (_NewSize == 1) // must sanitize because the return _Tp is bool
+      return _SanitizedBitMask<1>{
+	{static_cast<bool>(_M_bits[0] & (_Tp(1) << _DropLsb))}};
+    else
+      return _BitMask<_NewSize,
+		      ((_NewSize + _DropLsb == sizeof(_Tp) * CHAR_BIT
+			&& _NewSize + _DropLsb <= _Np)
+		       || ((_Sanitized || _Np == sizeof(_Tp) * CHAR_BIT)
+			   && _NewSize + _DropLsb >= _Np))>(_M_bits[0]
+							    >> _DropLsb);
+  }
+
+  // True if all bits are set. Implicitly sanitizes if _Sanitized == false.
+  constexpr bool all() const noexcept
+  {
+    if constexpr (_Np == 1)
+      return _M_bits[0];
+    else if constexpr (!_Sanitized)
+      return _M_sanitized().all();
+    else
+      {
+	constexpr _Tp __allbits = ~_Tp();
+	for (int __i = 0; __i < _S_array_size - 1; ++__i)
+	  if (_M_bits[__i] != __allbits)
+	    return false;
+	return _M_bits[_S_array_size - 1] == _S_bitmask;
+      }
+  }
+
+  // True if at least one bit is set. Implicitly sanitizes if
+  // _Sanitized == false.
+  constexpr bool any() const noexcept
+  {
+    if constexpr (_Np == 1)
+      return _M_bits[0];
+    else if constexpr (!_Sanitized)
+      return _M_sanitized().any();
+    else
+      {
+	for (int __i = 0; __i < _S_array_size - 1; ++__i)
+	  if (_M_bits[__i] != 0)
+	    return true;
+	return _M_bits[_S_array_size - 1] != 0;
+      }
+  }
+
+  // True if no bit is set. Implicitly sanitizes if _Sanitized == false.
+  constexpr bool none() const noexcept
+  {
+    if constexpr (_Np == 1)
+      return !_M_bits[0];
+    else if constexpr (!_Sanitized)
+      return _M_sanitized().none();
+    else
+      {
+	for (int __i = 0; __i < _S_array_size - 1; ++__i)
+	  if (_M_bits[__i] != 0)
+	    return false;
+	return _M_bits[_S_array_size - 1] == 0;
+      }
+  }
+
+  // Returns the number of set bits. Implicitly sanitizes if
+  // _Sanitized == false.
+  constexpr int count() const noexcept
+  {
+    if constexpr (_Np == 1)
+      return _M_bits[0];
+    else if constexpr (!_Sanitized)
+      return _M_sanitized().count();
+    else
+      {
+	int __result = __builtin_popcountll(_M_bits[0]);
+	for (int __i = 1; __i < _S_array_size; ++__i)
+	  __result += __builtin_popcountll(_M_bits[__i]);
+	return __result;
+      }
+  }
+
+  // Returns the bit at offset __i as bool.
+  constexpr bool operator[](size_t __i) const noexcept
+  {
+    if constexpr (_Np == 1)
+      return _M_bits[0];
+    else if constexpr (_S_array_size == 1)
+      return (_M_bits[0] >> __i) & 1;
+    else
+      {
+	const size_t __j = __i / (sizeof(_Tp) * CHAR_BIT);
+	const size_t __shift = __i % (sizeof(_Tp) * CHAR_BIT);
+	return (_M_bits[__j] >> __shift) & 1;
+      }
+  }
+  template <size_t __i>
+  constexpr bool operator[](_SizeConstant<__i>) const noexcept
+  {
+    static_assert(__i < _Np);
+    constexpr size_t __j = __i / (sizeof(_Tp) * CHAR_BIT);
+    constexpr size_t __shift = __i % (sizeof(_Tp) * CHAR_BIT);
+    return static_cast<bool>(_M_bits[__j] & (_Tp(1) << __shift));
+  }
+
+  // Set the bit at offset __i to __x.
+  constexpr void set(size_t __i, bool __x) noexcept
+  {
+    if constexpr (_Np == 1)
+      _M_bits[0] = __x;
+    else if constexpr (_S_array_size == 1)
+      {
+	_M_bits[0] &= ~_Tp(_Tp(1) << __i);
+	_M_bits[0] |= _Tp(_Tp(__x) << __i);
+      }
+    else
+      {
+	const size_t __j = __i / (sizeof(_Tp) * CHAR_BIT);
+	const size_t __shift = __i % (sizeof(_Tp) * CHAR_BIT);
+	_M_bits[__j] &= ~_Tp(_Tp(1) << __shift);
+	_M_bits[__j] |= _Tp(_Tp(__x) << __shift);
+      }
+  }
+  template <size_t __i>
+  constexpr void set(_SizeConstant<__i>, bool __x) noexcept
+  {
+    static_assert(__i < _Np);
+    if constexpr (_Np == 1)
+      _M_bits[0] = __x;
+    else
+      {
+	constexpr size_t __j = __i / (sizeof(_Tp) * CHAR_BIT);
+	constexpr size_t __shift = __i % (sizeof(_Tp) * CHAR_BIT);
+	constexpr _Tp __mask = ~_Tp(_Tp(1) << __shift);
+	_M_bits[__j] &= __mask;
+	_M_bits[__j] |= _Tp(_Tp(__x) << __shift);
+      }
+  }
+
+  // Inverts all bits. Sanitized input leads to sanitized output.
+  constexpr _BitMask operator~() const noexcept
+  {
+    if constexpr (_Np == 1)
+      return !_M_bits[0];
+    else
+      {
+	_BitMask __result{};
+	for (int __i = 0; __i < _S_array_size - 1; ++__i)
+	  __result._M_bits[__i] = ~_M_bits[__i];
+	if constexpr (_Sanitized)
+	  __result._M_bits[_S_array_size - 1]
+	    = _M_bits[_S_array_size - 1] ^ _S_bitmask;
+	else
+	  __result._M_bits[_S_array_size - 1] = ~_M_bits[_S_array_size - 1];
+	return __result;
+      }
+  }
+
+  constexpr _BitMask& operator^=(const _BitMask& __b) & noexcept
+  {
+    __execute_n_times<_S_array_size>(
+      [&](auto __i) { _M_bits[__i] ^= __b._M_bits[__i]; });
+    return *this;
+  }
+  constexpr _BitMask& operator|=(const _BitMask& __b) & noexcept
+  {
+    __execute_n_times<_S_array_size>(
+      [&](auto __i) { _M_bits[__i] |= __b._M_bits[__i]; });
+    return *this;
+  }
+  constexpr _BitMask& operator&=(const _BitMask& __b) & noexcept
+  {
+    __execute_n_times<_S_array_size>(
+      [&](auto __i) { _M_bits[__i] &= __b._M_bits[__i]; });
+    return *this;
+  }
+  friend constexpr _BitMask operator^(const _BitMask& __a,
+				      const _BitMask& __b) noexcept
+  {
+    _BitMask __r = __a;
+    __r ^= __b;
+    return __r;
+  }
+  friend constexpr _BitMask operator|(const _BitMask& __a,
+				      const _BitMask& __b) noexcept
+  {
+    _BitMask __r = __a;
+    __r |= __b;
+    return __r;
+  }
+  friend constexpr _BitMask operator&(const _BitMask& __a,
+				      const _BitMask& __b) noexcept
+  {
+    _BitMask __r = __a;
+    __r &= __b;
+    return __r;
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC
+  constexpr bool _M_is_constprop() const
+  {
+    if constexpr (_S_array_size == 1)
+      return __builtin_constant_p(_M_bits[0]);
+    else
+      {
+	for (int __i = 0; __i < _S_array_size; ++__i)
+	  if (!__builtin_constant_p(_M_bits[__i]))
+	    return false;
+	return true;
+      }
+  }
+};
+
+// }}}
+
+// vvv ---- builtin vector types [[gnu::vector_size(N)]] and operations ---- vvv
+// __min_vector_size {{{
+template <typename _Tp = void>
+static inline constexpr int __min_vector_size = 2 * sizeof(_Tp);
+#if _GLIBCXX_SIMD_HAVE_NEON
+template <> inline constexpr int __min_vector_size<void> = 8;
+#else
+template <> inline constexpr int __min_vector_size<void> = 16;
+#endif
+
+// }}}
+// __vector_type {{{
+template <typename _Tp, size_t _Np, typename = void> struct __vector_type_n
+{
+};
+
+// substitution failure for 0-element case
+template <typename _Tp> struct __vector_type_n<_Tp, 0, void>
+{
+};
+
+// special case 1-element to be _Tp itself
+template <typename _Tp>
+struct __vector_type_n<_Tp, 1, enable_if_t<__is_vectorizable_v<_Tp>>>
+{
+  using type = _Tp;
+};
+
+// else, use GNU-style builtin vector types
+template <typename _Tp, size_t _Np>
+struct __vector_type_n<_Tp, _Np,
+		       enable_if_t<__is_vectorizable_v<_Tp> && _Np >= 2>>
+{
+  static constexpr size_t _Bytes = _Np * sizeof(_Tp) < __min_vector_size<_Tp>
+				     ? __min_vector_size<_Tp>
+				     : __next_power_of_2(_Np * sizeof(_Tp));
+  using type [[__gnu__::__vector_size__(_Bytes)]] = _Tp;
+};
+
+template <typename _Tp, size_t _Bytes, size_t = _Bytes % sizeof(_Tp)>
+struct __vector_type;
+
+template <typename _Tp, size_t _Bytes>
+struct __vector_type<_Tp, _Bytes, 0>
+  : __vector_type_n<_Tp, _Bytes / sizeof(_Tp)>
+{
+};
+
+template <typename _Tp, size_t _Size>
+using __vector_type_t = typename __vector_type_n<_Tp, _Size>::type;
+template <typename _Tp>
+using __vector_type2_t = typename __vector_type<_Tp, 2>::type;
+template <typename _Tp>
+using __vector_type4_t = typename __vector_type<_Tp, 4>::type;
+template <typename _Tp>
+using __vector_type8_t = typename __vector_type<_Tp, 8>::type;
+template <typename _Tp>
+using __vector_type16_t = typename __vector_type<_Tp, 16>::type;
+template <typename _Tp>
+using __vector_type32_t = typename __vector_type<_Tp, 32>::type;
+template <typename _Tp>
+using __vector_type64_t = typename __vector_type<_Tp, 64>::type;
+template <typename _Tp>
+using __vector_type128_t = typename __vector_type<_Tp, 128>::type;
+
+// }}}
+// __is_vector_type {{{
+template <typename _Tp, typename = std::void_t<>>
+struct __is_vector_type : false_type
+{
+};
+template <typename _Tp>
+struct __is_vector_type<
+  _Tp, std::void_t<typename __vector_type<decltype(std::declval<_Tp>()[0]),
+					  sizeof(_Tp)>::type>>
+  : std::is_same<_Tp, typename __vector_type<decltype(std::declval<_Tp>()[0]),
+					     sizeof(_Tp)>::type>
+{
+};
+
+template <typename _Tp>
+inline constexpr bool __is_vector_type_v = __is_vector_type<_Tp>::value;
+
+// }}}
+// _VectorTraits{{{
+template <typename _Tp, typename = std::void_t<>> struct _VectorTraitsImpl;
+template <typename _Tp>
+struct _VectorTraitsImpl<_Tp, enable_if_t<__is_vector_type_v<_Tp>>>
+{
+  using type = _Tp;
+  using value_type = decltype(std::declval<_Tp>()[0]);
+  static constexpr int _S_width = sizeof(_Tp) / sizeof(value_type);
+  using _Wrapper = _SimdWrapper<value_type, _S_width>;
+  template <typename _Up, int _W = _S_width>
+  static constexpr bool __is = std::is_same_v<value_type, _Up> && _W == _S_width;
+};
+template <typename _Tp, size_t _Np>
+struct _VectorTraitsImpl<_SimdWrapper<_Tp, _Np>,
+			 std::void_t<__vector_type_t<_Tp, _Np>>>
+{
+  using type = __vector_type_t<_Tp, _Np>;
+  using value_type = _Tp;
+  static constexpr int _S_width = sizeof(type) / sizeof(value_type);
+  using _Wrapper = _SimdWrapper<_Tp, _Np>;
+  static constexpr bool _S_is_partial = (_Np != _S_width);
+  static constexpr int _S_partial_width = _Np;
+  template <typename _Up, int _W = _S_width>
+  static constexpr bool __is = std::is_same_v<value_type, _Up> && _W == _S_width;
+};
+
+template <typename _Tp, typename = typename _VectorTraitsImpl<_Tp>::type>
+using _VectorTraits = _VectorTraitsImpl<_Tp>;
+
+// }}}
+// __as_vector{{{
+template <typename _V>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__as_vector(_V __x)
+{
+  if constexpr (__is_vector_type_v<_V>)
+    return __x;
+  else if constexpr (is_simd<_V>::value || is_simd_mask<_V>::value)
+    return __data(__x)._M_data;
+  else if constexpr (__is_vectorizable_v<_V>)
+    return __vector_type_t<_V, 2>{__x};
+  else
+    return __x._M_data;
+}
+
+// }}}
+// __as_wrapper{{{
+template <typename _V>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__as_wrapper(_V __x)
+{
+  if constexpr (__is_vector_type_v<_V>)
+    return _SimdWrapper<typename _VectorTraits<_V>::value_type,
+			_VectorTraits<_V>::_S_width>(__x);
+  else if constexpr (is_simd<_V>::value || is_simd_mask<_V>::value)
+    return __data(__x);
+  else
+    return __x;
+}
+
+// }}}
+// __intrin_bitcast{{{
+template <typename _To, typename _From>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__intrin_bitcast(_From __v)
+{
+  static_assert(__is_vector_type_v<_From> && __is_vector_type_v<_To>);
+  if constexpr (sizeof(_To) == sizeof(_From))
+    return reinterpret_cast<_To>(__v);
+  else if constexpr (sizeof(_From) > sizeof(_To))
+    if constexpr (sizeof(_To) >= 16)
+      return reinterpret_cast<const __may_alias<_To>&>(__v);
+    else
+      {
+	_To __r;
+	__builtin_memcpy(&__r, &__v, sizeof(_To));
+	return __r;
+      }
+#if _GLIBCXX_SIMD_X86INTRIN
+  else if constexpr (__have_avx && sizeof(_From) == 16 && sizeof(_To) == 32)
+    return reinterpret_cast<_To>(__builtin_ia32_ps256_ps(
+      reinterpret_cast<__vector_type_t<float, 4>>(__v)));
+  else if constexpr (__have_avx512f && sizeof(_From) == 16 && sizeof(_To) == 64)
+    return reinterpret_cast<_To>(__builtin_ia32_ps512_ps(
+      reinterpret_cast<__vector_type_t<float, 4>>(__v)));
+  else if constexpr (__have_avx512f && sizeof(_From) == 32 && sizeof(_To) == 64)
+    return reinterpret_cast<_To>(__builtin_ia32_ps512_256ps(
+      reinterpret_cast<__vector_type_t<float, 8>>(__v)));
+#endif // _GLIBCXX_SIMD_X86INTRIN
+  else if constexpr (sizeof(__v) <= 8)
+    return reinterpret_cast<_To>(
+      __vector_type_t<__int_for_sizeof_t<_From>, sizeof(_To) / sizeof(_From)>{
+	reinterpret_cast<__int_for_sizeof_t<_From>>(__v)});
+  else
+    {
+      static_assert(sizeof(_To) > sizeof(_From));
+      _To __r = {};
+      __builtin_memcpy(&__r, &__v, sizeof(_From));
+      return __r;
+    }
+}
+
+// }}}
+// __vector_bitcast{{{
+template <typename _To, size_t _NN = 0, typename _From,
+	  typename _FromVT = _VectorTraits<_From>,
+	  size_t _Np = _NN == 0 ? sizeof(_From) / sizeof(_To) : _NN>
+_GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_To, _Np>
+__vector_bitcast(_From __x)
+{
+  using _R = __vector_type_t<_To, _Np>;
+  return __intrin_bitcast<_R>(__x);
+}
+template <typename _To, size_t _NN = 0, typename _Tp, size_t _Nx,
+	  size_t _Np
+	  = _NN == 0 ? sizeof(_SimdWrapper<_Tp, _Nx>) / sizeof(_To) : _NN>
+_GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_To, _Np>
+__vector_bitcast(const _SimdWrapper<_Tp, _Nx>& __x)
+{
+  static_assert(_Np > 1);
+  return __intrin_bitcast<__vector_type_t<_To, _Np>>(__x._M_data);
+}
+
+// }}}
+// __convert_x86 declarations {{{
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR85048
+template <typename _To, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_To __convert_x86(_Tp);
+
+template <typename _To, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_To __convert_x86(_Tp, _Tp);
+
+template <typename _To, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_To __convert_x86(_Tp, _Tp, _Tp, _Tp);
+
+template <typename _To, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_To __convert_x86(_Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp);
+
+template <typename _To, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_To __convert_x86(_Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp,
+		  _Tp, _Tp, _Tp, _Tp);
+#endif // _GLIBCXX_SIMD_WORKAROUND_PR85048
+
+//}}}
+// __to_intrin {{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>,
+	  typename _R
+	  = __intrinsic_type_t<typename _TVT::value_type, _TVT::_S_width>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _R
+__to_intrin(_Tp __x)
+{
+  static_assert(sizeof(__x) <= sizeof(_R),
+		"__to_intrin may never drop values off the end");
+  if constexpr (sizeof(__x) == sizeof(_R))
+    return reinterpret_cast<_R>(__as_vector(__x));
+  else
+    {
+      using _Up = __int_for_sizeof_t<_Tp>;
+      return reinterpret_cast<_R>(
+	__vector_type_t<_Up, sizeof(_R) / sizeof(_Up)>{__bit_cast<_Up>(__x)});
+    }
+}
+
+// }}}
+// __make_vector{{{
+template <typename _Tp, typename... _Args>
+_GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_Tp, sizeof...(_Args)>
+__make_vector(const _Args&... __args)
+{
+  return __vector_type_t<_Tp, sizeof...(_Args)>{static_cast<_Tp>(__args)...};
+}
+
+// }}}
+// __vector_broadcast{{{
+template <size_t _Np, typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_Tp, _Np>
+__vector_broadcast(_Tp __x)
+{
+  return __call_with_n_evaluations<_Np>(
+    [](auto... __xx) { return __vector_type_t<_Tp, _Np>{__xx...}; },
+    [&__x](int) { return __x; });
+}
+
+// }}}
+// __generate_vector{{{
+template <typename _Tp, size_t _Np, typename _Gp, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_Tp, _Np>
+__generate_vector_impl(_Gp&& __gen, std::index_sequence<_I...>)
+{
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR89229
+  // Using -S -fverbose-asm this function turned up as the place where the
+  // invalid instruction was produced. Using some arbitrary memory clobbers to
+  // kill the optimizer and thus avoid the problem.
+  if constexpr (__have_avx512f && !__have_avx512vl && sizeof(_Tp) == 8
+		&& std::is_integral_v<_Tp>)
+    if (!__builtin_is_constant_evaluated())
+      [] { asm("" ::: "memory"); }();
+#endif
+  return __vector_type_t<_Tp, _Np>{
+    static_cast<_Tp>(__gen(_SizeConstant<_I>()))...};
+}
+
+template <typename _V, typename _VVT = _VectorTraits<_V>, typename _Gp>
+_GLIBCXX_SIMD_INTRINSIC constexpr _V
+__generate_vector(_Gp&& __gen)
+{
+  if constexpr (__is_vector_type_v<_V>)
+    return __generate_vector_impl<typename _VVT::value_type, _VVT::_S_width>(
+      static_cast<_Gp&&>(__gen), std::make_index_sequence<_VVT::_S_width>());
+  else
+    return __generate_vector_impl<typename _VVT::value_type,
+				  _VVT::_S_partial_width>(
+      static_cast<_Gp&&>(__gen),
+      std::make_index_sequence<_VVT::_S_partial_width>());
+}
+
+template <typename _Tp, size_t _Np, typename _Gp>
+_GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_Tp, _Np>
+__generate_vector(_Gp&& __gen)
+{
+  return __generate_vector_impl<_Tp, _Np>(static_cast<_Gp&&>(__gen),
+					  std::make_index_sequence<_Np>());
+}
+
+// }}}
+// __xor{{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>, typename... _Dummy>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__xor(_Tp __a, typename _TVT::type __b, _Dummy...) noexcept
+{
+  static_assert(sizeof...(_Dummy) == 0);
+  using _Up = typename _TVT::value_type;
+  using _Ip = make_unsigned_t<__int_for_sizeof_t<_Up>>;
+  return __vector_bitcast<_Up>(__vector_bitcast<_Ip>(__a)
+			       ^ __vector_bitcast<_Ip>(__b));
+}
+
+template <typename _Tp, typename = decltype(_Tp() ^ _Tp())>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__xor(_Tp __a, _Tp __b) noexcept
+{
+  return __a ^ __b;
+}
+
+// }}}
+// __or{{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>, typename... _Dummy>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__or(_Tp __a, typename _TVT::type __b, _Dummy...) noexcept
+{
+  static_assert(sizeof...(_Dummy) == 0);
+  using _Up = typename _TVT::value_type;
+  using _Ip = make_unsigned_t<__int_for_sizeof_t<_Up>>;
+  return __vector_bitcast<_Up>(__vector_bitcast<_Ip>(__a)
+			       | __vector_bitcast<_Ip>(__b));
+}
+
+template <typename _Tp, typename = decltype(_Tp() | _Tp())>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__or(_Tp __a, _Tp __b) noexcept
+{
+  return __a | __b;
+}
+
+// }}}
+// __and{{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>, typename... _Dummy>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__and(_Tp __a, typename _TVT::type __b, _Dummy...) noexcept
+{
+  static_assert(sizeof...(_Dummy) == 0);
+  using _Up = typename _TVT::value_type;
+  using _Ip = make_unsigned_t<__int_for_sizeof_t<_Up>>;
+  return __vector_bitcast<_Up>(__vector_bitcast<_Ip>(__a)
+			       & __vector_bitcast<_Ip>(__b));
+}
+
+template <typename _Tp, typename = decltype(_Tp() & _Tp())>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__and(_Tp __a, _Tp __b) noexcept
+{
+  return __a & __b;
+}
+
+// }}}
+// __andnot{{{
+#if _GLIBCXX_SIMD_X86INTRIN && !defined __clang__
+static constexpr struct
+{
+  _GLIBCXX_SIMD_INTRINSIC __v4sf operator()(__v4sf __a,
+					    __v4sf __b) const noexcept
+  {
+    return __builtin_ia32_andnps(__a, __b);
+  }
+  _GLIBCXX_SIMD_INTRINSIC __v2df operator()(__v2df __a,
+					    __v2df __b) const noexcept
+  {
+    return __builtin_ia32_andnpd(__a, __b);
+  }
+  _GLIBCXX_SIMD_INTRINSIC __v2di operator()(__v2di __a,
+					    __v2di __b) const noexcept
+  {
+    return __builtin_ia32_pandn128(__a, __b);
+  }
+  _GLIBCXX_SIMD_INTRINSIC __v8sf operator()(__v8sf __a,
+					    __v8sf __b) const noexcept
+  {
+    return __builtin_ia32_andnps256(__a, __b);
+  }
+  _GLIBCXX_SIMD_INTRINSIC __v4df operator()(__v4df __a,
+					    __v4df __b) const noexcept
+  {
+    return __builtin_ia32_andnpd256(__a, __b);
+  }
+  _GLIBCXX_SIMD_INTRINSIC __v4di operator()(__v4di __a,
+					    __v4di __b) const noexcept
+  {
+    return __builtin_ia32_andnotsi256(__a, __b);
+  }
+  _GLIBCXX_SIMD_INTRINSIC __v16sf operator()(__v16sf __a,
+					     __v16sf __b) const noexcept
+  {
+    if constexpr (__have_avx512dq)
+      return _mm512_andnot_ps(__a, __b);
+    else
+      return reinterpret_cast<__v16sf>(
+	_mm512_andnot_si512(reinterpret_cast<__v8di>(__a),
+			    reinterpret_cast<__v8di>(__b)));
+  }
+  _GLIBCXX_SIMD_INTRINSIC __v8df operator()(__v8df __a,
+					    __v8df __b) const noexcept
+  {
+    if constexpr (__have_avx512dq)
+      return _mm512_andnot_pd(__a, __b);
+    else
+      return reinterpret_cast<__v8df>(
+	_mm512_andnot_si512(reinterpret_cast<__v8di>(__a),
+			    reinterpret_cast<__v8di>(__b)));
+  }
+  _GLIBCXX_SIMD_INTRINSIC __v8di operator()(__v8di __a,
+					    __v8di __b) const noexcept
+  {
+    return _mm512_andnot_si512(__a, __b);
+  }
+} _S_x86_andnot;
+#endif // _GLIBCXX_SIMD_X86INTRIN && !__clang__
+
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>, typename... _Dummy>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__andnot(_Tp __a, typename _TVT::type __b, _Dummy...) noexcept
+{
+  static_assert(sizeof...(_Dummy) == 0);
+#if _GLIBCXX_SIMD_X86INTRIN && !defined __clang__
+  if constexpr (sizeof(_Tp) >= 16)
+    {
+      const auto __ai = __to_intrin(__a);
+      const auto __bi = __to_intrin(__b);
+      if (!__builtin_is_constant_evaluated()
+	  && !(__builtin_constant_p(__ai) && __builtin_constant_p(__bi)))
+	{
+	  const auto __r = _S_x86_andnot(__ai, __bi);
+	  if constexpr (is_convertible_v<decltype(__r), _Tp>)
+	    return __r;
+	  else
+	    return reinterpret_cast<_Tp>(__r);
+	}
+    }
+#endif // _GLIBCXX_SIMD_X86INTRIN && !__clang__
+  using _Up = typename _TVT::value_type;
+  using _Ip = make_unsigned_t<__int_for_sizeof_t<_Up>>;
+  return __vector_bitcast<_Up>(~__vector_bitcast<_Ip>(__a)
+			       & __vector_bitcast<_Ip>(__b));
+}
+
+template <typename _Tp, typename = decltype(~_Tp() & _Tp())>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__andnot(_Tp __a, _Tp __b) noexcept
+{
+  return ~__a & __b;
+}
+
+// }}}
+// __not{{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__not(_Tp __a) noexcept
+{
+  if constexpr (std::is_floating_point_v<typename _TVT::value_type>)
+    return reinterpret_cast<typename _TVT::type>(
+      ~__vector_bitcast<unsigned>(__a));
+  else
+    return ~__a;
+}
+
+// }}}
+// __concat{{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>,
+	  typename _R
+	  = __vector_type_t<typename _TVT::value_type, _TVT::_S_width * 2>>
+constexpr _R
+__concat(_Tp a_, _Tp b_)
+{
+#ifdef _GLIBCXX_SIMD_WORKAROUND_XXX_1
+  using _W
+    = std::conditional_t<std::is_floating_point_v<typename _TVT::value_type>,
+			 double,
+			 conditional_t<(sizeof(_Tp) >= 2 * sizeof(long long)),
+				       long long, typename _TVT::value_type>>;
+  constexpr int input_width = sizeof(_Tp) / sizeof(_W);
+  const auto __a = __vector_bitcast<_W>(a_);
+  const auto __b = __vector_bitcast<_W>(b_);
+  using _Up = __vector_type_t<_W, sizeof(_R) / sizeof(_W)>;
+#else
+  constexpr int input_width = _TVT::_S_width;
+  const _Tp& __a = a_;
+  const _Tp& __b = b_;
+  using _Up = _R;
+#endif
+  if constexpr (input_width == 2)
+    return reinterpret_cast<_R>(_Up{__a[0], __a[1], __b[0], __b[1]});
+  else if constexpr (input_width == 4)
+    return reinterpret_cast<_R>(
+      _Up{__a[0], __a[1], __a[2], __a[3], __b[0], __b[1], __b[2], __b[3]});
+  else if constexpr (input_width == 8)
+    return reinterpret_cast<_R>(
+      _Up{__a[0], __a[1], __a[2], __a[3], __a[4], __a[5], __a[6], __a[7],
+	  __b[0], __b[1], __b[2], __b[3], __b[4], __b[5], __b[6], __b[7]});
+  else if constexpr (input_width == 16)
+    return reinterpret_cast<_R>(
+      _Up{__a[0],  __a[1],  __a[2],  __a[3],  __a[4],  __a[5],  __a[6],
+	  __a[7],  __a[8],  __a[9],  __a[10], __a[11], __a[12], __a[13],
+	  __a[14], __a[15], __b[0],  __b[1],  __b[2],  __b[3],  __b[4],
+	  __b[5],  __b[6],  __b[7],  __b[8],  __b[9],  __b[10], __b[11],
+	  __b[12], __b[13], __b[14], __b[15]});
+  else if constexpr (input_width == 32)
+    return reinterpret_cast<_R>(_Up{
+      __a[0],  __a[1],  __a[2],  __a[3],  __a[4],  __a[5],  __a[6],  __a[7],
+      __a[8],  __a[9],  __a[10], __a[11], __a[12], __a[13], __a[14], __a[15],
+      __a[16], __a[17], __a[18], __a[19], __a[20], __a[21], __a[22], __a[23],
+      __a[24], __a[25], __a[26], __a[27], __a[28], __a[29], __a[30], __a[31],
+      __b[0],  __b[1],  __b[2],  __b[3],  __b[4],  __b[5],  __b[6],  __b[7],
+      __b[8],  __b[9],  __b[10], __b[11], __b[12], __b[13], __b[14], __b[15],
+      __b[16], __b[17], __b[18], __b[19], __b[20], __b[21], __b[22], __b[23],
+      __b[24], __b[25], __b[26], __b[27], __b[28], __b[29], __b[30], __b[31]});
+}
+
+// }}}
+// __zero_extend {{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+struct _ZeroExtendProxy
+{
+  using value_type = typename _TVT::value_type;
+  static constexpr size_t _Np = _TVT::_S_width;
+  const _Tp __x;
+
+  template <typename _To, typename _ToVT = _VectorTraits<_To>,
+	    typename
+	    = enable_if_t<is_same_v<typename _ToVT::value_type, value_type>>>
+  _GLIBCXX_SIMD_INTRINSIC operator _To() const
+  {
+    constexpr size_t _ToN = _ToVT::_S_width;
+    if constexpr (_ToN == _Np)
+      return __x;
+    else if constexpr (_ToN == 2 * _Np)
+      {
+#ifdef _GLIBCXX_SIMD_WORKAROUND_XXX_3
+	if constexpr (__have_avx && _TVT::template __is<float, 4>)
+	  return __vector_bitcast<value_type>(
+	    _mm256_insertf128_ps(__m256(), __x, 0));
+	else if constexpr (__have_avx && _TVT::template __is<double, 2>)
+	  return __vector_bitcast<value_type>(
+	    _mm256_insertf128_pd(__m256d(), __x, 0));
+	else if constexpr (__have_avx2 && _Np * sizeof(value_type) == 16)
+	  return __vector_bitcast<value_type>(
+	    _mm256_insertf128_si256(__m256i(), __to_intrin(__x), 0));
+	else if constexpr (__have_avx512f && _TVT::template __is<float, 8>)
+	  {
+	    if constexpr (__have_avx512dq)
+	      return __vector_bitcast<value_type>(
+		_mm512_insertf32x8(__m512(), __x, 0));
+	    else
+	      return reinterpret_cast<__m512>(
+		_mm512_insertf64x4(__m512d(), reinterpret_cast<__m256d>(__x),
+				   0));
+	  }
+	else if constexpr (__have_avx512f && _TVT::template __is<double, 4>)
+	  return __vector_bitcast<value_type>(
+	    _mm512_insertf64x4(__m512d(), __x, 0));
+	else if constexpr (__have_avx512f && _Np * sizeof(value_type) == 32)
+	  return __vector_bitcast<value_type>(
+	    _mm512_inserti64x4(__m512i(), __to_intrin(__x), 0));
+#endif
+	return __concat(__x, _Tp());
+      }
+    else if constexpr (_ToN == 4 * _Np)
+      {
+#ifdef _GLIBCXX_SIMD_WORKAROUND_XXX_3
+	if constexpr (__have_avx512dq && _TVT::template __is<double, 2>)
+	  {
+	    return __vector_bitcast<value_type>(
+	      _mm512_insertf64x2(__m512d(), __x, 0));
+	  }
+	else if constexpr (__have_avx512f
+			   && std::is_floating_point_v<value_type>)
+	  {
+	    return __vector_bitcast<value_type>(
+	      _mm512_insertf32x4(__m512(), reinterpret_cast<__m128>(__x), 0));
+	  }
+	else if constexpr (__have_avx512f && _Np * sizeof(value_type) == 16)
+	  {
+	    return __vector_bitcast<value_type>(
+	      _mm512_inserti32x4(__m512i(), __to_intrin(__x), 0));
+	  }
+#endif
+	return __concat(__concat(__x, _Tp()),
+			__vector_type_t<value_type, _Np * 2>());
+      }
+    else if constexpr (_ToN == 8 * _Np)
+      return __concat(operator __vector_type_t<value_type, _Np * 4>(),
+		      __vector_type_t<value_type, _Np * 4>());
+    else if constexpr (_ToN == 16 * _Np)
+      return __concat(operator __vector_type_t<value_type, _Np * 8>(),
+		      __vector_type_t<value_type, _Np * 8>());
+    else
+      __assert_unreachable<_Tp>();
+  }
+};
+
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _ZeroExtendProxy<_Tp, _TVT>
+__zero_extend(_Tp __x)
+{
+  return {__x};
+}
+
+// }}}
+// __extract<_Np, By>{{{
+template <
+  int _Offset, int _SplitBy, typename _Tp, typename _TVT = _VectorTraits<_Tp>,
+  typename _R
+  = __vector_type_t<typename _TVT::value_type, _TVT::_S_width / _SplitBy>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _R
+__extract(_Tp __in)
+{
+  using value_type = typename _TVT::value_type;
+#if _GLIBCXX_SIMD_X86INTRIN // {{{
+  if constexpr (sizeof(_Tp) == 64 && _SplitBy == 4 && _Offset > 0)
+    {
+      if constexpr (__have_avx512dq && std::is_same_v<double, value_type>)
+	return _mm512_extractf64x2_pd(__to_intrin(__in), _Offset);
+      else if constexpr (std::is_floating_point_v<value_type>)
+	return __vector_bitcast<value_type>(
+	  _mm512_extractf32x4_ps(__intrin_bitcast<__m512>(__in), _Offset));
+      else
+	return reinterpret_cast<_R>(
+	  _mm512_extracti32x4_epi32(__intrin_bitcast<__m512i>(__in), _Offset));
+    }
+  else
+#endif // _GLIBCXX_SIMD_X86INTRIN }}}
+    {
+#ifdef _GLIBCXX_SIMD_WORKAROUND_XXX_1
+      using _W = std::conditional_t<
+	std::is_floating_point_v<value_type>, double,
+	std::conditional_t<(sizeof(_R) >= 16), long long, value_type>>;
+      static_assert(sizeof(_R) % sizeof(_W) == 0);
+      constexpr int __return_width = sizeof(_R) / sizeof(_W);
+      using _Up = __vector_type_t<_W, __return_width>;
+      const auto __x = __vector_bitcast<_W>(__in);
+#else
+    constexpr int __return_width = _TVT::_S_width / _SplitBy;
+    using _Up = _R;
+    const __vector_type_t<value_type, _TVT::_S_width>& __x
+      = __in; // only needed for _Tp = _SimdWrapper<value_type, _Np>
+#endif
+      constexpr int _O = _Offset * __return_width;
+      return __call_with_subscripts<__return_width, _O>(
+	__x, [](auto... __entries) {
+	  return reinterpret_cast<_R>(_Up{__entries...});
+	});
+    }
+}
+
+// }}}
+// __lo/__hi64[z]{{{
+template <typename _Tp,
+	  typename _R
+	  = __vector_type8_t<typename _VectorTraits<_Tp>::value_type>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _R
+__lo64(_Tp __x)
+{
+  _R __r{};
+  __builtin_memcpy(&__r, &__x, 8);
+  return __r;
+}
+
+template <typename _Tp,
+	  typename _R
+	  = __vector_type8_t<typename _VectorTraits<_Tp>::value_type>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _R
+__hi64(_Tp __x)
+{
+  static_assert(sizeof(_Tp) == 16, "use __hi64z if you meant it");
+  _R __r{};
+  __builtin_memcpy(&__r, reinterpret_cast<const char*>(&__x) + 8, 8);
+  return __r;
+}
+
+template <typename _Tp,
+	  typename _R
+	  = __vector_type8_t<typename _VectorTraits<_Tp>::value_type>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _R
+__hi64z([[maybe_unused]] _Tp __x)
+{
+  _R __r{};
+  if constexpr (sizeof(_Tp) == 16)
+    __builtin_memcpy(&__r, reinterpret_cast<const char*>(&__x) + 8, 8);
+  return __r;
+}
+
+// }}}
+// __lo/__hi128{{{
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__lo128(_Tp __x)
+{
+  return __extract<0, sizeof(_Tp) / 16>(__x);
+}
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__hi128(_Tp __x)
+{
+  static_assert(sizeof(__x) == 32);
+  return __extract<1, 2>(__x);
+}
+
+// }}}
+// __lo/__hi256{{{
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__lo256(_Tp __x)
+{
+  static_assert(sizeof(__x) == 64);
+  return __extract<0, 2>(__x);
+}
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__hi256(_Tp __x)
+{
+  static_assert(sizeof(__x) == 64);
+  return __extract<1, 2>(__x);
+}
+
+// }}}
+// __auto_bitcast{{{
+template <typename _Tp> struct _AutoCast
+{
+  static_assert(__is_vector_type_v<_Tp>);
+  const _Tp __x;
+  template <typename _Up, typename _UVT = _VectorTraits<_Up>>
+  _GLIBCXX_SIMD_INTRINSIC constexpr operator _Up() const
+  {
+    return __intrin_bitcast<typename _UVT::type>(__x);
+  }
+};
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr _AutoCast<_Tp>
+__auto_bitcast(const _Tp& __x)
+{
+  return {__x};
+}
+template <typename _Tp, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC constexpr _AutoCast<
+  typename _SimdWrapper<_Tp, _Np>::_BuiltinType>
+__auto_bitcast(const _SimdWrapper<_Tp, _Np>& __x)
+{
+  return {__x._M_data};
+}
+
+// }}}
+// ^^^ ---- builtin vector types [[gnu::vector_size(N)]] and operations ---- ^^^
+
+#if _GLIBCXX_SIMD_HAVE_SSE_ABI
+// __bool_storage_member_type{{{
+#if _GLIBCXX_SIMD_HAVE_AVX512F && _GLIBCXX_SIMD_X86INTRIN
+template <size_t _Size> struct __bool_storage_member_type
+{
+  static_assert((_Size & (_Size - 1)) != 0,
+		"This trait may only be used for non-power-of-2 sizes. "
+		"Power-of-2 sizes must be specialized.");
+  using type =
+    typename __bool_storage_member_type<__next_power_of_2(_Size)>::type;
+};
+template <> struct __bool_storage_member_type<1>
+{
+  using type = bool;
+};
+template <> struct __bool_storage_member_type<2>
+{
+  using type = __mmask8;
+};
+template <> struct __bool_storage_member_type<4>
+{
+  using type = __mmask8;
+};
+template <> struct __bool_storage_member_type<8>
+{
+  using type = __mmask8;
+};
+template <> struct __bool_storage_member_type<16>
+{
+  using type = __mmask16;
+};
+template <> struct __bool_storage_member_type<32>
+{
+  using type = __mmask32;
+};
+template <> struct __bool_storage_member_type<64>
+{
+  using type = __mmask64;
+};
+#endif // _GLIBCXX_SIMD_HAVE_AVX512F
+
+// }}}
+// __intrinsic_type (x86){{{
+// the following excludes bool via __is_vectorizable
+#if _GLIBCXX_SIMD_HAVE_SSE
+template <typename _Tp, size_t _Bytes>
+struct __intrinsic_type<
+  _Tp, _Bytes, std::enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 64>>
+{
+  static_assert(!std::is_same_v<_Tp, long double>,
+		"no __intrinsic_type support for long double on x86");
+  static constexpr std::size_t _VBytes
+    = _Bytes <= 16 ? 16 : _Bytes <= 32 ? 32 : 64;
+  using type [[__gnu__::__vector_size__(_VBytes)]]
+  = std::conditional_t<std::is_integral_v<_Tp>, long long int, _Tp>;
+};
+#endif // _GLIBCXX_SIMD_HAVE_SSE
+
+// }}}
+#endif // _GLIBCXX_SIMD_HAVE_SSE_ABI
+// __intrinsic_type (ARM){{{
+#if _GLIBCXX_SIMD_HAVE_NEON
+#define _GLIBCXX_SIMD_NEON_INTRIN(_Tp)                                         \
+  template <>                                                                  \
+  struct __intrinsic_type<__remove_cvref_t<decltype(_Tp()[0])>, sizeof(_Tp),   \
+			  void>                                                \
+  {                                                                            \
+    using type = _Tp;                                                          \
+  }
+_GLIBCXX_SIMD_NEON_INTRIN(int8x8_t);
+_GLIBCXX_SIMD_NEON_INTRIN(int8x16_t);
+_GLIBCXX_SIMD_NEON_INTRIN(int16x4_t);
+_GLIBCXX_SIMD_NEON_INTRIN(int16x8_t);
+_GLIBCXX_SIMD_NEON_INTRIN(int32x2_t);
+_GLIBCXX_SIMD_NEON_INTRIN(int32x4_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint8x8_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint8x16_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint16x4_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint16x8_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint32x2_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint32x4_t);
+#if defined __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
+_GLIBCXX_SIMD_NEON_INTRIN(float16x4_t);
+_GLIBCXX_SIMD_NEON_INTRIN(float16x8_t);
+#endif
+_GLIBCXX_SIMD_NEON_INTRIN(float32x2_t);
+_GLIBCXX_SIMD_NEON_INTRIN(float32x4_t);
+#if defined __aarch64__
+_GLIBCXX_SIMD_NEON_INTRIN(int64x1_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint64x1_t);
+_GLIBCXX_SIMD_NEON_INTRIN(float64x1_t);
+_GLIBCXX_SIMD_NEON_INTRIN(float64x2_t);
+#endif
+_GLIBCXX_SIMD_NEON_INTRIN(int64x2_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint64x2_t);
+#undef _GLIBCXX_SIMD_NEON_INTRIN
+
+template <typename _Tp, size_t _Bytes>
+struct __intrinsic_type<_Tp, _Bytes,
+			enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 16>>
+{
+  static constexpr int _VBytes = _Bytes <= 8 ? 8 : 16;
+  using _Tmp = conditional_t<
+    sizeof(_Tp) == 1, __remove_cvref_t<decltype(int8x16_t()[0])>,
+    conditional_t<
+      sizeof(_Tp) == 2, short,
+      conditional_t<
+	sizeof(_Tp) == 4, int,
+	conditional_t<sizeof(_Tp) == 8,
+		      __remove_cvref_t<decltype(int64x2_t()[0])>, void>>>>;
+  using _Up = conditional_t<
+    is_floating_point_v<_Tp>, _Tp,
+    conditional_t<is_unsigned_v<_Tp>, make_unsigned_t<_Tmp>, _Tmp>>;
+  using type = typename __intrinsic_type<_Up, _VBytes>::type;
+};
+#endif // _GLIBCXX_SIMD_HAVE_NEON
+
+// }}}
+// __intrinsic_type (PPC){{{
+#ifdef __ALTIVEC__
+template <typename _Tp> struct __intrinsic_type_impl;
+#define _GLIBCXX_SIMD_PPC_INTRIN(_Tp)                                          \
+  template <> struct __intrinsic_type_impl<_Tp>                                \
+  {                                                                            \
+    using type = __vector _Tp;                                                 \
+  }
+_GLIBCXX_SIMD_PPC_INTRIN(float);
+_GLIBCXX_SIMD_PPC_INTRIN(double);
+_GLIBCXX_SIMD_PPC_INTRIN(signed char);
+_GLIBCXX_SIMD_PPC_INTRIN(unsigned char);
+_GLIBCXX_SIMD_PPC_INTRIN(signed short);
+_GLIBCXX_SIMD_PPC_INTRIN(unsigned short);
+_GLIBCXX_SIMD_PPC_INTRIN(signed int);
+_GLIBCXX_SIMD_PPC_INTRIN(unsigned int);
+_GLIBCXX_SIMD_PPC_INTRIN(signed long long);
+_GLIBCXX_SIMD_PPC_INTRIN(unsigned long long);
+#undef _GLIBCXX_SIMD_PPC_INTRIN
+
+template <typename _Tp, size_t _Bytes>
+struct __intrinsic_type<
+  _Tp, _Bytes, std::enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 16>>
+{
+  static_assert(!std::is_same_v<_Tp, long double>,
+		"no __intrinsic_type support for long double on PPC");
+#ifndef __VSX__
+  static_assert(!std::is_same_v<_Tp, double>,
+		"no __intrinsic_type support for double on PPC w/o VSX");
+#endif
+#ifndef __POWER8_VECTOR__
+  static_assert(!(std::is_integral_v<_Tp> && sizeof(_Tp) > 4),
+		"no __intrinsic_type support for integers larger than 4 Bytes "
+		"on PPC w/o POWER8 vectors");
+#endif
+  using type = typename __intrinsic_type_impl<conditional_t<
+    is_floating_point_v<_Tp>, _Tp, __int_for_sizeof_t<_Tp>>>::type;
+};
+#endif // __ALTIVEC__
+
+// }}}
+// _SimdWrapper<bool>{{{1
+template <size_t _Width>
+struct _SimdWrapper<
+  bool, _Width, std::void_t<typename __bool_storage_member_type<_Width>::type>>
+{
+  using _BuiltinType = typename __bool_storage_member_type<_Width>::type;
+  using value_type = bool;
+  static constexpr size_t _S_width = sizeof(_BuiltinType) * CHAR_BIT;
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper<bool, _S_width>
+  __as_full_vector() const
+  {
+    return _M_data;
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper() = default;
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(_BuiltinType __k)
+    : _M_data(__k) {}
+
+  _GLIBCXX_SIMD_INTRINSIC operator const _BuiltinType &() const
+  {
+    return _M_data;
+  }
+  _GLIBCXX_SIMD_INTRINSIC operator _BuiltinType&() { return _M_data; }
+
+  _GLIBCXX_SIMD_INTRINSIC _BuiltinType __intrin() const { return _M_data; }
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr value_type operator[](size_t __i) const
+  {
+    return _M_data & (_BuiltinType(1) << __i);
+  }
+  template <size_t __i>
+  _GLIBCXX_SIMD_INTRINSIC constexpr value_type
+  operator[](_SizeConstant<__i>) const
+  {
+    return _M_data & (_BuiltinType(1) << __i);
+  }
+  _GLIBCXX_SIMD_INTRINSIC constexpr void __set(size_t __i, value_type __x)
+  {
+    if (__x)
+      _M_data |= (_BuiltinType(1) << __i);
+    else
+      _M_data &= ~(_BuiltinType(1) << __i);
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC
+  constexpr bool _M_is_constprop() const
+  {
+    return __builtin_constant_p(_M_data);
+  }
+
+  _BuiltinType _M_data;
+};
+
+// _SimdWrapperBase{{{1
+template <bool> struct _SimdWrapperBase;
+
+template <> struct _SimdWrapperBase<true> // no padding or no SNaNs
+{
+};
+
+#ifdef __SUPPORT_SNAN__
+template <>
+struct _SimdWrapperBase<false> // with padding that needs to never become SNaN
+{
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapperBase() : _M_data() {}
+};
+#endif // __SUPPORT_SNAN__
+
+// }}}
+// _SimdWrapper{{{
+template <typename _Tp, size_t _Width>
+struct _SimdWrapper<
+  _Tp, _Width,
+  std::void_t<__vector_type_t<_Tp, _Width>, __intrinsic_type_t<_Tp, _Width>>>
+  : _SimdWrapperBase<
+#ifdef __SUPPORT_SNAN__
+      !std::numeric_limits<_Tp>::has_signaling_NaN
+      || sizeof(_Tp) * _Width == sizeof(__vector_type_t<_Tp, _Width>)
+#else
+      true
+#endif
+      >
+{
+  static_assert(__is_vectorizable_v<_Tp>);
+  static_assert(_Width >= 2); // 1 doesn't make sense, use _Tp directly then
+  using _BuiltinType = __vector_type_t<_Tp, _Width>;
+  using value_type = _Tp;
+  static constexpr size_t _S_width = sizeof(_BuiltinType) / sizeof(value_type);
+  static inline constexpr int __size = _Width;
+
+  _BuiltinType _M_data;
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper<_Tp, _S_width>
+  __as_full_vector() const
+  {
+    return _M_data;
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(
+    std::initializer_list<_Tp> __init)
+    : _M_data(__generate_from_n_evaluations<_Width, _BuiltinType>(
+      [&](auto __i) { return __init.begin()[__i.value]; }))
+  {}
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper() = default;
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(const _SimdWrapper&) = default;
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(_SimdWrapper&&) = default;
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper& operator=(const _SimdWrapper&)
+    = default;
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper& operator=(_SimdWrapper&&)
+    = default;
+
+  template <typename _V, typename = std::enable_if_t<std::disjunction_v<
+			   is_same<_V, __vector_type_t<_Tp, _Width>>,
+			   is_same<_V, __intrinsic_type_t<_Tp, _Width>>>>>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(_V __x)
+    : _M_data(__vector_bitcast<_Tp, _Width>(
+      __x)) // __vector_bitcast can convert e.g. __m128 to __vector(2) float
+  {}
+
+  template <typename... _As,
+	    typename
+	    = enable_if_t<((std::is_same_v<simd_abi::scalar, _As> && ...)
+			   && sizeof...(_As) <= _Width)>>
+  _GLIBCXX_SIMD_INTRINSIC constexpr operator _SimdTuple<_Tp, _As...>() const
+  {
+    const auto& dd = _M_data; // workaround for GCC7 ICE
+    return __generate_from_n_evaluations<sizeof...(_As),
+					 _SimdTuple<_Tp, _As...>>([&](
+      auto __i) constexpr { return dd[int(__i)]; });
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr operator const _BuiltinType &() const
+  {
+    return _M_data;
+  }
+  _GLIBCXX_SIMD_INTRINSIC constexpr operator _BuiltinType&() { return _M_data; }
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr _Tp operator[](size_t __i) const
+  {
+    return _M_data[__i];
+  }
+  template <size_t __i>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _Tp operator[](_SizeConstant<__i>) const
+  {
+    return _M_data[__i];
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr void __set(size_t __i, _Tp __x)
+  {
+    _M_data[__i] = __x;
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC
+  constexpr bool _M_is_constprop() const
+  {
+    return __builtin_constant_p(_M_data);
+  }
+};
+
+// }}}
+
+// __vectorized_sizeof {{{
+template <typename _Tp>
+constexpr size_t
+__vectorized_sizeof()
+{
+  if constexpr (!__is_vectorizable_v<_Tp>)
+    return 0;
+
+  if constexpr (sizeof(_Tp) <= 8)
+    {
+      // X86:
+      if constexpr (__have_avx512bw)
+	return 64;
+      if constexpr (__have_avx512f && sizeof(_Tp) >= 4)
+	return 64;
+      if constexpr (__have_avx2)
+	return 32;
+      if constexpr (__have_avx && std::is_floating_point_v<_Tp>)
+	return 32;
+      if constexpr (__have_sse2)
+	return 16;
+      if constexpr (__have_sse && std::is_same_v<_Tp, float>)
+	return 16;
+      if constexpr (__have_mmx && sizeof(_Tp) <= 4 && std::is_integral_v<_Tp>)
+	return 8;
+
+      // PowerPC:
+      if constexpr (__have_power8vec || (__have_power_vmx && (sizeof(_Tp) < 8))
+		    || (__have_power_vsx && std::is_floating_point_v<_Tp>) )
+	return 16;
+
+      // ARM:
+      if constexpr (__have_neon_a64
+		    || (__have_neon_a32 && !is_same_v<_Tp, double>) )
+	return 16;
+      if constexpr (__have_neon
+		    && sizeof(_Tp) < 8
+		    // Only allow fp if the user allows non-IEC559 fp (e.g. via
+		    // -ffast-math). ARMv7 NEON fp is not conforming to IEC559.
+		    && (__GCC_IEC_559 == 0 || !std::is_floating_point_v<_Tp>) )
+	return 16;
+    }
+
+  return sizeof(_Tp);
+}
+
+// }}}
+namespace simd_abi {
+// most of simd_abi is defined in simd_detail.h
+template <typename _Tp>
+inline constexpr int max_fixed_size
+  = (__have_avx512bw && sizeof(_Tp) == 1) ? 64 : 32;
+// compatible {{{
+#if defined __x86_64__ || defined __aarch64__
+template <typename _Tp>
+using compatible
+  = std::conditional_t<(sizeof(_Tp) <= 8), _VecBuiltin<16>, scalar>;
+#elif defined __ARM_NEON
+// FIXME: not sure, probably needs to be scalar (or dependent on the hard-float
+// ABI?)
+template <typename _Tp>
+using compatible
+  = std::conditional_t<(sizeof(_Tp) < 8), _VecBuiltin<16>, scalar>;
+#else
+template <typename> using compatible = scalar;
+#endif
+
+// }}}
+// native {{{
+template <typename _Tp>
+constexpr auto
+__determine_native_abi()
+{
+  constexpr size_t __bytes = __vectorized_sizeof<_Tp>();
+  if constexpr (__bytes == sizeof(_Tp))
+    return static_cast<scalar*>(nullptr);
+  else if constexpr (__have_avx512vl || (__have_avx512f && __bytes == 64))
+    return static_cast<_VecBltnBtmsk<__bytes>*>(nullptr);
+  else
+    return static_cast<_VecBuiltin<__bytes>*>(nullptr);
+}
+
+template <typename _Tp, typename = enable_if_t<__is_vectorizable_v<_Tp>>>
+using native = std::remove_pointer_t<decltype(__determine_native_abi<_Tp>())>;
+
+// }}}
+// __default_abi {{{
+#if defined _GLIBCXX_SIMD_DEFAULT_ABI
+template <typename _Tp> using __default_abi = _GLIBCXX_SIMD_DEFAULT_ABI<_Tp>;
+#else
+template <typename _Tp> using __default_abi = compatible<_Tp>;
+#endif
+
+// }}}
+} // namespace simd_abi
+
+// traits {{{1
+// is_abi_tag {{{2
+template <typename _Tp, typename = std::void_t<>> struct is_abi_tag : false_type
+{
+};
+template <typename _Tp>
+struct is_abi_tag<_Tp, std::void_t<typename _Tp::_IsValidAbiTag>>
+  : public _Tp::_IsValidAbiTag
+{
+};
+template <typename _Tp>
+inline constexpr bool is_abi_tag_v = is_abi_tag<_Tp>::value;
+
+// is_simd(_mask) {{{2
+template <typename _Tp> struct is_simd : public false_type
+{
+};
+template <typename _Tp> inline constexpr bool is_simd_v = is_simd<_Tp>::value;
+
+template <typename _Tp> struct is_simd_mask : public false_type
+{
+};
+template <typename _Tp>
+inline constexpr bool is_simd_mask_v = is_simd_mask<_Tp>::value;
+
+// simd_size {{{2
+template <typename _Tp, typename _Abi, typename = void> struct __simd_size_impl
+{
+};
+template <typename _Tp, typename _Abi>
+struct __simd_size_impl<
+  _Tp, _Abi,
+  enable_if_t<std::conjunction_v<__is_vectorizable<_Tp>,
+				 std::experimental::is_abi_tag<_Abi>>>>
+  : _SizeConstant<_Abi::template size<_Tp>>
+{
+};
+
+template <typename _Tp, typename _Abi = simd_abi::__default_abi<_Tp>>
+struct simd_size : __simd_size_impl<_Tp, _Abi>
+{
+};
+template <typename _Tp, typename _Abi = simd_abi::__default_abi<_Tp>>
+inline constexpr size_t simd_size_v = simd_size<_Tp, _Abi>::value;
+
+// simd_abi::deduce {{{2
+template <typename _Tp, std::size_t _Np, typename = void> struct __deduce_impl;
+namespace simd_abi {
+/**
+ * \tparam _Tp   The requested `value_type` for the elements.
+ * \tparam _Np   The requested number of elements.
+ * \tparam _Abis This parameter is ignored, since this implementation cannot
+ * make any use of it. Either a good native ABI is matched and used as `type`
+ * alias, or the `fixed_size<_Np>` ABI is used, which internally is built from
+ * the best matching native ABIs.
+ */
+template <typename _Tp, std::size_t _Np, typename...>
+struct deduce : std::experimental::__deduce_impl<_Tp, _Np>
+{
+};
+
+template <typename _Tp, size_t _Np, typename... _Abis>
+using deduce_t = typename deduce<_Tp, _Np, _Abis...>::type;
+} // namespace simd_abi
+
+// }}}2
+// rebind_simd {{{2
+template <typename _Tp, typename _V, typename = void> struct rebind_simd;
+template <typename _Tp, typename _Up, typename _Abi>
+struct rebind_simd<
+  _Tp, simd<_Up, _Abi>,
+  void_t<simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>>
+{
+  using type = simd<_Tp, simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>;
+};
+template <typename _Tp, typename _Up, typename _Abi>
+struct rebind_simd<
+  _Tp, simd_mask<_Up, _Abi>,
+  void_t<simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>>
+{
+  using type
+    = simd_mask<_Tp, simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>;
+};
+template <typename _Tp, typename _V>
+using rebind_simd_t = typename rebind_simd<_Tp, _V>::type;
+
+// resize_simd {{{2
+template <int _Np, typename _V, typename = void> struct resize_simd;
+template <int _Np, typename _Tp, typename _Abi>
+struct resize_simd<_Np, simd<_Tp, _Abi>,
+		   void_t<simd_abi::deduce_t<_Tp, _Np, _Abi>>>
+{
+  using type = simd<_Tp, simd_abi::deduce_t<_Tp, _Np, _Abi>>;
+};
+template <int _Np, typename _Tp, typename _Abi>
+struct resize_simd<_Np, simd_mask<_Tp, _Abi>,
+		   void_t<simd_abi::deduce_t<_Tp, _Np, _Abi>>>
+{
+  using type = simd_mask<_Tp, simd_abi::deduce_t<_Tp, _Np, _Abi>>;
+};
+template <int _Np, typename _V>
+using resize_simd_t = typename resize_simd<_Np, _V>::type;
+
+// }}}2
+// memory_alignment {{{2
+template <typename _Tp, typename _Up = typename _Tp::value_type>
+struct memory_alignment
+  : public _SizeConstant<__next_power_of_2(sizeof(_Up) * _Tp::size())>
+{
+};
+template <typename _Tp, typename _Up = typename _Tp::value_type>
+inline constexpr size_t memory_alignment_v = memory_alignment<_Tp, _Up>::value;
+
+// class template simd [simd] {{{1
+template <typename _Tp, typename _Abi = simd_abi::__default_abi<_Tp>>
+class simd;
+template <typename _Tp, typename _Abi>
+struct is_simd<simd<_Tp, _Abi>> : public true_type
+{
+};
+template <typename _Tp> using native_simd = simd<_Tp, simd_abi::native<_Tp>>;
+template <typename _Tp, int _Np>
+using fixed_size_simd = simd<_Tp, simd_abi::fixed_size<_Np>>;
+template <typename _Tp, size_t _Np>
+using __deduced_simd = simd<_Tp, simd_abi::deduce_t<_Tp, _Np>>;
+
+// class template simd_mask [simd_mask] {{{1
+template <typename _Tp, typename _Abi = simd_abi::__default_abi<_Tp>>
+class simd_mask;
+template <typename _Tp, typename _Abi>
+struct is_simd_mask<simd_mask<_Tp, _Abi>> : public true_type
+{
+};
+template <typename _Tp>
+using native_simd_mask = simd_mask<_Tp, simd_abi::native<_Tp>>;
+template <typename _Tp, int _Np>
+using fixed_size_simd_mask = simd_mask<_Tp, simd_abi::fixed_size<_Np>>;
+template <typename _Tp, size_t _Np>
+using __deduced_simd_mask = simd_mask<_Tp, simd_abi::deduce_t<_Tp, _Np>>;
+
+// casts [simd.casts] {{{1
+// static_simd_cast {{{2
+template <typename _Tp, typename _Up, typename _Ap, bool = is_simd_v<_Tp>,
+	  typename = void>
+struct __static_simd_cast_return_type;
+
+template <typename _Tp, typename _A0, typename _Up, typename _Ap>
+struct __static_simd_cast_return_type<simd_mask<_Tp, _A0>, _Up, _Ap, false,
+				      void>
+  : __static_simd_cast_return_type<simd<_Tp, _A0>, _Up, _Ap>
+{
+};
+
+template <typename _Tp, typename _Up, typename _Ap>
+struct __static_simd_cast_return_type<
+  _Tp, _Up, _Ap, true, enable_if_t<_Tp::size() == simd_size_v<_Up, _Ap>>>
+{
+  using type = _Tp;
+};
+
+template <typename _Tp, typename _Ap>
+struct __static_simd_cast_return_type<_Tp, _Tp, _Ap, false,
+#ifdef _GLIBCXX_SIMD_FIX_P2TS_ISSUE66
+				      enable_if_t<__is_vectorizable_v<_Tp>>
+#else
+				      void
+#endif
+				      >
+{
+  using type = simd<_Tp, _Ap>;
+};
+
+template <typename _Tp, typename = void> struct __safe_make_signed
+{
+  using type = _Tp;
+};
+template <typename _Tp>
+struct __safe_make_signed<_Tp, enable_if_t<std::is_integral_v<_Tp>>>
+{
+  // the extra make_unsigned_t is because of PR85951
+  using type = std::make_signed_t<std::make_unsigned_t<_Tp>>;
+};
+template <typename _Tp>
+using safe_make_signed_t = typename __safe_make_signed<_Tp>::type;
+
+template <typename _Tp, typename _Up, typename _Ap>
+struct __static_simd_cast_return_type<_Tp, _Up, _Ap, false,
+#ifdef _GLIBCXX_SIMD_FIX_P2TS_ISSUE66
+				      enable_if_t<__is_vectorizable_v<_Tp>>
+#else
+				      void
+#endif
+				      >
+{
+  using type = std::conditional_t<
+    (std::is_integral_v<_Up> && std::is_integral_v<_Tp> &&
+#ifndef _GLIBCXX_SIMD_FIX_P2TS_ISSUE65
+     std::is_signed_v<_Up> != std::is_signed_v<_Tp> &&
+#endif
+     std::is_same_v<safe_make_signed_t<_Up>, safe_make_signed_t<_Tp>>),
+    simd<_Tp, _Ap>, fixed_size_simd<_Tp, simd_size_v<_Up, _Ap>>>;
+};
+
+template <typename _Tp, typename _Up, typename _Ap,
+	  typename _R
+	  = typename __static_simd_cast_return_type<_Tp, _Up, _Ap>::type>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _R
+static_simd_cast(const simd<_Up, _Ap>& __x)
+{
+  if constexpr (std::is_same<_R, simd<_Up, _Ap>>::value)
+    {
+      return __x;
+    }
+  else
+    {
+      _SimdConverter<_Up, _Ap, typename _R::value_type, typename _R::abi_type>
+	__c;
+      return _R(__private_init, __c(__data(__x)));
+    }
+}
+
+namespace __proposed {
+template <typename _Tp, typename _Up, typename _Ap,
+	  typename _R
+	  = typename __static_simd_cast_return_type<_Tp, _Up, _Ap>::type>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR typename _R::mask_type
+static_simd_cast(const simd_mask<_Up, _Ap>& __x)
+{
+  using _RM = typename _R::mask_type;
+  return {__private_init, _RM::abi_type::_MaskImpl::template __convert<
+			    typename _RM::simd_type::value_type>(__x)};
+}
+} // namespace __proposed
+
+// simd_cast {{{2
+template <typename _Tp, typename _Up, typename _Ap,
+	  typename _To = __value_type_or_identity_t<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR auto
+simd_cast(const simd<_ValuePreserving<_Up, _To>, _Ap>& __x)
+  -> decltype(static_simd_cast<_Tp>(__x))
+{
+  return static_simd_cast<_Tp>(__x);
+}
+
+namespace __proposed {
+template <typename _Tp, typename _Up, typename _Ap,
+	  typename _To = __value_type_or_identity_t<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR auto
+simd_cast(const simd_mask<_ValuePreserving<_Up, _To>, _Ap>& __x)
+  -> decltype(static_simd_cast<_Tp>(__x))
+{
+  return static_simd_cast<_Tp>(__x);
+}
+} // namespace __proposed
+
+// }}}2
+// resizing_simd_cast {{{
+namespace __proposed {
+/* Proposed spec:
+
+template <class T, class U, class Abi>
+T resizing_simd_cast(const simd<U, Abi>& x)
+
+p1  Constraints:
+    - is_simd_v<T> is true and
+    - T::value_type is the same type as U
+
+p2  Returns:
+    A simd object with the i^th element initialized to x[i] for all i in the
+    range of [0, min(T::size(), simd_size_v<U, Abi>)). If T::size() is larger
+    than simd_size_v<U, Abi>, the remaining elements are value-initialized.
+
+template <class T, class U, class Abi>
+T resizing_simd_cast(const simd_mask<U, Abi>& x)
+
+p1  Constraints: is_simd_mask_v<T> is true
+
+p2  Returns:
+    A simd_mask object with the i^th element initialized to x[i] for all i in
+    the range of [0, min(T::size(), simd_size_v<U, Abi>)). If T::size() is larger
+    than simd_size_v<U, Abi>, the remaining elements are initialized to false.
+
+ */
+
+template <typename _Tp, typename _Up, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR enable_if_t<
+  conjunction_v<is_simd<_Tp>, is_same<typename _Tp::value_type, _Up>>, _Tp>
+resizing_simd_cast(const simd<_Up, _Ap>& __x)
+{
+  if constexpr (is_same_v<typename _Tp::abi_type, _Ap>)
+    return __x;
+  else if constexpr (simd_size_v<_Up, _Ap> == 1)
+    {
+      _Tp __r{};
+      __r[0] = __x[0];
+      return __r;
+    }
+  else if constexpr (_Tp::size() == 1)
+    return __x[0];
+  else if constexpr (sizeof(_Tp) == sizeof(__x) && !__is_fixed_size_abi_v<_Ap>)
+    return {__private_init,
+	    __vector_bitcast<typename _Tp::value_type, _Tp::size()>(
+	      _Ap::__masked(__data(__x))._M_data)};
+  else
+    {
+      _Tp __r{};
+      __builtin_memcpy(&__data(__r), &__data(__x),
+		       sizeof(_Up)
+			 * std::min(_Tp::size(), simd_size_v<_Up, _Ap>));
+      return __r;
+    }
+}
+
+template <typename _Tp, typename _Up, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC
+  _GLIBCXX_SIMD_CONSTEXPR enable_if_t<is_simd_mask_v<_Tp>, _Tp>
+  resizing_simd_cast(const simd_mask<_Up, _Ap>& __x)
+{
+  return {__private_init, _Tp::abi_type::_MaskImpl::template __convert<
+			    typename _Tp::simd_type::value_type>(__x)};
+}
+} // namespace __proposed
+
+// }}}
+// to_fixed_size {{{2
+template <typename _Tp, int _Np>
+_GLIBCXX_SIMD_INTRINSIC fixed_size_simd<_Tp, _Np>
+to_fixed_size(const fixed_size_simd<_Tp, _Np>& __x)
+{
+  return __x;
+}
+
+template <typename _Tp, int _Np>
+_GLIBCXX_SIMD_INTRINSIC fixed_size_simd_mask<_Tp, _Np>
+to_fixed_size(const fixed_size_simd_mask<_Tp, _Np>& __x)
+{
+  return __x;
+}
+
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC auto
+to_fixed_size(const simd<_Tp, _Ap>& __x)
+{
+  return simd<_Tp, simd_abi::fixed_size<simd_size_v<_Tp, _Ap>>>([&__x](
+    auto __i) constexpr { return __x[__i]; });
+}
+
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC auto
+to_fixed_size(const simd_mask<_Tp, _Ap>& __x)
+{
+  constexpr int _Np = simd_mask<_Tp, _Ap>::size();
+  fixed_size_simd_mask<_Tp, _Np> __r;
+  __execute_n_times<_Np>([&](auto __i) constexpr { __r[__i] = __x[__i]; });
+  return __r;
+}
+
+// to_native {{{2
+template <typename _Tp, int _Np>
+_GLIBCXX_SIMD_INTRINSIC
+  enable_if_t<(_Np == native_simd<_Tp>::size()), native_simd<_Tp>>
+  to_native(const fixed_size_simd<_Tp, _Np>& __x)
+{
+  alignas(memory_alignment_v<native_simd<_Tp>>) _Tp __mem[_Np];
+  __x.copy_to(__mem, vector_aligned);
+  return {__mem, vector_aligned};
+}
+
+template <typename _Tp, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC
+  enable_if_t<(_Np == native_simd_mask<_Tp>::size()), native_simd_mask<_Tp>>
+  to_native(const fixed_size_simd_mask<_Tp, _Np>& __x)
+{
+  return native_simd_mask<_Tp>([&](auto __i) constexpr { return __x[__i]; });
+}
+
+// to_compatible {{{2
+template <typename _Tp, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC enable_if_t<(_Np == simd<_Tp>::size()), simd<_Tp>>
+to_compatible(const simd<_Tp, simd_abi::fixed_size<_Np>>& __x)
+{
+  alignas(memory_alignment_v<simd<_Tp>>) _Tp __mem[_Np];
+  __x.copy_to(__mem, vector_aligned);
+  return {__mem, vector_aligned};
+}
+
+template <typename _Tp, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC
+  enable_if_t<(_Np == simd_mask<_Tp>::size()), simd_mask<_Tp>>
+  to_compatible(const simd_mask<_Tp, simd_abi::fixed_size<_Np>>& __x)
+{
+  return simd_mask<_Tp>([&](auto __i) constexpr { return __x[__i]; });
+}
+
+// masked assignment [simd_mask.where] {{{1
+
+// where_expression {{{1
+template <typename _M, typename _Tp> class const_where_expression //{{{2
+{
+  using _V = _Tp;
+  static_assert(std::is_same_v<_V, __remove_cvref_t<_Tp>>);
+  struct Wrapper
+  {
+    using value_type = _V;
+  };
+
+protected:
+  using _Impl = typename _V::_Impl;
+
+  using value_type = typename std::conditional_t<std::is_arithmetic<_V>::value,
+						 Wrapper, _V>::value_type;
+  _GLIBCXX_SIMD_INTRINSIC friend const _M&
+  __get_mask(const const_where_expression& __x)
+  {
+    return __x._M_k;
+  }
+  _GLIBCXX_SIMD_INTRINSIC friend const _Tp&
+  __get_lvalue(const const_where_expression& __x)
+  {
+    return __x._M_value;
+  }
+  const _M& _M_k;
+  _Tp& _M_value;
+
+public:
+  const_where_expression(const const_where_expression&) = delete;
+  const_where_expression& operator=(const const_where_expression&) = delete;
+
+  _GLIBCXX_SIMD_INTRINSIC const_where_expression(const _M& __kk, const _Tp& __dd)
+    : _M_k(__kk), _M_value(const_cast<_Tp&>(__dd))
+  {}
+
+  _GLIBCXX_SIMD_INTRINSIC _V operator-() const&&
+  {
+    return {__private_init,
+	    _Impl::template __masked_unary<std::negate>(__data(_M_k),
+							__data(_M_value))};
+  }
+
+  template <typename _Up, typename _Flags>
+  [[nodiscard]] _GLIBCXX_SIMD_INTRINSIC _V
+  copy_from(const _LoadStorePtr<_Up, value_type>* __mem, _Flags __f) const&&
+  {
+    return {__private_init,
+	    _Impl::__masked_load(__data(_M_value), __data(_M_k), __mem, __f)};
+  }
+
+  template <typename _Up, typename _Flags>
+  _GLIBCXX_SIMD_INTRINSIC void copy_to(_LoadStorePtr<_Up, value_type>* __mem,
+				       _Flags __f) const&&
+  {
+    _Impl::__masked_store(__data(_M_value), __mem, __f, __data(_M_k));
+  }
+};
+
+template <typename _Tp> class const_where_expression<bool, _Tp> //{{{2
+{
+  using _M = bool;
+  using _V = _Tp;
+  static_assert(std::is_same_v<_V, __remove_cvref_t<_Tp>>);
+  struct Wrapper
+  {
+    using value_type = _V;
+  };
+
+protected:
+  using value_type = typename std::conditional_t<std::is_arithmetic<_V>::value,
+						 Wrapper, _V>::value_type;
+  _GLIBCXX_SIMD_INTRINSIC friend const _M&
+  __get_mask(const const_where_expression& __x)
+  {
+    return __x._M_k;
+  }
+  _GLIBCXX_SIMD_INTRINSIC friend const _Tp&
+  __get_lvalue(const const_where_expression& __x)
+  {
+    return __x._M_value;
+  }
+  const bool _M_k;
+  _Tp& _M_value;
+
+public:
+  const_where_expression(const const_where_expression&) = delete;
+  const_where_expression& operator=(const const_where_expression&) = delete;
+
+  _GLIBCXX_SIMD_INTRINSIC const_where_expression(const bool __kk, const _Tp& __dd)
+    : _M_k(__kk), _M_value(const_cast<_Tp&>(__dd))
+  {}
+
+  _GLIBCXX_SIMD_INTRINSIC _V operator-() const&&
+  {
+    return _M_k ? -_M_value : _M_value;
+  }
+
+  template <typename _Up, typename _Flags>
+  [[nodiscard]] _GLIBCXX_SIMD_INTRINSIC _V
+  copy_from(const _LoadStorePtr<_Up, value_type>* __mem, _Flags) const&&
+  {
+    return _M_k ? static_cast<_V>(__mem[0]) : _M_value;
+  }
+
+  template <typename _Up, typename _Flags>
+  _GLIBCXX_SIMD_INTRINSIC void copy_to(_LoadStorePtr<_Up, value_type>* __mem,
+				       _Flags) const&&
+  {
+    if (_M_k)
+      {
+	__mem[0] = _M_value;
+      }
+  }
+};
+
+// where_expression {{{2
+template <typename _M, typename _Tp>
+class where_expression : public const_where_expression<_M, _Tp>
+{
+  using _Impl = typename const_where_expression<_M, _Tp>::_Impl;
+
+  static_assert(!std::is_const<_Tp>::value,
+		"where_expression may only be instantiated with a non-const "
+		"_Tp parameter");
+  using typename const_where_expression<_M, _Tp>::value_type;
+  using const_where_expression<_M, _Tp>::_M_k;
+  using const_where_expression<_M, _Tp>::_M_value;
+  static_assert(
+    std::is_same<typename _M::abi_type, typename _Tp::abi_type>::value, "");
+  static_assert(_M::size() == _Tp::size(), "");
+
+  _GLIBCXX_SIMD_INTRINSIC friend _Tp& __get_lvalue(where_expression& __x)
+  {
+    return __x._M_value;
+  }
+
+public:
+  where_expression(const where_expression&) = delete;
+  where_expression& operator=(const where_expression&) = delete;
+
+  _GLIBCXX_SIMD_INTRINSIC where_expression(const _M& __kk, _Tp& __dd)
+    : const_where_expression<_M, _Tp>(__kk, __dd)
+  {}
+
+  template <typename _Up> _GLIBCXX_SIMD_INTRINSIC void operator=(_Up&& __x) &&
+  {
+    _Impl::__masked_assign(__data(_M_k), __data(_M_value),
+			   __to_value_type_or_member_type<_Tp>(
+			     static_cast<_Up&&>(__x)));
+  }
+
+#define _GLIBCXX_SIMD_OP_(__op, __name)                                        \
+  template <typename _Up>                                                      \
+  _GLIBCXX_SIMD_INTRINSIC void operator __op##=(_Up&& __x)&&                   \
+  {                                                                            \
+    _Impl::template __masked_cassign(                                          \
+      __data(_M_k), __data(_M_value),                                          \
+      __to_value_type_or_member_type<_Tp>(static_cast<_Up&&>(__x)),            \
+      [](auto __impl, auto __lhs, auto __rhs) constexpr {                      \
+	return __impl.__name(__lhs, __rhs);                                    \
+      });                                                                      \
+  }                                                                            \
+  static_assert(true)
+  _GLIBCXX_SIMD_OP_(+, __plus);
+  _GLIBCXX_SIMD_OP_(-, __minus);
+  _GLIBCXX_SIMD_OP_(*, __multiplies);
+  _GLIBCXX_SIMD_OP_(/, __divides);
+  _GLIBCXX_SIMD_OP_(%, __modulus);
+  _GLIBCXX_SIMD_OP_(&, __bit_and);
+  _GLIBCXX_SIMD_OP_(|, __bit_or);
+  _GLIBCXX_SIMD_OP_(^, __bit_xor);
+  _GLIBCXX_SIMD_OP_(<<, __shift_left);
+  _GLIBCXX_SIMD_OP_(>>, __shift_right);
+#undef _GLIBCXX_SIMD_OP_
+
+  _GLIBCXX_SIMD_INTRINSIC void operator++() &&
+  {
+    __data(_M_value)
+      = _Impl::template __masked_unary<__increment>(__data(_M_k),
+						    __data(_M_value));
+  }
+  _GLIBCXX_SIMD_INTRINSIC void operator++(int) &&
+  {
+    __data(_M_value)
+      = _Impl::template __masked_unary<__increment>(__data(_M_k),
+						    __data(_M_value));
+  }
+  _GLIBCXX_SIMD_INTRINSIC void operator--() &&
+  {
+    __data(_M_value)
+      = _Impl::template __masked_unary<__decrement>(__data(_M_k),
+						    __data(_M_value));
+  }
+  _GLIBCXX_SIMD_INTRINSIC void operator--(int) &&
+  {
+    __data(_M_value)
+      = _Impl::template __masked_unary<__decrement>(__data(_M_k),
+						    __data(_M_value));
+  }
+
+  // intentionally hides const_where_expression::copy_from
+  template <typename _Up, typename _Flags>
+  _GLIBCXX_SIMD_INTRINSIC void
+  copy_from(const _LoadStorePtr<_Up, value_type>* __mem, _Flags __f) &&
+  {
+    __data(_M_value)
+      = _Impl::__masked_load(__data(_M_value), __data(_M_k), __mem, __f);
+  }
+};
+
+// where_expression<bool> {{{2
+template <typename _Tp>
+class where_expression<bool, _Tp> : public const_where_expression<bool, _Tp>
+{
+  using _M = bool;
+  using typename const_where_expression<_M, _Tp>::value_type;
+  using const_where_expression<_M, _Tp>::_M_k;
+  using const_where_expression<_M, _Tp>::_M_value;
+
+public:
+  where_expression(const where_expression&) = delete;
+  where_expression& operator=(const where_expression&) = delete;
+
+  _GLIBCXX_SIMD_INTRINSIC where_expression(const _M& __kk, _Tp& __dd)
+    : const_where_expression<_M, _Tp>(__kk, __dd)
+  {}
+
+#define _GLIBCXX_SIMD_OP_(__op)                                                \
+  template <typename _Up>                                                      \
+  _GLIBCXX_SIMD_INTRINSIC void operator __op(_Up&& __x)&&                      \
+  {                                                                            \
+    if (_M_k)                                                                  \
+      {                                                                        \
+	_M_value __op static_cast<_Up&&>(__x);                                 \
+      }                                                                        \
+  }                                                                            \
+  static_assert(true)
+  _GLIBCXX_SIMD_OP_(=);
+  _GLIBCXX_SIMD_OP_(+=);
+  _GLIBCXX_SIMD_OP_(-=);
+  _GLIBCXX_SIMD_OP_(*=);
+  _GLIBCXX_SIMD_OP_(/=);
+  _GLIBCXX_SIMD_OP_(%=);
+  _GLIBCXX_SIMD_OP_(&=);
+  _GLIBCXX_SIMD_OP_(|=);
+  _GLIBCXX_SIMD_OP_(^=);
+  _GLIBCXX_SIMD_OP_(<<=);
+  _GLIBCXX_SIMD_OP_(>>=);
+#undef _GLIBCXX_SIMD_OP_
+  _GLIBCXX_SIMD_INTRINSIC void operator++() &&
+  {
+    if (_M_k)
+      {
+	++_M_value;
+      }
+  }
+  _GLIBCXX_SIMD_INTRINSIC void operator++(int) &&
+  {
+    if (_M_k)
+      {
+	++_M_value;
+      }
+  }
+  _GLIBCXX_SIMD_INTRINSIC void operator--() &&
+  {
+    if (_M_k)
+      {
+	--_M_value;
+      }
+  }
+  _GLIBCXX_SIMD_INTRINSIC void operator--(int) &&
+  {
+    if (_M_k)
+      {
+	--_M_value;
+      }
+  }
+
+  // intentionally hides const_where_expression::copy_from
+  template <typename _Up, typename _Flags>
+  _GLIBCXX_SIMD_INTRINSIC void
+  copy_from(const _LoadStorePtr<_Up, value_type>* __mem, _Flags) &&
+  {
+    if (_M_k)
+      {
+	_M_value = __mem[0];
+      }
+  }
+};
+
+// where_expression<_M, tuple<...>> {{{2
+
+// where {{{1
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC where_expression<simd_mask<_Tp, _Ap>, simd<_Tp, _Ap>>
+where(const typename simd<_Tp, _Ap>::mask_type& __k, simd<_Tp, _Ap>& __value)
+{
+  return {__k, __value};
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC
+  const_where_expression<simd_mask<_Tp, _Ap>, simd<_Tp, _Ap>>
+  where(const typename simd<_Tp, _Ap>::mask_type& __k,
+	const simd<_Tp, _Ap>& __value)
+{
+  return {__k, __value};
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC
+  where_expression<simd_mask<_Tp, _Ap>, simd_mask<_Tp, _Ap>>
+  where(const std::remove_const_t<simd_mask<_Tp, _Ap>>& __k,
+	simd_mask<_Tp, _Ap>& __value)
+{
+  return {__k, __value};
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC
+  const_where_expression<simd_mask<_Tp, _Ap>, simd_mask<_Tp, _Ap>>
+  where(const std::remove_const_t<simd_mask<_Tp, _Ap>>& __k,
+	const simd_mask<_Tp, _Ap>& __value)
+{
+  return {__k, __value};
+}
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC where_expression<bool, _Tp>
+where(_ExactBool __k, _Tp& __value)
+{
+  return {__k, __value};
+}
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC const_where_expression<bool, _Tp>
+where(_ExactBool __k, const _Tp& __value)
+{
+  return {__k, __value};
+}
+template <typename _Tp, typename _Ap>
+void
+where(bool __k, simd<_Tp, _Ap>& __value)
+  = delete;
+template <typename _Tp, typename _Ap>
+void
+where(bool __k, const simd<_Tp, _Ap>& __value)
+  = delete;
+
+// proposed mask iterations {{{1
+namespace __proposed {
+template <size_t _Np> class where_range
+{
+  const std::bitset<_Np> __bits;
+
+public:
+  where_range(std::bitset<_Np> __b) : __bits(__b) {}
+
+  class iterator
+  {
+    size_t __mask;
+    size_t __bit;
+
+    _GLIBCXX_SIMD_INTRINSIC void __next_bit()
+    {
+      __bit = __builtin_ctzl(__mask);
+    }
+    _GLIBCXX_SIMD_INTRINSIC void __reset_lsb()
+    {
+      // 01100100 - 1 = 01100011
+      __mask &= (__mask - 1);
+      // __asm__("btr %1,%0" : "+r"(__mask) : "r"(__bit));
+    }
+
+  public:
+    iterator(decltype(__mask) __m) : __mask(__m) { __next_bit(); }
+    iterator(const iterator&) = default;
+    iterator(iterator&&) = default;
+
+    _GLIBCXX_SIMD_ALWAYS_INLINE size_t operator->() const { return __bit; }
+    _GLIBCXX_SIMD_ALWAYS_INLINE size_t operator*() const { return __bit; }
+
+    _GLIBCXX_SIMD_ALWAYS_INLINE iterator& operator++()
+    {
+      __reset_lsb();
+      __next_bit();
+      return *this;
+    }
+    _GLIBCXX_SIMD_ALWAYS_INLINE iterator operator++(int)
+    {
+      iterator __tmp = *this;
+      __reset_lsb();
+      __next_bit();
+      return __tmp;
+    }
+
+    _GLIBCXX_SIMD_ALWAYS_INLINE bool operator==(const iterator& __rhs) const
+    {
+      return __mask == __rhs.__mask;
+    }
+    _GLIBCXX_SIMD_ALWAYS_INLINE bool operator!=(const iterator& __rhs) const
+    {
+      return __mask != __rhs.__mask;
+    }
+  };
+
+  iterator begin() const { return __bits.to_ullong(); }
+  iterator end() const { return 0; }
+};
+
+template <typename _Tp, typename _Ap>
+where_range<simd_size_v<_Tp, _Ap>>
+where(const simd_mask<_Tp, _Ap>& __k)
+{
+  return __k.__to_bitset();
+}
+
+} // namespace __proposed
+
+// }}}1
+// reductions [simd.reductions] {{{1
+template <typename _Tp, typename _Abi, typename _BinaryOperation = std::plus<>>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
+reduce(const simd<_Tp, _Abi>& __v,
+       _BinaryOperation __binary_op = _BinaryOperation())
+{
+  return _Abi::_SimdImpl::__reduce(__v, __binary_op);
+}
+
+template <typename _M, typename _V, typename _BinaryOperation = std::plus<>>
+_GLIBCXX_SIMD_INTRINSIC typename _V::value_type
+reduce(const const_where_expression<_M, _V>& __x,
+       typename _V::value_type __identity_element, _BinaryOperation __binary_op)
+{
+  if (__builtin_expect(none_of(__get_mask(__x)), false))
+    return __identity_element;
+
+  _V __tmp = __identity_element;
+  _V::_Impl::__masked_assign(__data(__get_mask(__x)), __data(__tmp),
+			     __data(__get_lvalue(__x)));
+  return reduce(__tmp, __binary_op);
+}
+
+template <typename _M, typename _V>
+_GLIBCXX_SIMD_INTRINSIC typename _V::value_type
+reduce(const const_where_expression<_M, _V>& __x, std::plus<> __binary_op = {})
+{
+  return reduce(__x, 0, __binary_op);
+}
+
+template <typename _M, typename _V>
+_GLIBCXX_SIMD_INTRINSIC typename _V::value_type
+reduce(const const_where_expression<_M, _V>& __x, std::multiplies<> __binary_op)
+{
+  return reduce(__x, 1, __binary_op);
+}
+
+template <typename _M, typename _V>
+_GLIBCXX_SIMD_INTRINSIC typename _V::value_type
+reduce(const const_where_expression<_M, _V>& __x, std::bit_and<> __binary_op)
+{
+  return reduce(__x, ~typename _V::value_type(), __binary_op);
+}
+
+template <typename _M, typename _V>
+_GLIBCXX_SIMD_INTRINSIC typename _V::value_type
+reduce(const const_where_expression<_M, _V>& __x, std::bit_or<> __binary_op)
+{
+  return reduce(__x, 0, __binary_op);
+}
+
+template <typename _M, typename _V>
+_GLIBCXX_SIMD_INTRINSIC typename _V::value_type
+reduce(const const_where_expression<_M, _V>& __x, std::bit_xor<> __binary_op)
+{
+  return reduce(__x, 0, __binary_op);
+}
+
+// }}}1
+// algorithms [simd.alg] {{{
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
+min(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
+{
+  return {__private_init, _Ap::_SimdImpl::__min(__data(__a), __data(__b))};
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
+max(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
+{
+  return {__private_init, _Ap::_SimdImpl::__max(__data(__a), __data(__b))};
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC
+  _GLIBCXX_SIMD_CONSTEXPR std::pair<simd<_Tp, _Ap>, simd<_Tp, _Ap>>
+  minmax(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
+{
+  const auto __pair_of_members
+    = _Ap::_SimdImpl::__minmax(__data(__a), __data(__b));
+  return {simd<_Tp, _Ap>(__private_init, __pair_of_members.first),
+	  simd<_Tp, _Ap>(__private_init, __pair_of_members.second)};
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
+clamp(const simd<_Tp, _Ap>& __v, const simd<_Tp, _Ap>& __lo,
+      const simd<_Tp, _Ap>& __hi)
+{
+  using _Impl = typename _Ap::_SimdImpl;
+  return {__private_init,
+	  _Impl::__min(__data(__hi), _Impl::__max(__data(__lo), __data(__v)))};
+}
+
+// }}}
+
+namespace _P0918 {
+// shuffle {{{1
+template <int _Stride, int _Offset = 0> struct strided
+{
+  static constexpr int _S_stride = _Stride;
+  static constexpr int _S_offset = _Offset;
+  template <typename _Tp, typename _Ap>
+  using __shuffle_return_type = simd<
+    _Tp, simd_abi::deduce_t<
+	   _Tp, __div_roundup(simd_size_v<_Tp, _Ap> - _Offset, _Stride), _Ap>>;
+  // alternative, always use fixed_size:
+  // fixed_size_simd<_Tp, __div_roundup(simd_size_v<_Tp, _Ap> - _Offset,
+  // _Stride)>;
+  template <typename _Tp> static constexpr auto __src_index(_Tp __dst_index)
+  {
+    return _Offset + __dst_index * _Stride;
+  }
+};
+
+// SFINAE for the return type ensures _P is a type that provides the alias
+// template member __shuffle_return_type and the static member function
+// __src_index
+template <typename _P, typename _Tp, typename _Ap,
+	  typename _R = typename _P::template __shuffle_return_type<_Tp, _Ap>,
+	  typename
+	  = decltype(_P::__src_index(std::experimental::_SizeConstant<0>()))>
+_GLIBCXX_SIMD_INTRINSIC _R
+shuffle(const simd<_Tp, _Ap>& __x)
+{
+  return _R([&__x](auto __i) constexpr { return __x[_P::__src_index(__i)]; });
+}
+
+// }}}1
+} // namespace _P0918
+
+namespace __proposed {
+using namespace _P0918;
+} // namespace __proposed
+
+template <size_t... _Sizes, typename _Tp, typename _Ap,
+	  typename = enable_if_t<((_Sizes + ...) == simd<_Tp, _Ap>::size())>>
+inline std::tuple<simd<_Tp, simd_abi::deduce_t<_Tp, _Sizes>>...>
+split(const simd<_Tp, _Ap>&);
+
+// __extract_part {{{
+template <int _Index, int _Total, int _Combine = 1, typename _Tp, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC
+  _GLIBCXX_CONST _SimdWrapper<_Tp, _Np / _Total * _Combine>
+  __extract_part(const _SimdWrapper<_Tp, _Np> __x);
+
+template <int _Index, int _Parts, int _Combine = 1, typename _Tp, typename _A0,
+	  typename... _As>
+_GLIBCXX_SIMD_INTRINSIC auto
+__extract_part(const _SimdTuple<_Tp, _A0, _As...>& __x);
+
+// }}}
+// _SizeList {{{
+template <size_t _V0, size_t... _Values> struct _SizeList
+{
+  template <size_t _I> static constexpr size_t __at(_SizeConstant<_I> = {})
+  {
+    if constexpr (_I == 0)
+      {
+	return _V0;
+      }
+    else
+      {
+	return _SizeList<_Values...>::template __at<_I - 1>();
+      }
+  }
+
+  template <size_t _I> static constexpr auto __before(_SizeConstant<_I> = {})
+  {
+    if constexpr (_I == 0)
+      {
+	return _SizeConstant<0>();
+      }
+    else
+      {
+	return _SizeConstant<
+	  _V0 + _SizeList<_Values...>::template __before<_I - 1>()>();
+      }
+  }
+
+  template <size_t _Np>
+  static constexpr auto __pop_front(_SizeConstant<_Np> = {})
+  {
+    if constexpr (_Np == 0)
+      {
+	return _SizeList();
+      }
+    else
+      {
+	return _SizeList<_Values...>::template __pop_front<_Np - 1>();
+      }
+  }
+};
+// }}}
+// __extract_center {{{
+template <typename _Tp, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC _SimdWrapper<_Tp, _Np / 2>
+__extract_center(_SimdWrapper<_Tp, _Np> __x)
+{
+  static_assert(_Np >= 4);
+  static_assert(_Np % 4 == 0); // x0 - x1 - x2 - x3 -> return {x1, x2}
+#if _GLIBCXX_SIMD_X86INTRIN    // {{{
+  if constexpr (__have_avx512f && sizeof(_Tp) * _Np == 64)
+    {
+      const auto __intrin = __to_intrin(__x);
+      if constexpr (std::is_integral_v<_Tp>)
+	return __vector_bitcast<_Tp>(_mm512_castsi512_si256(
+	  _mm512_shuffle_i32x4(__intrin, __intrin,
+			       1 + 2 * 0x4 + 2 * 0x10 + 3 * 0x40)));
+      else if constexpr (sizeof(_Tp) == 4)
+	return __vector_bitcast<_Tp>(_mm512_castps512_ps256(
+	  _mm512_shuffle_f32x4(__intrin, __intrin,
+			       1 + 2 * 0x4 + 2 * 0x10 + 3 * 0x40)));
+      else if constexpr (sizeof(_Tp) == 8)
+	return __vector_bitcast<_Tp>(_mm512_castpd512_pd256(
+	  _mm512_shuffle_f64x2(__intrin, __intrin,
+			       1 + 2 * 0x4 + 2 * 0x10 + 3 * 0x40)));
+      else
+	__assert_unreachable<_Tp>();
+    }
+  else if constexpr (sizeof(_Tp) * _Np == 32 && std::is_floating_point_v<_Tp>)
+    return __vector_bitcast<_Tp>(
+      _mm_shuffle_pd(__lo128(__vector_bitcast<double>(__x)),
+		     __hi128(__vector_bitcast<double>(__x)), 1));
+  else if constexpr (sizeof(__x) == 32 && sizeof(_Tp) * _Np <= 32)
+    return __vector_bitcast<_Tp>(
+      _mm_alignr_epi8(__hi128(__vector_bitcast<_LLong>(__x)),
+		      __lo128(__vector_bitcast<_LLong>(__x)),
+		      sizeof(_Tp) * _Np / 4));
+  else
+#endif // _GLIBCXX_SIMD_X86INTRIN }}}
+    {
+      __vector_type_t<_Tp, _Np / 2> __r;
+      __builtin_memcpy(&__r,
+		       reinterpret_cast<const char*>(&__x)
+			 + sizeof(_Tp) * _Np / 4,
+		       sizeof(_Tp) * _Np / 2);
+      return __r;
+    }
+}
+
+template <typename _Tp, typename _A0, typename... _As>
+_GLIBCXX_SIMD_INTRINSIC
+  _SimdWrapper<_Tp, _SimdTuple<_Tp, _A0, _As...>::size() / 2>
+  __extract_center(const _SimdTuple<_Tp, _A0, _As...>& __x)
+{
+  if constexpr (sizeof...(_As) == 0)
+    return __extract_center(__x.first);
+  else
+    return __extract_part<1, 4, 2>(__x);
+}
+
+// }}}
+// __split_wrapper {{{
+template <size_t... _Sizes, typename _Tp, typename... _As>
+auto
+__split_wrapper(_SizeList<_Sizes...>, const _SimdTuple<_Tp, _As...>& __x)
+{
+  return std::experimental::split<_Sizes...>(
+    fixed_size_simd<_Tp, _SimdTuple<_Tp, _As...>::size()>(__private_init, __x));
+}
+
+// }}}
+
+// split<simd>(simd) {{{
+template <typename _V, typename _Ap,
+	  size_t Parts = simd_size_v<typename _V::value_type, _Ap> / _V::size()>
+inline enable_if_t<
+  (is_simd<_V>::value
+   && simd_size_v<typename _V::value_type, _Ap> == Parts * _V::size()),
+  std::array<_V, Parts>>
+split(const simd<typename _V::value_type, _Ap>& __x)
+{
+  using _Tp = typename _V::value_type;
+  if constexpr (Parts == 1)
+    {
+      return {simd_cast<_V>(__x)};
+    }
+  else if (__x._M_is_constprop())
+    {
+      return __generate_from_n_evaluations<Parts, std::array<_V, Parts>>([&](
+	auto __i) constexpr {
+	return _V([&](auto __j) constexpr {
+	  return __x[__i * _V::size() + __j];
+	});
+      });
+    }
+  else if constexpr (
+      __is_fixed_size_abi_v<_Ap>
+      && (std::is_same_v<typename _V::abi_type, simd_abi::scalar>
+	|| (__is_fixed_size_abi_v<typename _V::abi_type>
+	  && sizeof(_V) == sizeof(_Tp) * _V::size() // _V doesn't have padding
+	  )))
+    {
+      // fixed_size -> fixed_size (w/o padding) or scalar
+#ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
+      const __may_alias<_Tp>* const __element_ptr
+	= reinterpret_cast<const __may_alias<_Tp>*>(&__data(__x));
+      return __generate_from_n_evaluations<Parts, std::array<_V, Parts>>([&](
+	auto __i) constexpr {
+	return _V(__element_ptr + __i * _V::size(), vector_aligned);
+      });
+#else
+      const auto& __xx = __data(__x);
+      return __generate_from_n_evaluations<Parts, std::array<_V, Parts>>([&](
+	auto __i) constexpr {
+	[[maybe_unused]] constexpr size_t __offset
+	  = decltype(__i)::value * _V::size();
+	return _V([&](auto __j) constexpr {
+	  constexpr _SizeConstant<__j + __offset> __k;
+	  return __xx[__k];
+	});
+      });
+#endif
+    }
+  else if constexpr (std::is_same_v<typename _V::abi_type, simd_abi::scalar>)
+    {
+      // normally memcpy should work here as well
+      return __generate_from_n_evaluations<Parts, std::array<_V, Parts>>([&](
+	auto __i) constexpr { return __x[__i]; });
+    }
+  else
+    {
+      return __generate_from_n_evaluations<Parts, std::array<_V, Parts>>([&](
+	auto __i) constexpr {
+	if constexpr (__is_fixed_size_abi_v<typename _V::abi_type>)
+	  {
+	    return _V([&](auto __j) constexpr {
+	      return __x[__i * _V::size() + __j];
+	    });
+	  }
+	else
+	  {
+	    return _V(__private_init,
+		      __extract_part<decltype(__i)::value, Parts>(__data(__x)));
+	  }
+      });
+    }
+}
+
+// }}}
+// split<simd_mask>(simd_mask) {{{
+template <typename _V, typename _Ap,
+	  size_t _Parts
+	  = simd_size_v<typename _V::simd_type::value_type, _Ap> / _V::size()>
+enable_if_t<
+  (is_simd_mask_v<_V>
+   && simd_size_v<typename _V::simd_type::value_type, _Ap> == _Parts * _V::size()),
+  std::array<_V, _Parts>>
+split(const simd_mask<typename _V::simd_type::value_type, _Ap>& __x)
+{
+  if constexpr (std::is_same_v<_Ap, typename _V::abi_type>)
+    {
+      return {__x};
+    }
+  else if constexpr (_Parts == 1)
+    {
+      return {__proposed::static_simd_cast<_V>(__x)};
+    }
+  else if constexpr (_Parts == 2 && __is_sse_abi<typename _V::abi_type>()
+		     && __is_avx_abi<_Ap>())
+    {
+      return {_V(__private_init, __lo128(__data(__x))),
+	      _V(__private_init, __hi128(__data(__x)))};
+    }
+  else if constexpr (_V::size() <= CHAR_BIT * sizeof(_ULLong))
+    {
+      const std::bitset __bits = __x.__to_bitset();
+      return __generate_from_n_evaluations<_Parts, std::array<_V, _Parts>>([&](
+	auto __i) constexpr {
+	constexpr size_t __offset = __i * _V::size();
+	return _V(__bitset_init, (__bits >> __offset).to_ullong());
+      });
+    }
+  else
+    {
+      return __generate_from_n_evaluations<_Parts, std::array<_V, _Parts>>([&](
+	auto __i) constexpr {
+	constexpr size_t __offset = __i * _V::size();
+	return _V(
+	  __private_init, [&](auto __j) constexpr {
+	    return __x[__j + __offset];
+	  });
+      });
+    }
+}
+
+// }}}
+// split<_Sizes...>(simd) {{{
+template <size_t... _Sizes, typename _Tp, typename _Ap, typename>
+_GLIBCXX_SIMD_ALWAYS_INLINE
+  std::tuple<simd<_Tp, simd_abi::deduce_t<_Tp, _Sizes>>...>
+  split(const simd<_Tp, _Ap>& __x)
+{
+  using _SL = _SizeList<_Sizes...>;
+  using _Tuple = std::tuple<__deduced_simd<_Tp, _Sizes>...>;
+  constexpr size_t _Np = simd_size_v<_Tp, _Ap>;
+  constexpr size_t _N0 = _SL::template __at<0>();
+  using _V = __deduced_simd<_Tp, _N0>;
+
+  if (__x._M_is_constprop())
+    return __generate_from_n_evaluations<sizeof...(_Sizes), _Tuple>([&](
+      auto __i) constexpr {
+      using _Vi = __deduced_simd<_Tp, _SL::__at(__i)>;
+      constexpr size_t __offset = _SL::__before(__i);
+      return _Vi([&](auto __j) constexpr { return __x[__offset + __j]; });
+    });
+  else if constexpr (_Np == _N0)
+    {
+      static_assert(sizeof...(_Sizes) == 1);
+      return {simd_cast<_V>(__x)};
+    }
+  else if constexpr // split from fixed_size, such that __x::first.size == _N0
+    (__is_fixed_size_abi_v<_Ap>
+     && __fixed_size_storage_t<_Tp, _Np>::_S_first_size == _N0)
+    {
+      static_assert(!__is_fixed_size_abi_v<typename _V::abi_type>,
+		    "How can <_Tp, _Np> be a single _SimdTuple entry but a "
+		    "fixed_size_simd "
+		    "when deduced?");
+      // extract first and recurse (__split_wrapper is needed to deduce a new
+      // _Sizes pack)
+      return std::tuple_cat(
+	std::make_tuple(_V(__private_init, __data(__x).first)),
+	__split_wrapper(_SL::template __pop_front<1>(), __data(__x).second));
+    }
+  else if constexpr ((!std::is_same_v<simd_abi::scalar,
+				      simd_abi::deduce_t<_Tp, _Sizes>> && ...)
+		     && (!__is_fixed_size_abi_v<
+			   simd_abi::deduce_t<_Tp, _Sizes>> && ...))
+    {
+      if constexpr (((_Sizes * 2 == _Np) && ...))
+	return {{__private_init, __extract_part<0, 2>(__data(__x))},
+		{__private_init, __extract_part<1, 2>(__data(__x))}};
+      else if constexpr (std::is_same_v<_SizeList<_Sizes...>,
+					_SizeList<_Np / 3, _Np / 3, _Np / 3>>)
+	return {{__private_init, __extract_part<0, 3>(__data(__x))},
+		{__private_init, __extract_part<1, 3>(__data(__x))},
+		{__private_init, __extract_part<2, 3>(__data(__x))}};
+      else if constexpr (std::is_same_v<_SizeList<_Sizes...>,
+					_SizeList<2 * _Np / 3, _Np / 3>>)
+	return {{__private_init, __extract_part<0, 3, 2>(__data(__x))},
+		{__private_init, __extract_part<2, 3>(__data(__x))}};
+      else if constexpr (std::is_same_v<_SizeList<_Sizes...>,
+					_SizeList<_Np / 3, 2 * _Np / 3>>)
+	return {{__private_init, __extract_part<0, 3>(__data(__x))},
+		{__private_init, __extract_part<1, 3, 2>(__data(__x))}};
+      else if constexpr (std::is_same_v<_SizeList<_Sizes...>,
+					_SizeList<_Np / 2, _Np / 4, _Np / 4>>)
+	return {{__private_init, __extract_part<0, 2>(__data(__x))},
+		{__private_init, __extract_part<2, 4>(__data(__x))},
+		{__private_init, __extract_part<3, 4>(__data(__x))}};
+      else if constexpr (std::is_same_v<_SizeList<_Sizes...>,
+					_SizeList<_Np / 4, _Np / 4, _Np / 2>>)
+	return {{__private_init, __extract_part<0, 4>(__data(__x))},
+		{__private_init, __extract_part<1, 4>(__data(__x))},
+		{__private_init, __extract_part<1, 2>(__data(__x))}};
+      else if constexpr (std::is_same_v<_SizeList<_Sizes...>,
+					_SizeList<_Np / 4, _Np / 2, _Np / 4>>)
+	return {{__private_init, __extract_part<0, 4>(__data(__x))},
+		{__private_init, __extract_center(__data(__x))},
+		{__private_init, __extract_part<3, 4>(__data(__x))}};
+      else if constexpr (((_Sizes * 4 == _Np) && ...))
+	return {{__private_init, __extract_part<0, 4>(__data(__x))},
+		{__private_init, __extract_part<1, 4>(__data(__x))},
+		{__private_init, __extract_part<2, 4>(__data(__x))},
+		{__private_init, __extract_part<3, 4>(__data(__x))}};
+      // else fall through
+    }
+#ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
+  const __may_alias<_Tp>* const __element_ptr
+    = reinterpret_cast<const __may_alias<_Tp>*>(&__x);
+  return __generate_from_n_evaluations<sizeof...(_Sizes), _Tuple>([&](
+    auto __i) constexpr {
+    using _Vi = __deduced_simd<_Tp, _SL::__at(__i)>;
+    constexpr size_t __offset = _SL::__before(__i);
+    constexpr size_t __base_align = alignof(simd<_Tp, _Ap>);
+    constexpr size_t __a
+      = __base_align - ((__offset * sizeof(_Tp)) % __base_align);
+    constexpr size_t __b = ((__a - 1) & __a) ^ __a;
+    constexpr size_t __alignment = __b == 0 ? __a : __b;
+    return _Vi(__element_ptr + __offset, overaligned<__alignment>);
+  });
+#else
+  return __generate_from_n_evaluations<sizeof...(_Sizes), _Tuple>([&](
+    auto __i) constexpr {
+    using _Vi = __deduced_simd<_Tp, _SL::__at(__i)>;
+    const auto& __xx = __data(__x);
+    using _Offset = decltype(_SL::__before(__i));
+    return _Vi([&](auto __j) constexpr {
+      constexpr _SizeConstant<_Offset::value + __j> __k;
+      return __xx[__k];
+    });
+  });
+#endif
+}
+
+// }}}
+
+// __subscript_in_pack {{{
+template <size_t _I, typename _Tp, typename _Ap, typename... _As>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__subscript_in_pack(const simd<_Tp, _Ap>& __x, const simd<_Tp, _As>&... __xs)
+{
+  if constexpr (_I < simd_size_v<_Tp, _Ap>)
+    return __x[_I];
+  else
+    return __subscript_in_pack<_I - simd_size_v<_Tp, _Ap>>(__xs...);
+}
+
+// }}}
+// __store_pack_of_simd {{{
+template <typename _Tp, typename _A0, typename... _As>
+_GLIBCXX_SIMD_INTRINSIC void
+__store_pack_of_simd(char* __mem, const simd<_Tp, _A0>& __x0,
+		     const simd<_Tp, _As>&... __xs)
+{
+  constexpr size_t __n_bytes = sizeof(_Tp) * simd_size_v<_Tp, _A0>;
+  __builtin_memcpy(__mem, &__data(__x0), __n_bytes);
+  if constexpr (sizeof...(__xs) > 0)
+    __store_pack_of_simd(__mem + __n_bytes, __xs...);
+}
+
+// }}}
+// concat(simd...) {{{
+template <typename _Tp, typename... _As>
+inline _GLIBCXX_SIMD_CONSTEXPR
+simd<_Tp, simd_abi::deduce_t<_Tp, (simd_size_v<_Tp, _As> + ...)>>
+concat(const simd<_Tp, _As>&... __xs)
+{
+  using _Rp = __deduced_simd<_Tp, (simd_size_v<_Tp, _As> + ...)>;
+  if constexpr(sizeof...(__xs) == 1)
+    return simd_cast<_Rp>(__xs...);
+  else if ((... && __xs._M_is_constprop()))
+    return simd<_Tp,
+		simd_abi::deduce_t<_Tp, (simd_size_v<_Tp, _As> + ...)>>([&](
+      auto __i) constexpr { return __subscript_in_pack<__i>(__xs...); });
+  else
+    {
+      _Rp __r{};
+      __store_pack_of_simd(reinterpret_cast<char*>(&__data(__r)), __xs...);
+      return __r;
+    }
+}
+
+// }}}
+// concat(array<simd>) {{{
+template <typename _Tp, typename _Abi, size_t _Np>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR
+__deduced_simd<_Tp, simd_size_v<_Tp, _Abi> * _Np>
+concat(const std::array<simd<_Tp, _Abi>, _Np>& __x)
+{
+  return __call_with_subscripts<_Np>(__x, [](const auto&... __xs) {
+    return concat(__xs...);
+  });
+}
+
+// }}}
+
+// _SmartReference {{{
+template <typename _Up, typename _Accessor = _Up,
+	  typename _ValueType = typename _Up::value_type>
+class _SmartReference
+{
+  friend _Accessor;
+  int _M_index;
+  _Up& _M_obj;
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr _ValueType __read() const noexcept
+  {
+    if constexpr (std::is_arithmetic_v<_Up>)
+      {
+	_GLIBCXX_DEBUG_ASSERT(_M_index == 0);
+	return _M_obj;
+      }
+    else
+      {
+	return _M_obj[_M_index];
+      }
+  }
+
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr void __write(_Tp&& __x) const
+  {
+    _Accessor::__set(_M_obj, _M_index, static_cast<_Tp&&>(__x));
+  }
+
+public:
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference(_Up& __o, int __i) noexcept
+    : _M_index(__i), _M_obj(__o)
+  {}
+
+  using value_type = _ValueType;
+
+  _GLIBCXX_SIMD_INTRINSIC _SmartReference(const _SmartReference&) = delete;
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr operator value_type() const noexcept
+  {
+    return __read();
+  }
+
+  template <typename _Tp,
+	    typename = _ValuePreservingOrInt<__remove_cvref_t<_Tp>, value_type>>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference operator=(_Tp&& __x) &&
+  {
+    __write(static_cast<_Tp&&>(__x));
+    return {_M_obj, _M_index};
+  }
+
+  // TODO: improve with operator.()
+
+#define _GLIBCXX_SIMD_OP_(__op)                                                \
+  template <typename _Tp,                                                      \
+	    typename _TT                                                       \
+	    = decltype(std::declval<value_type>() __op std::declval<_Tp>()),   \
+	    typename = _ValuePreservingOrInt<__remove_cvref_t<_Tp>, _TT>,      \
+	    typename = _ValuePreservingOrInt<_TT, value_type>>                 \
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference operator __op##=(          \
+    _Tp&& __x)&&                                                               \
+  {                                                                            \
+    const value_type& __lhs = __read();                                        \
+    __write(__lhs __op __x);                                                   \
+    return {_M_obj, _M_index};                                                 \
+  }
+  _GLIBCXX_SIMD_ALL_ARITHMETICS(_GLIBCXX_SIMD_OP_);
+  _GLIBCXX_SIMD_ALL_SHIFTS(_GLIBCXX_SIMD_OP_);
+  _GLIBCXX_SIMD_ALL_BINARY(_GLIBCXX_SIMD_OP_);
+#undef _GLIBCXX_SIMD_OP_
+
+  template <typename _Tp = void,
+	    typename = decltype(
+	      ++std::declval<std::conditional_t<true, value_type, _Tp>&>())>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference operator++() &&
+  {
+    value_type __x = __read();
+    __write(++__x);
+    return {_M_obj, _M_index};
+  }
+
+  template <typename _Tp = void,
+	    typename = decltype(
+	      std::declval<std::conditional_t<true, value_type, _Tp>&>()++)>
+  _GLIBCXX_SIMD_INTRINSIC constexpr value_type operator++(int) &&
+  {
+    const value_type __r = __read();
+    value_type __x = __r;
+    __write(++__x);
+    return __r;
+  }
+
+  template <typename _Tp = void,
+	    typename = decltype(
+	      --std::declval<std::conditional_t<true, value_type, _Tp>&>())>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference operator--() &&
+  {
+    value_type __x = __read();
+    __write(--__x);
+    return {_M_obj, _M_index};
+  }
+
+  template <typename _Tp = void,
+	    typename = decltype(
+	      std::declval<std::conditional_t<true, value_type, _Tp>&>()--)>
+  _GLIBCXX_SIMD_INTRINSIC constexpr value_type operator--(int) &&
+  {
+    const value_type __r = __read();
+    value_type __x = __r;
+    __write(--__x);
+    return __r;
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC friend void
+  swap(_SmartReference&& __a, _SmartReference&& __b) noexcept(
+    conjunction<
+      std::is_nothrow_constructible<value_type, _SmartReference&&>,
+      std::is_nothrow_assignable<_SmartReference&&, value_type&&>>::value)
+  {
+    value_type __tmp = static_cast<_SmartReference&&>(__a);
+    static_cast<_SmartReference&&>(__a) = static_cast<value_type>(__b);
+    static_cast<_SmartReference&&>(__b) = std::move(__tmp);
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC friend void
+  swap(value_type& __a, _SmartReference&& __b) noexcept(
+    conjunction<
+      std::is_nothrow_constructible<value_type, value_type&&>,
+      std::is_nothrow_assignable<value_type&, value_type&&>,
+      std::is_nothrow_assignable<_SmartReference&&, value_type&&>>::value)
+  {
+    value_type __tmp(std::move(__a));
+    __a = static_cast<value_type>(__b);
+    static_cast<_SmartReference&&>(__b) = std::move(__tmp);
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC friend void
+  swap(_SmartReference&& __a, value_type& __b) noexcept(
+    conjunction<
+      std::is_nothrow_constructible<value_type, _SmartReference&&>,
+      std::is_nothrow_assignable<value_type&, value_type&&>,
+      std::is_nothrow_assignable<_SmartReference&&, value_type&&>>::value)
+  {
+    value_type __tmp(__a);
+    static_cast<_SmartReference&&>(__a) = std::move(__b);
+    __b = std::move(__tmp);
+  }
+};
+
+// }}}
+// __scalar_abi_wrapper {{{
+template <int _Bytes> struct __scalar_abi_wrapper
+{
+  template <typename _Tp, typename _Abi = simd_abi::scalar>
+  static constexpr bool _S_is_valid_v
+    = _Abi::template _IsValid<_Tp>::value && sizeof(_Tp) == _Bytes;
+};
+
+// }}}
+// __decay_abi metafunction {{{
+template <typename _Tp> struct __decay_abi
+{
+  using type = _Tp;
+};
+template <int _Bytes> struct __decay_abi<__scalar_abi_wrapper<_Bytes>>
+{
+  using type = simd_abi::scalar;
+};
+
+// }}}
+// __full_abi metafunction {{{1
+// Given an ABI tag A where A::_S_is_partial == true, define type such that
+// type::_S_is_partial == false and A::_S_full_size<T> == type::size<T> for
+// all valid T
+template <template <int> class _Abi, int _Bytes, typename _Tp> struct __full_abi
+{
+  static constexpr auto __choose()
+  {
+    using _High = _Abi<__next_power_of_2(_Bytes) / 2>;
+    if constexpr (_High::template _S_is_valid_v<
+		    _Tp> || _Bytes <= sizeof(_Tp) * 2)
+      return _High();
+    else
+      return
+	typename __full_abi<_Abi, __next_power_of_2(_Bytes) / 2, _Tp>::type();
+  }
+  using type = decltype(__choose());
+};
+
+template <int _Bytes, typename _Tp>
+struct __full_abi<__scalar_abi_wrapper, _Bytes, _Tp>
+{
+  using type = simd_abi::scalar;
+};
+
+// _AbiList {{{1
+template <template <int> class...> struct _AbiList
+{
+  template <typename, int> static constexpr bool _S_has_valid_abi = false;
+  template <typename, int> using _FirstValidAbi = void;
+  template <typename, int> using _BestAbi = void;
+};
+
+template <template <int> class _A0, template <int> class... _Rest>
+struct _AbiList<_A0, _Rest...>
+{
+  template <typename _Tp, int _Np>
+  static constexpr bool _S_has_valid_abi
+    = _A0<sizeof(_Tp) * _Np>::template _S_is_valid_v<
+	_Tp> || _AbiList<_Rest...>::template _S_has_valid_abi<_Tp, _Np>;
+
+  template <typename _Tp, int _Np>
+  using _FirstValidAbi = std::conditional_t<
+    _A0<sizeof(_Tp) * _Np>::template _S_is_valid_v<_Tp>,
+    typename __decay_abi<_A0<sizeof(_Tp) * _Np>>::type,
+    typename _AbiList<_Rest...>::template _FirstValidAbi<_Tp, _Np>>;
+
+  template <typename _Tp, int _Np> static constexpr auto __determine_best_abi()
+  {
+    constexpr int _Bytes = sizeof(_Tp) * _Np;
+    if constexpr (_A0<_Bytes>::template _S_is_valid_v<_Tp>)
+      return typename __decay_abi<_A0<_Bytes>>::type{};
+    else
+      {
+	using _B = typename __full_abi<_A0, _Bytes, _Tp>::type;
+	if constexpr (_B::template _S_is_valid_v<
+			_Tp> && _B::template size<_Tp> <= _Np)
+	  return _B{};
+	else
+	  return typename _AbiList<_Rest...>::template _BestAbi<_Tp, _Np>{};
+      }
+  }
+
+  template <typename _Tp, int _Np>
+  using _BestAbi = decltype(__determine_best_abi<_Tp, _Np>());
+};
+
+// }}}1
+
+// the following lists all native ABIs, which makes them accessible to
+// simd_abi::deduce and select_best_vector_type_t (for fixed_size). Order
+// matters: Whatever comes first has higher priority.
+using _AllNativeAbis = _AbiList<simd_abi::_VecBltnBtmsk, simd_abi::_VecBuiltin,
+				__scalar_abi_wrapper>;
+
+// valid _SimdTraits specialization {{{1
+template <typename _Tp, typename _Abi>
+struct _SimdTraits<_Tp, _Abi,
+		   std::void_t<typename _Abi::template _IsValid<_Tp>>>
+  : _Abi::template __traits<_Tp>
+{
+};
+
+// __deduce_impl specializations {{{1
+// try all native ABIs (including scalar) first
+template <typename _Tp, std::size_t _Np>
+struct __deduce_impl<
+  _Tp, _Np, enable_if_t<_AllNativeAbis::template _S_has_valid_abi<_Tp, _Np>>>
+{
+  using type = _AllNativeAbis::_FirstValidAbi<_Tp, _Np>;
+};
+
+// fall back to fixed_size only if scalar and native ABIs don't match
+template <typename _Tp, std::size_t _Np, typename = void>
+struct __deduce_fixed_size_fallback
+{
+};
+template <typename _Tp, std::size_t _Np>
+struct __deduce_fixed_size_fallback<
+  _Tp, _Np, enable_if_t<simd_abi::fixed_size<_Np>::template _S_is_valid_v<_Tp>>>
+{
+  using type = simd_abi::fixed_size<_Np>;
+};
+template <typename _Tp, std::size_t _Np, typename>
+struct __deduce_impl : public __deduce_fixed_size_fallback<_Tp, _Np>
+{
+};
+
+// }}}1
+
+// simd_mask {{{
+template <typename _Tp, typename _Abi>
+class simd_mask : public _SimdTraits<_Tp, _Abi>::_MaskBase
+{
+  // types, tags, and friends {{{
+  using _Traits = _SimdTraits<_Tp, _Abi>;
+  using _MemberType = typename _Traits::_MaskMember;
+  static constexpr _Tp* _S_type_tag = nullptr;
+  friend typename _Traits::_MaskBase;
+  friend class simd<_Tp, _Abi>;       // to construct masks on return
+  friend typename _Traits::_SimdImpl; // to construct masks on return and
+				      // inspect data on masked operations
+public:
+  using _Impl = typename _Traits::_MaskImpl;
+  friend _Impl;
+  // }}}
+  // member types {{{
+  using value_type = bool;
+  using reference = _SmartReference<_MemberType, _Impl, value_type>;
+  using simd_type = simd<_Tp, _Abi>;
+  using abi_type = _Abi;
+
+  // }}}
+  static constexpr size_t size() { return __size_or_zero_v<_Tp, _Abi>; }
+  // constructors & assignment {{{
+  simd_mask() = default;
+  simd_mask(const simd_mask&) = default;
+  simd_mask(simd_mask&&) = default;
+  simd_mask& operator=(const simd_mask&) = default;
+  simd_mask& operator=(simd_mask&&) = default;
+
+  // }}}
+
+  // access to internal representation (suggested extension) {{{
+  _GLIBCXX_SIMD_ALWAYS_INLINE explicit simd_mask(
+    typename _Traits::_MaskCastType __init)
+    : _M_data{__init}
+  {}
+  // conversion to the internal type is done in _MaskBase
+
+  // }}}
+  // bitset interface (extension to be proposed) {{{
+  // TS_FEEDBACK:
+  // Conversion of simd_mask to and from bitset makes it much easier to
+  // interface with other facilities. I suggest adding `static
+  // simd_mask::from_bitset` and `simd_mask::to_bitset`.
+  _GLIBCXX_SIMD_ALWAYS_INLINE static simd_mask
+  __from_bitset(std::bitset<size()> bs)
+  {
+    return {__bitset_init, bs};
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE std::bitset<size()> __to_bitset() const
+  {
+    return _Impl::__to_bits(_M_data)._M_to_bitset();
+  }
+
+  // }}}
+  // explicit broadcast constructor {{{
+  _GLIBCXX_SIMD_ALWAYS_INLINE explicit _GLIBCXX_SIMD_CONSTEXPR
+  simd_mask(value_type __x)
+    : _M_data(_Impl::template __broadcast<_Tp>(__x))
+  {}
+
+  // }}}
+  // implicit type conversion constructor {{{
+#ifdef _GLIBCXX_SIMD_ENABLE_IMPLICIT_MASK_CAST
+  // proposed improvement
+  template <typename _Up, typename _A2,
+	    typename = enable_if_t<simd_size_v<_Up, _A2> == size()>>
+  _GLIBCXX_SIMD_ALWAYS_INLINE explicit(
+    sizeof(_MemberType) != sizeof(typename _SimdTraits<_Up, _A2>::_MaskMember))
+    simd_mask(const simd_mask<_Up, _A2>& __x)
+    : simd_mask(__proposed::static_simd_cast<simd_mask>(__x))
+  {}
+#else
+  // conforming to ISO/IEC 19570:2018
+  template <typename _Up, typename = enable_if_t<conjunction<
+			    is_same<abi_type, simd_abi::fixed_size<size()>>,
+			    is_same<_Up, _Up>>::value>>
+  _GLIBCXX_SIMD_ALWAYS_INLINE
+  simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
+    : _M_data(_Impl::__from_bitmask(__data(__x), _S_type_tag))
+  {}
+#endif
+  // }}}
+  // load constructor {{{
+  template <typename _Flags>
+  _GLIBCXX_SIMD_ALWAYS_INLINE simd_mask(const value_type* __mem, _Flags)
+    : _M_data(_Impl::template __load<_Tp, _Flags>(__mem))
+  {}
+  template <typename _Flags>
+  _GLIBCXX_SIMD_ALWAYS_INLINE simd_mask(const value_type* __mem, simd_mask __k,
+					_Flags __f)
+    : _M_data{}
+  {
+    _M_data = _Impl::__masked_load(_M_data, __k._M_data, __mem, __f);
+  }
+
+  // }}}
+  // loads [simd_mask.load] {{{
+  template <typename _Flags>
+  _GLIBCXX_SIMD_ALWAYS_INLINE void copy_from(const value_type* __mem, _Flags)
+  {
+    _M_data = _Impl::template __load<_Tp, _Flags>(__mem);
+  }
+
+  // }}}
+  // stores [simd_mask.store] {{{
+  template <typename _Flags>
+  _GLIBCXX_SIMD_ALWAYS_INLINE void copy_to(value_type* __mem, _Flags __f) const
+  {
+    _Impl::__store(_M_data, __mem, __f);
+  }
+
+  // }}}
+  // scalar access {{{
+  _GLIBCXX_SIMD_ALWAYS_INLINE reference operator[](size_t __i)
+  {
+    return {_M_data, int(__i)};
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE value_type operator[]([
+    [maybe_unused]] size_t __i) const
+  {
+    if constexpr (__is_scalar_abi<_Abi>())
+      {
+	_GLIBCXX_DEBUG_ASSERT(__i == 0);
+	return _M_data;
+      }
+    else
+      return static_cast<bool>(_M_data[__i]);
+  }
+
+  // }}}
+  // negation {{{
+  _GLIBCXX_SIMD_ALWAYS_INLINE simd_mask operator!() const
+  {
+    return {__private_init, _Impl::__bit_not(_M_data)};
+  }
+
+  // }}}
+  // simd_mask binary operators [simd_mask.binary] {{{
+#ifdef _GLIBCXX_SIMD_ENABLE_IMPLICIT_MASK_CAST
+  // simd_mask<int> && simd_mask<uint> needs disambiguation
+  template <typename _Up, typename _A2,
+	    typename
+	    = enable_if_t<is_convertible_v<simd_mask<_Up, _A2>, simd_mask>>>
+  _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
+  operator&&(const simd_mask& __x, const simd_mask<_Up, _A2>& __y)
+  {
+    return {__private_init,
+	    _Impl::__logical_and(__x._M_data, simd_mask(__y)._M_data)};
+  }
+  template <typename _Up, typename _A2,
+	    typename
+	    = enable_if_t<is_convertible_v<simd_mask<_Up, _A2>, simd_mask>>>
+  _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
+  operator||(const simd_mask& __x, const simd_mask<_Up, _A2>& __y)
+  {
+    return {__private_init,
+	    _Impl::__logical_or(__x._M_data, simd_mask(__y)._M_data)};
+  }
+#endif // _GLIBCXX_SIMD_ENABLE_IMPLICIT_MASK_CAST
+  _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask operator&&(const simd_mask& __x,
+							  const simd_mask& __y)
+  {
+    return {__private_init, _Impl::__logical_and(__x._M_data, __y._M_data)};
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask operator||(const simd_mask& __x,
+							  const simd_mask& __y)
+  {
+    return {__private_init, _Impl::__logical_or(__x._M_data, __y._M_data)};
+  }
+
+  _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask operator&(const simd_mask& __x,
+							 const simd_mask& __y)
+  {
+    return {__private_init, _Impl::__bit_and(__x._M_data, __y._M_data)};
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask operator|(const simd_mask& __x,
+							 const simd_mask& __y)
+  {
+    return {__private_init, _Impl::__bit_or(__x._M_data, __y._M_data)};
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask operator^(const simd_mask& __x,
+							 const simd_mask& __y)
+  {
+    return {__private_init, _Impl::__bit_xor(__x._M_data, __y._M_data)};
+  }
+
+  _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask& operator&=(simd_mask& __x,
+							   const simd_mask& __y)
+  {
+    __x._M_data = _Impl::__bit_and(__x._M_data, __y._M_data);
+    return __x;
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask& operator|=(simd_mask& __x,
+							   const simd_mask& __y)
+  {
+    __x._M_data = _Impl::__bit_or(__x._M_data, __y._M_data);
+    return __x;
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask& operator^=(simd_mask& __x,
+							   const simd_mask& __y)
+  {
+    __x._M_data = _Impl::__bit_xor(__x._M_data, __y._M_data);
+    return __x;
+  }
+
+  // }}}
+  // simd_mask compares [simd_mask.comparison] {{{
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
+  operator==(const simd_mask& __x, const simd_mask& __y)
+  {
+    return !operator!=(__x, __y);
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
+  operator!=(const simd_mask& __x, const simd_mask& __y)
+  {
+    return {__private_init, _Impl::__bit_xor(__x._M_data, __y._M_data)};
+  }
+
+  // }}}
+  // private_init ctor {{{
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  simd_mask(_PrivateInit, typename _Traits::_MaskMember __init)
+    : _M_data(__init)
+  {}
+
+  // }}}
+  // private_init generator ctor {{{
+  template <typename _Fp,
+	    typename = decltype(bool(std::declval<_Fp>()(size_t())))>
+  _GLIBCXX_SIMD_INTRINSIC constexpr simd_mask(_PrivateInit, _Fp&& __gen)
+    : _M_data()
+  {
+    __execute_n_times<size()>(
+      [&](auto __i) constexpr { _Impl::__set(_M_data, __i, __gen(__i)); });
+  }
+
+  // }}}
+  // bitset_init ctor {{{
+  _GLIBCXX_SIMD_INTRINSIC simd_mask(_BitsetInit, std::bitset<size()> __init)
+    : _M_data(
+      _Impl::__from_bitmask(_SanitizedBitMask<size()>(__init), _S_type_tag))
+  {}
+
+  // }}}
+  // __cvt {{{
+  // TS_FEEDBACK:
+  // The conversion operator this implements should be a ctor on simd_mask.
+  // Once you call .__cvt() on a simd_mask it converts conveniently.
+  // A useful variation: add `explicit(sizeof(_Tp) != sizeof(_Up))`
+  struct _CvtProxy
+  {
+    template <typename _Up, typename _A2,
+	      typename
+	      = enable_if_t<simd_size_v<_Up, _A2> == simd_size_v<_Tp, _Abi>>>
+    operator simd_mask<_Up, _A2>() &&
+    {
+      using namespace std::experimental::__proposed;
+      return static_simd_cast<simd_mask<_Up, _A2>>(_M_data);
+    }
+
+    const simd_mask<_Tp, _Abi>& _M_data;
+  };
+  _GLIBCXX_SIMD_INTRINSIC _CvtProxy __cvt() const { return {*this}; }
+  // }}}
+  // operator?: overloads (suggested extension) {{{
+#ifdef __GXX_CONDITIONAL_IS_OVERLOADABLE__
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
+  operator?:(const simd_mask& __k, const simd_mask& __where_true,
+	     const simd_mask& __where_false)
+  {
+    auto __ret = __where_false;
+    _Impl::__masked_assign(__k._M_data, __ret._M_data, __where_true._M_data);
+    return __ret;
+  }
+
+  template <typename _U1, typename _U2,
+	    typename _Rp = simd<common_type_t<_U1, _U2>, _Abi>,
+	    typename = enable_if_t<conjunction_v<
+	      is_convertible<_U1, _Rp>, is_convertible<_U2, _Rp>,
+	      is_convertible<simd_mask, typename _Rp::mask_type>>>>
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend _Rp
+  operator?:(const simd_mask& __k, const _U1& __where_true,
+	     const _U2& __where_false)
+  {
+    _Rp __ret = __where_false;
+    _Rp::_Impl::__masked_assign(__data(
+				  static_cast<typename _Rp::mask_type>(__k)),
+				__data(__ret),
+				__data(static_cast<_Rp>(__where_true)));
+    return __ret;
+  }
+
+#ifdef _GLIBCXX_SIMD_ENABLE_IMPLICIT_MASK_CAST
+  template <typename _Kp, typename _Ak, typename _Up, typename _Au,
+	    typename = enable_if_t<
+	      conjunction_v<is_convertible<simd_mask<_Kp, _Ak>, simd_mask>,
+			    is_convertible<simd_mask<_Up, _Au>, simd_mask>>>>
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
+  operator?:(const simd_mask<_Kp, _Ak>& __k, const simd_mask& __where_true,
+	     const simd_mask<_Up, _Au>& __where_false)
+  {
+    simd_mask __ret = __where_false;
+    _Impl::__masked_assign(simd_mask(__k)._M_data, __ret._M_data,
+			   __where_true._M_data);
+    return __ret;
+  }
+#endif // _GLIBCXX_SIMD_ENABLE_IMPLICIT_MASK_CAST
+#endif // __GXX_CONDITIONAL_IS_OVERLOADABLE__
+  // }}}
+  // _M_is_constprop {{{
+  _GLIBCXX_SIMD_INTRINSIC
+  constexpr bool _M_is_constprop() const
+  {
+    if constexpr (__is_scalar_abi<_Abi>())
+      return __builtin_constant_p(_M_data);
+    else
+      return _M_data._M_is_constprop();
+  }
+
+  // }}}
+
+private:
+  friend const auto& __data<_Tp, abi_type>(const simd_mask&);
+  friend auto& __data<_Tp, abi_type>(simd_mask&);
+  alignas(_Traits::_S_mask_align) _MemberType _M_data;
+};
+
+// }}}
+
+// __data(simd_mask) {{{
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr const auto&
+__data(const simd_mask<_Tp, _Ap>& __x)
+{
+  return __x._M_data;
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto&
+__data(simd_mask<_Tp, _Ap>& __x)
+{
+  return __x._M_data;
+}
+// }}}
+
+// simd_mask reductions [simd_mask.reductions] {{{
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+all_of(const simd_mask<_Tp, _Abi>& __k) noexcept
+{
+  if (__builtin_is_constant_evaluated() || __k._M_is_constprop())
+    {
+      for (size_t __i = 0; __i < simd_size_v<_Tp, _Abi>; ++__i)
+	if (!__k[__i])
+	  return false;
+      return true;
+    }
+  else
+    return _Abi::_MaskImpl::__all_of(__k);
+}
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+any_of(const simd_mask<_Tp, _Abi>& __k) noexcept
+{
+  if (__builtin_is_constant_evaluated() || __k._M_is_constprop())
+    {
+      for (size_t __i = 0; __i < simd_size_v<_Tp, _Abi>; ++__i)
+	if (__k[__i])
+	  return true;
+      return false;
+    }
+  else
+    return _Abi::_MaskImpl::__any_of(__k);
+}
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+none_of(const simd_mask<_Tp, _Abi>& __k) noexcept
+{
+  if (__builtin_is_constant_evaluated() || __k._M_is_constprop())
+    {
+      for (size_t __i = 0; __i < simd_size_v<_Tp, _Abi>; ++__i)
+	if (__k[__i])
+	  return false;
+      return true;
+    }
+  else
+    return _Abi::_MaskImpl::__none_of(__k);
+}
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+some_of(const simd_mask<_Tp, _Abi>& __k) noexcept
+{
+  if (__builtin_is_constant_evaluated() || __k._M_is_constprop())
+    {
+      for (size_t __i = 1; __i < simd_size_v<_Tp, _Abi>; ++__i)
+	if (__k[__i] != __k[__i - 1])
+	  return true;
+      return false;
+    }
+  else
+    return _Abi::_MaskImpl::__some_of(__k);
+}
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR int
+popcount(const simd_mask<_Tp, _Abi>& __k) noexcept
+{
+  if (__builtin_is_constant_evaluated() || __k._M_is_constprop())
+    {
+      int __r = 0;
+      for (size_t __i = 0; __i < simd_size_v<_Tp, _Abi>; ++__i)
+	if (__k[__i])
+	  ++__r;
+      return __r;
+    }
+  else
+    return _Abi::_MaskImpl::__popcount(__k);
+}
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR int
+find_first_set(const simd_mask<_Tp, _Abi>& __k)
+{
+  if (__builtin_is_constant_evaluated() || __k._M_is_constprop())
+    {
+      for (size_t __i = 0; __i < simd_size_v<_Tp, _Abi>; ++__i)
+	if (__k[__i])
+	  return __i;
+      __builtin_unreachable(); // make none_of(__k) UB/ill-formed
+    }
+  else
+    return _Abi::_MaskImpl::__find_first_set(__k);
+}
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR int
+find_last_set(const simd_mask<_Tp, _Abi>& __k)
+{
+  if (__builtin_is_constant_evaluated() || __k._M_is_constprop())
+    {
+      for (size_t __i = simd_size_v<_Tp, _Abi>; __i > 0; --__i)
+	if (__k[__i - 1])
+	  return __i - 1;
+      __builtin_unreachable(); // make none_of(__k) UB/ill-formed
+    }
+  else
+    return _Abi::_MaskImpl::__find_last_set(__k);
+}
+
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+all_of(_ExactBool __x) noexcept
+{
+  return __x;
+}
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+any_of(_ExactBool __x) noexcept
+{
+  return __x;
+}
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+none_of(_ExactBool __x) noexcept
+{
+  return !__x;
+}
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+  some_of(_ExactBool) noexcept
+{
+  return false;
+}
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR int
+popcount(_ExactBool __x) noexcept
+{
+  return __x;
+}
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR int
+  find_first_set(_ExactBool)
+{
+  return 0;
+}
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR int
+  find_last_set(_ExactBool)
+{
+  return 0;
+}
+
+// }}}
+
+// _SimdIntOperators {{{1
+template <typename _V, typename _Impl, bool> class _SimdIntOperators
+{
+};
+
+template <typename _V, typename _Impl> class _SimdIntOperators<_V, _Impl, true>
+{
+  _GLIBCXX_SIMD_INTRINSIC const _V& __derived() const
+  {
+    return *static_cast<const _V*>(this);
+  }
+
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _GLIBCXX_SIMD_CONSTEXPR _V
+  __make_derived(_Tp&& __d)
+  {
+    return {__private_init, static_cast<_Tp&&>(__d)};
+  }
+
+public:
+  _GLIBCXX_SIMD_CONSTEXPR friend _V& operator%=(_V& __lhs, const _V& __x)
+  {
+    return __lhs = __lhs % __x;
+  }
+  _GLIBCXX_SIMD_CONSTEXPR friend _V& operator&=(_V& __lhs, const _V& __x)
+  {
+    return __lhs = __lhs & __x;
+  }
+  _GLIBCXX_SIMD_CONSTEXPR friend _V& operator|=(_V& __lhs, const _V& __x)
+  {
+    return __lhs = __lhs | __x;
+  }
+  _GLIBCXX_SIMD_CONSTEXPR friend _V& operator^=(_V& __lhs, const _V& __x)
+  {
+    return __lhs = __lhs ^ __x;
+  }
+  _GLIBCXX_SIMD_CONSTEXPR friend _V& operator<<=(_V& __lhs, const _V& __x)
+  {
+    return __lhs = __lhs << __x;
+  }
+  _GLIBCXX_SIMD_CONSTEXPR friend _V& operator>>=(_V& __lhs, const _V& __x)
+  {
+    return __lhs = __lhs >> __x;
+  }
+  _GLIBCXX_SIMD_CONSTEXPR friend _V& operator<<=(_V& __lhs, int __x)
+  {
+    return __lhs = __lhs << __x;
+  }
+  _GLIBCXX_SIMD_CONSTEXPR friend _V& operator>>=(_V& __lhs, int __x)
+  {
+    return __lhs = __lhs >> __x;
+  }
+
+  _GLIBCXX_SIMD_CONSTEXPR friend _V operator%(const _V& __x, const _V& __y)
+  {
+    return _SimdIntOperators::__make_derived(
+      _Impl::__modulus(__data(__x), __data(__y)));
+  }
+  _GLIBCXX_SIMD_CONSTEXPR friend _V operator&(const _V& __x, const _V& __y)
+  {
+    return _SimdIntOperators::__make_derived(
+      _Impl::__bit_and(__data(__x), __data(__y)));
+  }
+  _GLIBCXX_SIMD_CONSTEXPR friend _V operator|(const _V& __x, const _V& __y)
+  {
+    return _SimdIntOperators::__make_derived(
+      _Impl::__bit_or(__data(__x), __data(__y)));
+  }
+  _GLIBCXX_SIMD_CONSTEXPR friend _V operator^(const _V& __x, const _V& __y)
+  {
+    return _SimdIntOperators::__make_derived(
+      _Impl::__bit_xor(__data(__x), __data(__y)));
+  }
+  _GLIBCXX_SIMD_CONSTEXPR friend _V operator<<(const _V& __x, const _V& __y)
+  {
+    return _SimdIntOperators::__make_derived(
+      _Impl::__bit_shift_left(__data(__x), __data(__y)));
+  }
+  _GLIBCXX_SIMD_CONSTEXPR friend _V operator>>(const _V& __x, const _V& __y)
+  {
+    return _SimdIntOperators::__make_derived(
+      _Impl::__bit_shift_right(__data(__x), __data(__y)));
+  }
+  _GLIBCXX_SIMD_CONSTEXPR friend _V operator<<(const _V& __x, int __y)
+  {
+    return _SimdIntOperators::__make_derived(
+      _Impl::__bit_shift_left(__data(__x), __y));
+  }
+  _GLIBCXX_SIMD_CONSTEXPR friend _V operator>>(const _V& __x, int __y)
+  {
+    return _SimdIntOperators::__make_derived(
+      _Impl::__bit_shift_right(__data(__x), __y));
+  }
+
+  // unary operators (for integral _Tp)
+  _GLIBCXX_SIMD_CONSTEXPR _V operator~() const
+  {
+    return {__private_init, _Impl::__complement(__derived()._M_data)};
+  }
+};
+
+//}}}1
+
+// simd {{{
+template <typename _Tp, typename _Abi>
+class simd : public _SimdIntOperators<
+	       simd<_Tp, _Abi>, typename _SimdTraits<_Tp, _Abi>::_SimdImpl,
+	       conjunction<std::is_integral<_Tp>,
+			   typename _SimdTraits<_Tp, _Abi>::_IsValid>::value>,
+	     public _SimdTraits<_Tp, _Abi>::_SimdBase
+{
+  using _Traits = _SimdTraits<_Tp, _Abi>;
+  using _MemberType = typename _Traits::_SimdMember;
+  using _CastType = typename _Traits::_SimdCastType;
+  static constexpr _Tp* _S_type_tag = nullptr;
+  friend typename _Traits::_SimdBase;
+
+public:
+  using _Impl = typename _Traits::_SimdImpl;
+  friend _Impl;
+  friend _SimdIntOperators<simd, _Impl, true>;
+
+  using value_type = _Tp;
+  using reference = _SmartReference<_MemberType, _Impl, value_type>;
+  using mask_type = simd_mask<_Tp, _Abi>;
+  using abi_type = _Abi;
+
+  static constexpr size_t size() { return __size_or_zero_v<_Tp, _Abi>; }
+  _GLIBCXX_SIMD_CONSTEXPR simd() = default;
+  _GLIBCXX_SIMD_CONSTEXPR simd(const simd&) = default;
+  _GLIBCXX_SIMD_CONSTEXPR simd(simd&&) noexcept = default;
+  _GLIBCXX_SIMD_CONSTEXPR simd& operator=(const simd&) = default;
+  _GLIBCXX_SIMD_CONSTEXPR simd& operator=(simd&&) noexcept = default;
+
+  // implicit broadcast constructor
+  template <typename _Up, typename = _ValuePreservingOrInt<_Up, value_type>>
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd(_Up&& __x)
+    : _M_data(
+      _Impl::__broadcast(static_cast<value_type>(static_cast<_Up&&>(__x))))
+  {}
+
+  // implicit type conversion constructor (convert from fixed_size to
+  // fixed_size)
+  template <typename _Up>
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR
+  simd(const simd<_Up, simd_abi::fixed_size<size()>>& __x,
+       enable_if_t<
+	 conjunction<std::is_same<simd_abi::fixed_size<size()>, abi_type>,
+		     std::negation<__is_narrowing_conversion<_Up, value_type>>,
+		     __converts_to_higher_integer_rank<_Up, value_type>>::value,
+	 void*> = nullptr)
+    : simd{static_cast<std::array<_Up, size()>>(__x).data(), vector_aligned}
+  {}
+
+  // explicit type conversion constructor
+#ifdef _GLIBCXX_SIMD_ENABLE_STATIC_CAST
+  template <typename _Up, typename _A2,
+	    typename = decltype(
+	      static_simd_cast<simd>(std::declval<const simd<_Up, _A2>&>()))>
+  _GLIBCXX_SIMD_ALWAYS_INLINE explicit _GLIBCXX_SIMD_CONSTEXPR
+  simd(const simd<_Up, _A2>& __x)
+    : simd(static_simd_cast<simd>(__x))
+  {}
+#endif // _GLIBCXX_SIMD_ENABLE_STATIC_CAST
+
+  // generator constructor
+  template <typename _Fp>
+  _GLIBCXX_SIMD_ALWAYS_INLINE explicit _GLIBCXX_SIMD_CONSTEXPR
+  simd(_Fp&& __gen, _ValuePreservingOrInt<decltype(std::declval<_Fp>()(
+					    std::declval<_SizeConstant<0>&>())),
+					  value_type>* = nullptr)
+    : _M_data(_Impl::__generator(static_cast<_Fp&&>(__gen), _S_type_tag))
+  {}
+
+  // load constructor
+  template <typename _Up, typename _Flags>
+  _GLIBCXX_SIMD_ALWAYS_INLINE simd(const _Up* __mem, _Flags __f)
+    : _M_data(_Impl::__load(__mem, __f, _S_type_tag))
+  {}
+
+  // loads [simd.load]
+  template <typename _Up, typename _Flags>
+  _GLIBCXX_SIMD_ALWAYS_INLINE void copy_from(const _Vectorizable<_Up>* __mem,
+					     _Flags __f)
+  {
+    _M_data
+      = static_cast<decltype(_M_data)>(_Impl::__load(__mem, __f, _S_type_tag));
+  }
+
+  // stores [simd.store]
+  template <typename _Up, typename _Flags>
+  _GLIBCXX_SIMD_ALWAYS_INLINE void copy_to(_Vectorizable<_Up>* __mem,
+					   _Flags __f) const
+  {
+    _Impl::__store(_M_data, __mem, __f, _S_type_tag);
+  }
+
+  // scalar access
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR reference
+  operator[](size_t __i)
+  {
+    return {_M_data, int(__i)};
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR value_type
+  operator[]([[maybe_unused]] size_t __i) const
+  {
+    if constexpr (__is_scalar_abi<_Abi>())
+      {
+	_GLIBCXX_DEBUG_ASSERT(__i == 0);
+	return _M_data;
+      }
+    else
+      {
+	return _M_data[__i];
+      }
+  }
+
+  // increment and decrement:
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd& operator++()
+  {
+    _Impl::__increment(_M_data);
+    return *this;
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd operator++(int)
+  {
+    simd __r = *this;
+    _Impl::__increment(_M_data);
+    return __r;
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd& operator--()
+  {
+    _Impl::__decrement(_M_data);
+    return *this;
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd operator--(int)
+  {
+    simd __r = *this;
+    _Impl::__decrement(_M_data);
+    return __r;
+  }
+
+  // unary operators (for any _Tp)
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR mask_type
+  operator!() const
+  {
+    return {__private_init, _Impl::__negate(_M_data)};
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd operator+() const
+  {
+    return *this;
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd operator-() const
+  {
+    return {__private_init, _Impl::__unary_minus(_M_data)};
+  }
+
+  // access to internal representation (suggested extension)
+  _GLIBCXX_SIMD_ALWAYS_INLINE explicit _GLIBCXX_SIMD_CONSTEXPR
+  simd(_CastType __init)
+    : _M_data(__init)
+  {}
+
+  // compound assignment [simd.cassign]
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd&
+  operator+=(simd& __lhs, const simd& __x)
+  {
+    return __lhs = __lhs + __x;
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd&
+  operator-=(simd& __lhs, const simd& __x)
+  {
+    return __lhs = __lhs - __x;
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd&
+  operator*=(simd& __lhs, const simd& __x)
+  {
+    return __lhs = __lhs * __x;
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd&
+  operator/=(simd& __lhs, const simd& __x)
+  {
+    return __lhs = __lhs / __x;
+  }
+
+  // binary operators [simd.binary]
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd
+  operator+(const simd& __x, const simd& __y)
+  {
+    return {__private_init, _Impl::__plus(__x._M_data, __y._M_data)};
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd
+  operator-(const simd& __x, const simd& __y)
+  {
+    return {__private_init, _Impl::__minus(__x._M_data, __y._M_data)};
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd
+  operator*(const simd& __x, const simd& __y)
+  {
+    return {__private_init, _Impl::__multiplies(__x._M_data, __y._M_data)};
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd
+  operator/(const simd& __x, const simd& __y)
+  {
+    return {__private_init, _Impl::__divides(__x._M_data, __y._M_data)};
+  }
+
+  // compares [simd.comparison]
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend mask_type
+  operator==(const simd& __x, const simd& __y)
+  {
+    return simd::__make_mask(_Impl::__equal_to(__x._M_data, __y._M_data));
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend mask_type
+  operator!=(const simd& __x, const simd& __y)
+  {
+    return simd::__make_mask(_Impl::__not_equal_to(__x._M_data, __y._M_data));
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend mask_type
+  operator<(const simd& __x, const simd& __y)
+  {
+    return simd::__make_mask(_Impl::__less(__x._M_data, __y._M_data));
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend mask_type
+  operator<=(const simd& __x, const simd& __y)
+  {
+    return simd::__make_mask(_Impl::__less_equal(__x._M_data, __y._M_data));
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend mask_type
+  operator>(const simd& __x, const simd& __y)
+  {
+    return simd::__make_mask(_Impl::__less(__y._M_data, __x._M_data));
+  }
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend mask_type
+  operator>=(const simd& __x, const simd& __y)
+  {
+    return simd::__make_mask(_Impl::__less_equal(__y._M_data, __x._M_data));
+  }
+
+  // operator?: overloads (suggested extension) {{{
+#ifdef __GXX_CONDITIONAL_IS_OVERLOADABLE__
+  _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd
+  operator?:(const mask_type& __k, const simd& __where_true,
+	     const simd& __where_false)
+  {
+    auto __ret = __where_false;
+    _Impl::__masked_assign(__data(__k), __data(__ret), __data(__where_true));
+    return __ret;
+  }
+#endif // __GXX_CONDITIONAL_IS_OVERLOADABLE__
+  // }}}
+
+  // "private" because of the first argument's namespace
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  simd(_PrivateInit, const _MemberType& __init)
+    : _M_data(__init)
+  {}
+
+  // "private" because of the first argument's namespace
+  _GLIBCXX_SIMD_INTRINSIC simd(_BitsetInit, std::bitset<size()> __init)
+    : _M_data()
+  {
+    where(mask_type(__bitset_init, __init), *this) = ~*this;
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC
+  constexpr bool _M_is_constprop() const
+  {
+    if constexpr (__is_scalar_abi<_Abi>())
+      return __builtin_constant_p(_M_data);
+    else
+      return _M_data._M_is_constprop();
+  }
+
+private:
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR static mask_type
+  __make_mask(typename mask_type::_MemberType __k)
+  {
+    return {__private_init, __k};
+  }
+
+  friend const auto& __data<value_type, abi_type>(const simd&);
+  friend auto& __data<value_type, abi_type>(simd&);
+  alignas(_Traits::_S_simd_align) _MemberType _M_data;
+};
+
+// }}}
+// __data {{{
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr const auto&
+__data(const simd<_Tp, _Ap>& __x)
+{
+  return __x._M_data;
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto&
+__data(simd<_Tp, _Ap>& __x)
+{
+  return __x._M_data;
+}
+// }}}
+
+namespace __proposed {
+namespace float_bitwise_operators {
+// float_bitwise_operators {{{
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
+operator^(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
+{
+  return {__private_init, _Ap::_SimdImpl::__bit_xor(__data(__a), __data(__b))};
+}
+
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
+operator|(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
+{
+  return {__private_init, _Ap::_SimdImpl::__bit_or(__data(__a), __data(__b))};
+}
+
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
+operator&(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
+{
+  return {__private_init, _Ap::_SimdImpl::__bit_and(__data(__a), __data(__b))};
+}
+// }}}
+} // namespace float_bitwise_operators
+} // namespace __proposed
+
+_GLIBCXX_SIMD_END_NAMESPACE
+
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_H
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/include/experimental/bits/simd_builtin.h b/libstdc++-v3/include/experimental/bits/simd_builtin.h
new file mode 100644
index 00000000000..4dbdce95797
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_builtin.h
@@ -0,0 +1,2854 @@
+// Simd Abi specific implementations -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_ABIS_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_ABIS_H_
+
+#if __cplusplus >= 201703L
+
+#include <array>
+#include <cmath>
+#include <cstdlib>
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+// _S_allbits{{{
+template <typename _V>
+static inline constexpr _V _S_allbits
+  = reinterpret_cast<_V>(~__vector_type_t<char, sizeof(_V) / sizeof(char)>());
+
+// }}}
+// _S_signmask, _S_absmask{{{
+template <typename _V, typename = _VectorTraits<_V>>
+static inline constexpr _V _S_signmask = __xor(_V() + 1, _V() - 1);
+template <typename _V, typename = _VectorTraits<_V>>
+static inline constexpr _V _S_absmask
+  = __andnot(_S_signmask<_V>, _S_allbits<_V>);
+
+//}}}
+// __vector_permute<Indices...>{{{
+// Index == -1 requests zeroing of the output element
+template <int... _Indices, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_Tp
+__vector_permute(_Tp __x)
+{
+  static_assert(sizeof...(_Indices) == _TVT::_S_width);
+  return __make_vector<typename _TVT::value_type>(
+    (_Indices == -1 ? 0 : __x[_Indices == -1 ? 0 : _Indices])...);
+}
+
+// }}}
+// __vector_shuffle<Indices...>{{{
+// Index == -1 requests zeroing of the output element
+template <int... _Indices, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_Tp
+__vector_shuffle(_Tp __x, _Tp __y)
+{
+  return _Tp{(_Indices == -1 ? 0
+			     : _Indices < _TVT::_S_width
+				 ? __x[_Indices]
+				 : __y[_Indices - _TVT::_S_width])...};
+}
+
+// }}}
+// __make_wrapper{{{
+template <typename _Tp, typename... _Args>
+_GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper<_Tp, sizeof...(_Args)>
+__make_wrapper(const _Args&... __args)
+{
+  return __make_vector<_Tp>(__args...);
+}
+
+// }}}
+// __wrapper_bitcast{{{
+template <typename _Tp, size_t _ToN = 0, typename _Up, size_t _M,
+	  size_t _Np = _ToN != 0 ? _ToN : sizeof(_Up) * _M / sizeof(_Tp)>
+_GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper<_Tp, _Np>
+__wrapper_bitcast(_SimdWrapper<_Up, _M> __x)
+{
+  static_assert(_Np > 1);
+  return __intrin_bitcast<__vector_type_t<_Tp, _Np>>(__x._M_data);
+}
+
+// }}}
+// __shift_elements_right{{{
+// if (__shift % 2ⁿ == 0) => the low n bytes are correct
+template <unsigned __shift, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _Tp
+__shift_elements_right(_Tp __v)
+{
+  [[maybe_unused]] const auto __iv = __to_intrin(__v);
+  static_assert(__shift <= sizeof(_Tp));
+  if constexpr (__shift == 0)
+    return __v;
+  else if constexpr (__shift == sizeof(_Tp))
+    return _Tp();
+#if _GLIBCXX_SIMD_X86INTRIN // {{{
+  else if constexpr (__have_sse && __shift == 8
+		     && _TVT::template __is<float, 4>)
+    return _mm_movehl_ps(__iv, __iv);
+  else if constexpr (__have_sse2 && __shift == 8
+		     && _TVT::template __is<double, 2>)
+    return _mm_unpackhi_pd(__iv, __iv);
+  else if constexpr (__have_sse2 && sizeof(_Tp) == 16)
+    return reinterpret_cast<typename _TVT::type>(
+      _mm_srli_si128(reinterpret_cast<__m128i>(__iv), __shift));
+  else if constexpr (__shift == 16 && sizeof(_Tp) == 32)
+    {
+      /*if constexpr (__have_avx && _TVT::template __is<double, 4>)
+	return _mm256_permute2f128_pd(__iv, __iv, 0x81);
+      else if constexpr (__have_avx && _TVT::template __is<float, 8>)
+	return _mm256_permute2f128_ps(__iv, __iv, 0x81);
+      else if constexpr (__have_avx)
+	return reinterpret_cast<typename _TVT::type>(
+	  _mm256_permute2f128_si256(__iv, __iv, 0x81));
+      else*/
+      return __zero_extend(__hi128(__v));
+    }
+  else if constexpr (__have_avx2 && sizeof(_Tp) == 32 && __shift < 16)
+    {
+      const auto __vll = __vector_bitcast<_LLong>(__v);
+      return reinterpret_cast<typename _TVT::type>(
+	_mm256_alignr_epi8(_mm256_permute2x128_si256(__vll, __vll, 0x81), __vll,
+			   __shift));
+    }
+  else if constexpr (__have_avx && sizeof(_Tp) == 32 && __shift < 16)
+    {
+      const auto __vll = __vector_bitcast<_LLong>(__v);
+      return reinterpret_cast<typename _TVT::type>(
+	__concat(_mm_alignr_epi8(__hi128(__vll), __lo128(__vll), __shift),
+		 _mm_srli_si128(__hi128(__vll), __shift)));
+    }
+  else if constexpr (sizeof(_Tp) == 32 && __shift > 16)
+    return __zero_extend(__shift_elements_right<__shift - 16>(__hi128(__v)));
+  else if constexpr (sizeof(_Tp) == 64 && __shift == 32)
+    return __zero_extend(__hi256(__v));
+  else if constexpr (__have_avx512f && sizeof(_Tp) == 64)
+    {
+      if constexpr (__shift >= 48)
+	return __zero_extend(
+	  __shift_elements_right<__shift - 48>(__extract<3, 4>(__v)));
+      else if constexpr (__shift >= 32)
+	return __zero_extend(
+	  __shift_elements_right<__shift - 32>(__hi256(__v)));
+      else if constexpr (__shift % 8 == 0)
+	return reinterpret_cast<typename _TVT::type>(
+	  _mm512_alignr_epi64(__m512i(), __intrin_bitcast<__m512i>(__v),
+			      __shift / 8));
+      else if constexpr (__shift % 4 == 0)
+	return reinterpret_cast<typename _TVT::type>(
+	  _mm512_alignr_epi32(__m512i(), __intrin_bitcast<__m512i>(__v),
+			      __shift / 4));
+      else if constexpr (__have_avx512bw && __shift < 16)
+	{
+	  const auto __vll = __vector_bitcast<_LLong>(__v);
+	  return reinterpret_cast<typename _TVT::type>(
+	    _mm512_alignr_epi8(_mm512_shuffle_i32x4(__vll, __vll, 0xf9), __vll,
+			       __shift));
+	}
+      else if constexpr (__have_avx512bw && __shift < 32)
+	{
+	  const auto __vll = __vector_bitcast<_LLong>(__v);
+	  return reinterpret_cast<typename _TVT::type>(
+	    _mm512_alignr_epi8(_mm512_shuffle_i32x4(__vll, __m512i(), 0xee),
+			       _mm512_shuffle_i32x4(__vll, __vll, 0xf9),
+			       __shift - 16));
+	}
+      else
+	__assert_unreachable<_Tp>();
+    }
+/*
+    } else if constexpr (__shift % 16 == 0 && sizeof(_Tp) == 64)
+	return __auto_bitcast(__extract<__shift / 16, 4>(__v));
+*/
+#endif // _GLIBCXX_SIMD_X86INTRIN }}}
+  else
+    {
+      constexpr int __chunksize
+	= __shift % 8 == 0 ? 8
+			   : __shift % 4 == 0 ? 4 : __shift % 2 == 0 ? 2 : 1;
+      auto __w = __vector_bitcast<__int_with_sizeof_t<__chunksize>>(__v);
+      using _Up = decltype(__w);
+      return __intrin_bitcast<_Tp>(
+	__call_with_n_evaluations<(sizeof(_Tp) - __shift) / __chunksize>(
+	  [](auto... __chunks) { return _Up{__chunks...}; },
+	  [&](auto __i) { return __w[__shift / __chunksize + __i]; }));
+    }
+}
+
+// }}}
+// __extract_part(_SimdWrapper<_Tp, _Np>) {{{
+template <int _Index, int _Total, int _Combine, typename _Tp, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC
+  _GLIBCXX_CONST _SimdWrapper<_Tp, _Np / _Total * _Combine>
+  __extract_part(const _SimdWrapper<_Tp, _Np> __x)
+{
+  if constexpr (_Index % 2 == 0 && _Total % 2 == 0 && _Combine % 2 == 0)
+    return __extract_part<_Index / 2, _Total / 2, _Combine / 2>(__x);
+  else
+    {
+      constexpr size_t __values_per_part = _Np / _Total;
+      constexpr size_t __values_to_skip = _Index * __values_per_part;
+      constexpr size_t __return_size = __values_per_part * _Combine;
+      using _R = __vector_type_t<_Tp, __return_size>;
+      static_assert((_Index + _Combine) * __values_per_part * sizeof(_Tp)
+		      <= sizeof(__x),
+		    "out of bounds __extract_part");
+      // the following assertion would ensure no "padding" to be read
+      // static_assert(_Total >= _Index + _Combine, "_Total must be greater than
+      // _Index");
+
+      // static_assert(__return_size * _Total == _Np, "_Np must be divisible by
+      // _Total");
+      if (__x._M_is_constprop())
+	return __generate_from_n_evaluations<__return_size, _R>(
+	  [&](auto __i) { return __x[__values_to_skip + __i]; });
+      if constexpr (_Index == 0 && _Total == 1)
+	return __x;
+      else if constexpr (_Index == 0)
+	return __intrin_bitcast<_R>(__as_vector(__x));
+#if _GLIBCXX_SIMD_X86INTRIN // {{{
+      else if constexpr (sizeof(__x) == 32 && __return_size * sizeof(_Tp) <= 16)
+	{
+	  constexpr size_t __bytes_to_skip = __values_to_skip * sizeof(_Tp);
+	  if constexpr (__bytes_to_skip == 16)
+	    return __vector_bitcast<_Tp, __return_size>(
+	      __hi128(__as_vector(__x)));
+	  else
+	    return __vector_bitcast<_Tp, __return_size>(
+	      _mm_alignr_epi8(__hi128(__vector_bitcast<_LLong>(__x)),
+			      __lo128(__vector_bitcast<_LLong>(__x)),
+			      __bytes_to_skip));
+	}
+#endif // _GLIBCXX_SIMD_X86INTRIN }}}
+      else if constexpr (_Index > 0
+			 && (__values_to_skip % __return_size != 0
+			     || sizeof(_R) >= 8)
+			 && (__values_to_skip + __return_size) * sizeof(_Tp)
+			      <= 64
+			 && sizeof(__x) >= 16)
+	return __intrin_bitcast<_R>(
+	  __shift_elements_right<__values_to_skip * sizeof(_Tp)>(
+	    __as_vector(__x)));
+      else
+	{
+	  _R __r = {};
+	  __builtin_memcpy(&__r,
+			   reinterpret_cast<const char*>(&__x)
+			     + sizeof(_Tp) * __values_to_skip,
+			   __return_size * sizeof(_Tp));
+	  return __r;
+	}
+    }
+}
+
+// }}}
+// __extract_part(_SimdWrapper<bool, _Np>) {{{
+template <int _Index, int _Total, int _Combine = 1, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper<bool, _Np / _Total * _Combine>
+__extract_part(const _SimdWrapper<bool, _Np> __x)
+{
+  static_assert(_Combine == 1, "_Combine != 1 not implemented");
+  static_assert(__have_avx512f && _Np == _Np);
+  static_assert(_Total >= 2 && _Index + _Combine <= _Total && _Index >= 0);
+  return __x._M_data >> (_Index * _Np / _Total);
+}
+
+// }}}
+
+// __vector_convert {{{
+// implementation requires an index sequence
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+	     static_cast<_Tp>(__c[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d,
+		 index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+	     static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+		 index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+	     static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+	     static_cast<_Tp>(__e[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+		 _From __f, index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+	     static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+	     static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+		 _From __f, _From __g, index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+	     static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+	     static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+	     static_cast<_Tp>(__g[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+		 _From __f, _From __g, _From __h, index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+	     static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+	     static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+	     static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+		 _From __f, _From __g, _From __h, _From __i,
+		 index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+	     static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+	     static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+	     static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+	     static_cast<_Tp>(__i[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+		 _From __f, _From __g, _From __h, _From __i, _From __j,
+		 index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+	     static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+	     static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+	     static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+	     static_cast<_Tp>(__i[_I])..., static_cast<_Tp>(__j[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+		 _From __f, _From __g, _From __h, _From __i, _From __j,
+		 _From __k, index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+	     static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+	     static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+	     static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+	     static_cast<_Tp>(__i[_I])..., static_cast<_Tp>(__j[_I])...,
+	     static_cast<_Tp>(__k[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+		 _From __f, _From __g, _From __h, _From __i, _From __j,
+		 _From __k, _From __l, index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+	     static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+	     static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+	     static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+	     static_cast<_Tp>(__i[_I])..., static_cast<_Tp>(__j[_I])...,
+	     static_cast<_Tp>(__k[_I])..., static_cast<_Tp>(__l[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+		 _From __f, _From __g, _From __h, _From __i, _From __j,
+		 _From __k, _From __l, _From __m, index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+	     static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+	     static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+	     static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+	     static_cast<_Tp>(__i[_I])..., static_cast<_Tp>(__j[_I])...,
+	     static_cast<_Tp>(__k[_I])..., static_cast<_Tp>(__l[_I])...,
+	     static_cast<_Tp>(__m[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+		 _From __f, _From __g, _From __h, _From __i, _From __j,
+		 _From __k, _From __l, _From __m, _From __n,
+		 index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+	     static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+	     static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+	     static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+	     static_cast<_Tp>(__i[_I])..., static_cast<_Tp>(__j[_I])...,
+	     static_cast<_Tp>(__k[_I])..., static_cast<_Tp>(__l[_I])...,
+	     static_cast<_Tp>(__m[_I])..., static_cast<_Tp>(__n[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+		 _From __f, _From __g, _From __h, _From __i, _From __j,
+		 _From __k, _From __l, _From __m, _From __n, _From __o,
+		 index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+	     static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+	     static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+	     static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+	     static_cast<_Tp>(__i[_I])..., static_cast<_Tp>(__j[_I])...,
+	     static_cast<_Tp>(__k[_I])..., static_cast<_Tp>(__l[_I])...,
+	     static_cast<_Tp>(__m[_I])..., static_cast<_Tp>(__n[_I])...,
+	     static_cast<_Tp>(__o[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+		 _From __f, _From __g, _From __h, _From __i, _From __j,
+		 _From __k, _From __l, _From __m, _From __n, _From __o,
+		 _From __p, index_sequence<_I...>)
+{
+  using _Tp = typename _VectorTraits<_To>::value_type;
+  return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+	     static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+	     static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+	     static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+	     static_cast<_Tp>(__i[_I])..., static_cast<_Tp>(__j[_I])...,
+	     static_cast<_Tp>(__k[_I])..., static_cast<_Tp>(__l[_I])...,
+	     static_cast<_Tp>(__m[_I])..., static_cast<_Tp>(__n[_I])...,
+	     static_cast<_Tp>(__o[_I])..., static_cast<_Tp>(__p[_I])...};
+}
+
+// Defer actual conversion to the overload that takes an index sequence. Note
+// that this function adds zeros or drops values off the end if you don't ensure
+// matching width.
+template <typename _To, typename... _From, typename _ToT = _VectorTraits<_To>,
+	  typename _FromT = _VectorTraits<__first_of_pack_t<_From...>>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From... __xs)
+{
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR85048
+  if (!(... && __builtin_constant_p(__xs)))
+    {
+      if constexpr ((sizeof...(_From) & (sizeof...(_From) - 1))
+		    == 0) // power-of-two number of arguments
+	return __convert_x86<_To>(__as_vector(__xs)...);
+      else
+	{
+	  using _FF = __first_of_pack_t<_From...>;
+	  return __vector_convert<_To>(__xs..., _FF{});
+	}
+    }
+  else
+#endif
+    return __vector_convert<_To>(
+      __xs...,
+      make_index_sequence<std::min(_ToT::_S_width, _FromT::_S_width)>());
+}
+
+// This overload takes a vectorizable type _To and produces a return type that
+// matches the width.
+template <typename _To, typename... _From,
+	  typename = enable_if_t<__is_vectorizable_v<_To>>,
+	  typename _FromT = _VectorTraits<__first_of_pack_t<_From...>>,
+	  typename = int>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From... __xs)
+{
+  return __vector_convert<__vector_type_t<_To, _FromT::_S_width>>(__xs...);
+}
+
+// }}}
+// __convert function{{{
+template <typename _To, typename _From, typename... _More>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__convert(_From __v0, _More... __vs)
+{
+  if constexpr (__is_vectorizable_v<_From>)
+    {
+      static_assert((true && ... && is_same_v<_From, _More>) );
+      using _V = typename _VectorTraits<_To>::type;
+      using _Tp = typename _VectorTraits<_To>::value_type;
+      return _V{static_cast<_Tp>(__v0), static_cast<_Tp>(__vs)...};
+    }
+  else if constexpr (!__is_vector_type_v<_From>)
+    return __convert<_To>(__as_vector(__v0), __as_vector(__vs)...);
+  else
+    {
+      static_assert((true && ... && is_same_v<_From, _More>) );
+      if constexpr (__is_vectorizable_v<_To>)
+	return __convert<__vector_type_t<_To, (_VectorTraits<_From>::_S_width
+					       * (1 + sizeof...(_More)))>>(
+	  __v0, __vs...);
+      else if constexpr (!__is_vector_type_v<_To>)
+	return _To(__convert<typename _To::_BuiltinType>(__v0, __vs...));
+      else
+	{
+	  static_assert(
+	    sizeof...(_More) == 0
+	      || _VectorTraits<_To>::_S_width
+		   >= (1 + sizeof...(_More)) * _VectorTraits<_From>::_S_width,
+	    "__convert(...) requires the input to fit into the output");
+	  return __vector_convert<_To>(__v0, __vs...);
+	}
+    }
+}
+
+// }}}
+// __convert_all{{{
+// Converts __v into std::array<_To, N>, where N is _NParts if non-zero or
+// otherwise deduced from _To such that N * #elements(_To) <= #elements(__v).
+// Note: this function may return fewer than all converted elements.
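+// Worked example (illustrative, assuming 4-byte int): if __v holds 16 chars
+// and _To is __vector_type16_t<int> (4 elements), _NParts == 0 deduces
+// N = 16 / 4 = 4, so the result is std::array<_To, 4>.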
+template <typename _To,
+	  size_t _NParts = 0, // number of parts to produce: may be fewer than
+			      // all, or one more (a partially filled last _To)
+	  size_t _Offset = 0, // where to start, in elements (not bytes or
+			      // parts)
+	  typename _From, typename _FromVT = _VectorTraits<_From>>
+_GLIBCXX_SIMD_INTRINSIC auto
+__convert_all(_From __v)
+{
+  if constexpr (std::is_arithmetic_v<_To> && _NParts != 1)
+    {
+      static_assert(_Offset < _FromVT::_S_width);
+      constexpr auto _Np
+	= _NParts == 0 ? _FromVT::_S_partial_width - _Offset : _NParts;
+      return __generate_from_n_evaluations<_Np, std::array<_To, _Np>>(
+	[&](auto __i) { return static_cast<_To>(__v[__i + _Offset]); });
+    }
+  else
+    {
+      static_assert(__is_vector_type_v<_To>);
+      using _ToVT = _VectorTraits<_To>;
+      if constexpr (__is_vector_type_v<_From>)
+	return __convert_all<_To, _NParts>(__as_wrapper(__v));
+      else if constexpr (_NParts == 1)
+	{
+	  static_assert(_Offset % _ToVT::_S_width == 0);
+	  return std::array<_To, 1>{__vector_convert<_To>(
+	    __extract_part<_Offset / _ToVT::_S_width,
+			   __div_roundup(_FromVT::_S_partial_width,
+					 _ToVT::_S_width)>(__v))};
+	}
+#if _GLIBCXX_SIMD_X86INTRIN // {{{
+      else if constexpr (
+	!__have_sse4_1 && _Offset == 0
+	&& is_integral_v<
+	  typename _FromVT::
+	    value_type> && sizeof(typename _FromVT::value_type) < sizeof(typename _ToVT::value_type)
+	&& !(sizeof(typename _FromVT::value_type) == 4
+	     && is_same_v<typename _ToVT::value_type, double>) )
+	{
+	  using _ToT = typename _ToVT::value_type;
+	  using _FromT = typename _FromVT::value_type;
+	  constexpr size_t _Np
+	    = _NParts != 0 ? _NParts
+			   : (_FromVT::_S_partial_width / _ToVT::_S_width);
+	  using _R = std::array<_To, _Np>;
+	  // __adjust modifies its input to have _Np (use _SizeConstant) entries
+	  // so that no unnecessary intermediate conversions are requested and,
+	  // more importantly, no intermediate conversions are missing
+	  [[maybe_unused]] auto __adjust
+	    = [](auto __n,
+		 auto __vv) -> _SimdWrapper<_FromT, decltype(__n)::value> {
+	    return __vector_bitcast<_FromT, decltype(__n)::value>(__vv);
+	  };
+	  [[maybe_unused]] const auto __vi = __to_intrin(__v);
+	  auto&& __make_array =
+	    []<typename _ToConvert>(_ToConvert __x0,
+				    [[maybe_unused]] _ToConvert __x1) {
+	      if constexpr (_Np == 1)
+		return _R{__vector_bitcast<_ToT>(__x0)};
+	      else
+		return _R{__vector_bitcast<_ToT>(__x0),
+			  __vector_bitcast<_ToT>(__x1)};
+	    };
+
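+	  // Worked example (illustrative) of the unpack/shift widening used
+	  // below, for one signed byte 0x80 (-128) going to 16 bits:
+	  //   _mm_unpacklo_epi8(__vi, __vi) pairs the byte with itself: 0x8080
+	  //   _mm_srai_epi16(..., 8) shifts arithmetically:             0xff80
+	  // which is -128 again. The unsigned variant interleaves with zero
+	  // instead, giving 0x0080 (128).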
+	  if constexpr (_Np == 0)
+	    return _R{};
+	  else if constexpr (sizeof(_FromT) == 1 && sizeof(_ToT) == 2)
+	    {
+	      static_assert(std::is_integral_v<_FromT>);
+	      static_assert(std::is_integral_v<_ToT>);
+	      if constexpr (is_unsigned_v<_FromT>)
+		return __make_array(_mm_unpacklo_epi8(__vi, __m128i()),
+				    _mm_unpackhi_epi8(__vi, __m128i()));
+	      else
+		return __make_array(
+		  _mm_srai_epi16(_mm_unpacklo_epi8(__vi, __vi), 8),
+		  _mm_srai_epi16(_mm_unpackhi_epi8(__vi, __vi), 8));
+	    }
+	  else if constexpr (sizeof(_FromT) == 2 && sizeof(_ToT) == 4)
+	    {
+	      static_assert(std::is_integral_v<_FromT>);
+	      if constexpr (is_floating_point_v<_ToT>)
+		{
+		  const auto __ints
+		    = __convert_all<__vector_type16_t<int>, _Np>(
+		      __adjust(_SizeConstant<_Np * 4>(), __v));
+		  return __generate_from_n_evaluations<_Np, _R>([&](auto __i) {
+		    return __vector_convert<_To>(__ints[__i]);
+		  });
+		}
+	      else if constexpr (is_unsigned_v<_FromT>)
+		return __make_array(_mm_unpacklo_epi16(__vi, __m128i()),
+				    _mm_unpackhi_epi16(__vi, __m128i()));
+	      else
+		return __make_array(
+		  _mm_srai_epi32(_mm_unpacklo_epi16(__vi, __vi), 16),
+		  _mm_srai_epi32(_mm_unpackhi_epi16(__vi, __vi), 16));
+	    }
+	  else if constexpr (sizeof(_FromT) == 4 && sizeof(_ToT) == 8
+			     && is_integral_v<_FromT> && is_integral_v<_ToT>)
+	    {
+	      if constexpr (is_unsigned_v<_FromT>)
+		return __make_array(_mm_unpacklo_epi32(__vi, __m128i()),
+				    _mm_unpackhi_epi32(__vi, __m128i()));
+	      else
+		return __make_array(
+		  _mm_unpacklo_epi32(__vi, _mm_srai_epi32(__vi, 31)),
+		  _mm_unpackhi_epi32(__vi, _mm_srai_epi32(__vi, 31)));
+	    }
+	  else if constexpr (sizeof(_FromT) == 1 && sizeof(_ToT) >= 4
+			     && is_signed_v<_FromT>)
+	    {
+	      const __m128i __vv[2] = {_mm_unpacklo_epi8(__vi, __vi),
+				       _mm_unpackhi_epi8(__vi, __vi)};
+	      const __vector_type16_t<int> __vvvv[4]
+		= {__vector_bitcast<int>(_mm_unpacklo_epi16(__vv[0], __vv[0])),
+		   __vector_bitcast<int>(_mm_unpackhi_epi16(__vv[0], __vv[0])),
+		   __vector_bitcast<int>(_mm_unpacklo_epi16(__vv[1], __vv[1])),
+		   __vector_bitcast<int>(_mm_unpackhi_epi16(__vv[1], __vv[1]))};
+	      if constexpr (sizeof(_ToT) == 4)
+		return __generate_from_n_evaluations<_Np, _R>([&](auto __i) {
+		  return __vector_convert<_To>(__vvvv[__i] >> 24);
+		});
+	      else if constexpr (is_integral_v<_ToT>)
+		return __generate_from_n_evaluations<_Np, _R>([&](auto __i) {
+		  const auto __signbits = __to_intrin(__vvvv[__i / 2] >> 31);
+		  const auto __sx32 = __to_intrin(__vvvv[__i / 2] >> 24);
+		  return __vector_bitcast<_ToT>(
+		    __i % 2 == 0 ? _mm_unpacklo_epi32(__sx32, __signbits)
+				 : _mm_unpackhi_epi32(__sx32, __signbits));
+		});
+	      else
+		return __generate_from_n_evaluations<_Np, _R>([&](auto __i) {
+		  const auto __int4 = __vvvv[__i / 2] >> 24;
+		  return __vector_convert<_To>(
+		    __i % 2 == 0 ? __int4
+				 : __vector_bitcast<int>(
+				   _mm_unpackhi_epi64(__to_intrin(__int4),
+						      __to_intrin(__int4))));
+		});
+	    }
+	  else if constexpr (sizeof(_FromT) == 1 && sizeof(_ToT) == 4)
+	    {
+	      const auto __shorts = __convert_all<__vector_type16_t<
+		conditional_t<is_signed_v<_FromT>, short, unsigned short>>>(
+		__adjust(_SizeConstant<(_Np + 1) / 2 * 8>(), __v));
+	      return __generate_from_n_evaluations<_Np, _R>([&](auto __i) {
+		return __convert_all<_To>(__shorts[__i / 2])[__i % 2];
+	      });
+	    }
+	  else if constexpr (sizeof(_FromT) == 2 && sizeof(_ToT) == 8
+			     && is_signed_v<_FromT> && is_integral_v<_ToT>)
+	    {
+	      const __m128i __vv[2] = {_mm_unpacklo_epi16(__vi, __vi),
+				       _mm_unpackhi_epi16(__vi, __vi)};
+	      const __vector_type16_t<int> __vvvv[4]
+		= {__vector_bitcast<int>(
+		     _mm_unpacklo_epi32(_mm_srai_epi32(__vv[0], 16),
+					_mm_srai_epi32(__vv[0], 31))),
+		   __vector_bitcast<int>(
+		     _mm_unpackhi_epi32(_mm_srai_epi32(__vv[0], 16),
+					_mm_srai_epi32(__vv[0], 31))),
+		   __vector_bitcast<int>(
+		     _mm_unpacklo_epi32(_mm_srai_epi32(__vv[1], 16),
+					_mm_srai_epi32(__vv[1], 31))),
+		   __vector_bitcast<int>(
+		     _mm_unpackhi_epi32(_mm_srai_epi32(__vv[1], 16),
+					_mm_srai_epi32(__vv[1], 31)))};
+	      return __generate_from_n_evaluations<_Np, _R>(
+		[&](auto __i) { return __vector_bitcast<_ToT>(__vvvv[__i]); });
+	    }
+	  else if constexpr (sizeof(_FromT) <= 2 && sizeof(_ToT) == 8)
+	    {
+	      const auto __ints = __convert_all<__vector_type16_t<
+		conditional_t<is_signed_v<_FromT> || is_floating_point_v<_ToT>,
+			      int, unsigned int>>>(
+		__adjust(_SizeConstant<(_Np + 1) / 2 * 4>(), __v));
+	      return __generate_from_n_evaluations<_Np, _R>([&](auto __i) {
+		return __convert_all<_To>(__ints[__i / 2])[__i % 2];
+	      });
+	    }
+	  else
+	    __assert_unreachable<_To>();
+	}
+#endif // _GLIBCXX_SIMD_X86INTRIN }}}
+      else if constexpr ((_FromVT::_S_partial_width - _Offset)
+			 > _ToVT::_S_width)
+	{
+	  /*
+	  static_assert(
+	    (_FromVT::_S_partial_width & (_FromVT::_S_partial_width - 1)) == 0,
+	    "__convert_all only supports power-of-2 number of elements. "
+	    "Otherwise the return type cannot be std::array<_To, N>.");
+	  */
+	  constexpr size_t _NTotal
+	    = (_FromVT::_S_partial_width - _Offset) / _ToVT::_S_width;
+	  constexpr size_t _Np = _NParts == 0 ? _NTotal : _NParts;
+	  static_assert(
+	    _Np <= _NTotal
+	    || (_Np == _NTotal + 1
+		&& (_FromVT::_S_partial_width - _Offset) % _ToVT::_S_width
+		     > 0));
+	  using _R = std::array<_To, _Np>;
+	  if constexpr (_Np == 1)
+	    return _R{__vector_convert<_To>(
+	      __as_vector(__extract_part<_Offset, _FromVT::_S_partial_width,
+					 _ToVT::_S_width>(__v)))};
+	  else
+	    return __generate_from_n_evaluations<_Np, _R>([&](
+	      auto __i) constexpr {
+	      auto __part
+		= __extract_part<__i * _ToVT::_S_width + _Offset,
+				 _FromVT::_S_partial_width, _ToVT::_S_width>(
+		  __v);
+	      return __vector_convert<_To>(__part);
+	    });
+	}
+      else if constexpr (_Offset == 0)
+	return std::array<_To, 1>{__vector_convert<_To>(__as_vector(__v))};
+      else
+	return std::array<_To, 1>{__vector_convert<_To>(__as_vector(
+	  __extract_part<_Offset, _FromVT::_S_partial_width,
+			 _FromVT::_S_partial_width - _Offset>(__v)))};
+    }
+}
+
+// }}}
+
+// _GnuTraits {{{
+template <typename _Tp, typename _Mp, typename _Abi, size_t _Np>
+struct _GnuTraits
+{
+  using _IsValid = true_type;
+  using _SimdImpl = typename _Abi::_SimdImpl;
+  using _MaskImpl = typename _Abi::_MaskImpl;
+
+  // simd and simd_mask member types {{{
+  using _SimdMember = _SimdWrapper<_Tp, _Np>;
+  using _MaskMember = _SimdWrapper<_Mp, _Np>;
+  static constexpr size_t _S_simd_align = alignof(_SimdMember);
+  static constexpr size_t _S_mask_align = alignof(_MaskMember);
+
+  // }}}
+  // _SimdBase / base class for simd, providing extra conversions {{{
+  struct _SimdBase2
+  {
+    explicit operator __intrinsic_type_t<_Tp, _Np>() const
+    {
+      return __to_intrin(static_cast<const simd<_Tp, _Abi>*>(this)->_M_data);
+    }
+    explicit operator __vector_type_t<_Tp, _Np>() const
+    {
+      return static_cast<const simd<_Tp, _Abi>*>(this)->_M_data.__builtin();
+    }
+  };
+  struct _SimdBase1
+  {
+    explicit operator __intrinsic_type_t<_Tp, _Np>() const
+    {
+      return __data(*static_cast<const simd<_Tp, _Abi>*>(this));
+    }
+  };
+  using _SimdBase
+    = std::conditional_t<std::is_same<__intrinsic_type_t<_Tp, _Np>,
+				      __vector_type_t<_Tp, _Np>>::value,
+			 _SimdBase1, _SimdBase2>;
+
+  // }}}
+  // _MaskBase {{{
+  struct _MaskBase2
+  {
+    explicit operator __intrinsic_type_t<_Tp, _Np>() const
+    {
+      return static_cast<const simd_mask<_Tp, _Abi>*>(this)->_M_data.__intrin();
+    }
+    explicit operator __vector_type_t<_Tp, _Np>() const
+    {
+      return static_cast<const simd_mask<_Tp, _Abi>*>(this)->_M_data._M_data;
+    }
+  };
+  struct _MaskBase1
+  {
+    explicit operator __intrinsic_type_t<_Tp, _Np>() const
+    {
+      return __data(*static_cast<const simd_mask<_Tp, _Abi>*>(this));
+    }
+  };
+  using _MaskBase
+    = std::conditional_t<std::is_same<__intrinsic_type_t<_Tp, _Np>,
+				      __vector_type_t<_Tp, _Np>>::value,
+			 _MaskBase1, _MaskBase2>;
+
+  // }}}
+  // _MaskCastType {{{
+  // parameter type of one explicit simd_mask constructor
+  class _MaskCastType
+  {
+    using _Up = __intrinsic_type_t<_Tp, _Np>;
+    _Up _M_data;
+
+  public:
+    _MaskCastType(_Up __x) : _M_data(__x) {}
+    operator _MaskMember() const { return _M_data; }
+  };
+
+  // }}}
+  // _SimdCastType {{{
+  // parameter type of one explicit simd constructor
+  class _SimdCastType1
+  {
+    using _Ap = __intrinsic_type_t<_Tp, _Np>;
+    _SimdMember _M_data;
+
+  public:
+    _SimdCastType1(_Ap __a) : _M_data(__vector_bitcast<_Tp>(__a)) {}
+    operator _SimdMember() const { return _M_data; }
+  };
+
+  class _SimdCastType2
+  {
+    using _Ap = __intrinsic_type_t<_Tp, _Np>;
+    using _B = __vector_type_t<_Tp, _Np>;
+    _SimdMember _M_data;
+
+  public:
+    _SimdCastType2(_Ap __a) : _M_data(__vector_bitcast<_Tp>(__a)) {}
+    _SimdCastType2(_B __b) : _M_data(__b) {}
+    operator _SimdMember() const { return _M_data; }
+  };
+
+  using _SimdCastType
+    = std::conditional_t<std::is_same<__intrinsic_type_t<_Tp, _Np>,
+				      __vector_type_t<_Tp, _Np>>::value,
+			 _SimdCastType1, _SimdCastType2>;
+  //}}}
+};
+
+// }}}
+struct _CommonImplX86;
+struct _CommonImplNeon;
+struct _CommonImplBuiltin;
+template <typename _Abi> struct _SimdImplBuiltin;
+template <typename _Abi> struct _MaskImplBuiltin;
+template <typename _Abi> struct _SimdImplX86;
+template <typename _Abi> struct _MaskImplX86;
+template <typename _Abi> struct _SimdImplNeon;
+template <typename _Abi> struct _MaskImplNeon;
+// simd_abi::_VecBuiltin {{{
+template <int _UsedBytes> struct simd_abi::_VecBuiltin
+{
+  template <typename _Tp>
+  static constexpr size_t size = _UsedBytes / sizeof(_Tp);
+  template <typename _Tp>
+  static constexpr size_t _S_full_size
+    = sizeof(__vector_type_t<_Tp, size<_Tp>>) / sizeof(_Tp);
+  static constexpr bool _S_is_partial = (_UsedBytes & (_UsedBytes - 1)) != 0;
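+  // Worked example (illustrative, assuming 4-byte float): _UsedBytes == 12
+  // gives size<float> == 3 and (12 & 11) == 8, so _S_is_partial is true;
+  // _UsedBytes == 16 gives size<float> == 4 and (16 & 15) == 0: not partial.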
+
+  // validity traits {{{
+  struct _IsValidAbiTag : __bool_constant<(_UsedBytes > 1)>
+  {
+  };
+
+  template <typename _Tp>
+  struct _IsValidSizeFor
+    : std::conjunction<
+	__bool_constant<(_UsedBytes / sizeof(_Tp) > 1
+			 && _UsedBytes % sizeof(_Tp) == 0)>,
+	__bool_constant<(_UsedBytes <= __vectorized_sizeof<_Tp>())>>
+  {
+  };
+  template <typename _Tp>
+  struct _IsValid : std::conjunction<_IsValidAbiTag, __is_vectorizable<_Tp>,
+				     _IsValidSizeFor<_Tp>>
+  {
+  };
+  template <typename _Tp>
+  static constexpr bool _S_is_valid_v = _IsValid<_Tp>::value;
+
+  // }}}
+  // _SimdImpl/_MaskImpl {{{
+#if _GLIBCXX_SIMD_X86INTRIN
+  using _CommonImpl = _CommonImplX86;
+  using _SimdImpl = _SimdImplX86<_VecBuiltin<_UsedBytes>>;
+  using _MaskImpl = _MaskImplX86<_VecBuiltin<_UsedBytes>>;
+#elif _GLIBCXX_SIMD_HAVE_NEON
+  using _CommonImpl = _CommonImplNeon;
+  using _SimdImpl = _SimdImplNeon<_VecBuiltin<_UsedBytes>>;
+  using _MaskImpl = _MaskImplNeon<_VecBuiltin<_UsedBytes>>;
+#else
+  using _CommonImpl = _CommonImplBuiltin;
+  using _SimdImpl = _SimdImplBuiltin<_VecBuiltin<_UsedBytes>>;
+  using _MaskImpl = _MaskImplBuiltin<_VecBuiltin<_UsedBytes>>;
+#endif
+
+  // }}}
+  // __traits {{{
+  template <typename _Tp>
+  using __traits = std::conditional_t<
+    _S_is_valid_v<_Tp>,
+    _GnuTraits<_Tp, _Tp, _VecBuiltin<_UsedBytes>, size<_Tp>>, _InvalidTraits>;
+  //}}}
+  // implicit masks {{{
+  template <typename _Tp>
+  static constexpr _SimdWrapper<_Tp, size<_Tp>> __implicit_mask()
+  {
+    constexpr auto __size = _S_full_size<_Tp>;
+    using _ImplicitMask = __vector_type_t<__int_for_sizeof_t<_Tp>, __size>;
+    return reinterpret_cast<__vector_type_t<_Tp, __size>>(
+      !_S_is_partial ? ~_ImplicitMask()
+		     : __generate_vector<_ImplicitMask>([](auto __i) constexpr {
+			 return __i < _UsedBytes / sizeof(_Tp) ? -1 : 0;
+		       }));
+  }
+
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  static constexpr _Tp __masked(_Tp __x)
+  {
+    using _Up = typename _TVT::value_type;
+    if constexpr (_S_is_partial)
+      return __and(__as_vector(__x), __implicit_mask<_Up>()._M_data);
+    else
+      return __x;
+  }
+
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  static constexpr auto __make_padding_nonzero(_Tp __x)
+  {
+    if constexpr (!_S_is_partial)
+      return __x;
+    else
+      {
+	using _Up = typename _TVT::value_type;
+	if constexpr (std::is_integral_v<_Up>)
+	  return __or(__x, ~__implicit_mask<_Up>()._M_data);
+	else
+	  {
+	    constexpr auto __one
+	      = __andnot(__implicit_mask<_Up>()._M_data,
+			 __vector_broadcast<_S_full_size<_Up>>(_Up(1)));
+	    return __or(__x, __one);
+	  }
+      }
+  }
+  // }}}
+};
+
+// }}}
+// simd_abi::_VecBltnBtmsk {{{
+template <int _UsedBytes> struct simd_abi::_VecBltnBtmsk
+{
+  template <typename _Tp>
+  static constexpr size_t size = _UsedBytes / sizeof(_Tp);
+  template <typename _Tp>
+  static constexpr size_t _S_full_size
+    = sizeof(__vector_type_t<_Tp, size<_Tp>>) / sizeof(_Tp);
+  static constexpr bool _S_is_partial = (_UsedBytes & (_UsedBytes - 1)) != 0;
+
+  // validity traits {{{
+  struct _IsValidAbiTag : __bool_constant<(_UsedBytes > 1)>
+  {
+  };
+  template <typename _Tp>
+  struct _IsValidSizeFor
+    : __bool_constant<(_UsedBytes / sizeof(_Tp) > 1
+		       && _UsedBytes % sizeof(_Tp) == 0 && _UsedBytes <= 64
+		       && (_UsedBytes > 32 || __have_avx512vl))>
+  {
+  };
+  // Bitmasks require at least AVX512F. If sizeof(_Tp) < 4, AVX512BW is also
+  // required.
+  template <typename _Tp>
+  struct _IsValid
+    : conjunction<_IsValidAbiTag, __bool_constant<__have_avx512f>,
+		  __bool_constant<__have_avx512bw || (sizeof(_Tp) >= 4)>,
+		  __bool_constant<(__vectorized_sizeof<_Tp>() > sizeof(_Tp))>,
+		  _IsValidSizeFor<_Tp>>
+  {
+  };
+  template <typename _Tp>
+  static constexpr bool _S_is_valid_v = _IsValid<_Tp>::value;
+
+  // }}}
+  // implicit mask {{{
+private:
+  template <typename _Tp> using _ImplicitMask = _SimdWrapper<bool, size<_Tp>>;
+
+public:
+  template <size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr __bool_storage_member_type_t<_Np>
+  __implicit_mask_n()
+  {
+    using _Tp = __bool_storage_member_type_t<_Np>;
+    return _Np < sizeof(_Tp) * CHAR_BIT ? _Tp((1ULL << _Np) - 1) : ~_Tp();
+  }
+
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _ImplicitMask<_Tp> __implicit_mask()
+  {
+    return __implicit_mask_n<size<_Tp>>();
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __masked(_SimdWrapper<_Tp, _Np> __x)
+  {
+    if constexpr (is_same_v<_Tp, bool>)
+      if constexpr (_S_is_partial || _Np < 8)
+	return _MaskImpl::__bit_and(__x, _SimdWrapper<_Tp, _Np>(
+					   __bool_storage_member_type_t<_Np>(
+					     (1ULL << _Np) - 1)));
+      else
+	return __x;
+    else
+      return __masked(__x._M_data);
+  }
+
+  template <typename _TV>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _TV __masked(_TV __x)
+  {
+    static_assert(
+      !__is_bitmask_v<_TV>,
+      "_VecBltnBtmsk::__masked cannot work on bitmasks, since it doesn't "
+      "know the number of elements. Use _SimdWrapper<bool, N> instead.");
+    if constexpr (_S_is_partial)
+      {
+	using _Tp = typename _VectorTraits<_TV>::value_type;
+	constexpr size_t _Np = size<_Tp>;
+	return __make_dependent_t<_TV, _CommonImpl>::_S_blend(
+	  __implicit_mask<_Tp>(), _SimdWrapper<_Tp, _Np>(),
+	  _SimdWrapper<_Tp, _Np>(__x));
+      }
+    else
+      return __x;
+  }
+
+  template <typename _TV, typename _TVT = _VectorTraits<_TV>>
+  static constexpr auto __make_padding_nonzero(_TV __x)
+  {
+    if constexpr (!_S_is_partial)
+      return __x;
+    else
+      {
+	using _Tp = typename _TVT::value_type;
+	constexpr size_t _Np = size<_Tp>;
+	if constexpr (is_integral_v<typename _TVT::value_type>)
+	  return __x
+		 | __generate_vector<_Tp, _S_full_size<_Tp>>(
+		   [](auto __i) -> _Tp {
+		     if (__i < _Np)
+		       return 0;
+		     else
+		       return 1;
+		   });
+	else
+	  return __make_dependent_t<_TV, _CommonImpl>::_S_blend(
+		   __implicit_mask<_Tp>(),
+		   _SimdWrapper<_Tp, _Np>(
+		     __vector_broadcast<_S_full_size<_Tp>>(_Tp(1))),
+		   _SimdWrapper<_Tp, _Np>(__x))
+	    ._M_data;
+      }
+  }
+
+  // }}}
+  // _SimdImpl/_MaskImpl {{{
+#if _GLIBCXX_SIMD_X86INTRIN
+  using _CommonImpl = _CommonImplX86;
+  using _SimdImpl = _SimdImplX86<_VecBltnBtmsk<_UsedBytes>>;
+  using _MaskImpl = _MaskImplX86<_VecBltnBtmsk<_UsedBytes>>;
+#else
+  template <int> struct _MissingImpl;
+  using _CommonImpl = _MissingImpl<_UsedBytes>;
+  using _SimdImpl = _MissingImpl<_UsedBytes>;
+  using _MaskImpl = _MissingImpl<_UsedBytes>;
+#endif
+
+  // }}}
+  // __traits {{{
+  template <typename _Tp>
+  using __traits = std::conditional_t<
+    _S_is_valid_v<_Tp>,
+    _GnuTraits<_Tp, bool, _VecBltnBtmsk<_UsedBytes>, size<_Tp>>,
+    _InvalidTraits>;
+  //}}}
+};
+
+//}}}
+// _CommonImplBuiltin {{{
+struct _CommonImplBuiltin
+{
+  // __converts_via_decomposition{{{
+  // This lists all cases where a __vector_convert needs to fall back to
+  // conversion of individual scalars (i.e. decompose the input vector into
+  // scalars, convert, compose output vector). In those cases, __masked_load &
+  // __masked_store prefer to use the __bit_iteration implementation.
+  template <typename _From, typename _To, size_t _ToSize>
+  static inline constexpr bool __converts_via_decomposition_v
+    = sizeof(_From) != sizeof(_To);
+
+  // }}}
+  // _S_load{{{
+  template <typename _Tp, size_t _Np, size_t _M = _Np * sizeof(_Tp),
+	    typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static __vector_type_t<_Tp, _Np>
+  _S_load(const void* __p, _Fp)
+  {
+    static_assert(_Np > 1);
+    static_assert(_M % sizeof(_Tp) == 0);
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR90424
+    using _Up = conditional_t<
+      is_integral_v<_Tp>,
+      conditional_t<_M % 4 == 0, conditional_t<_M % 8 == 0, long long, int>,
+		    conditional_t<_M % 2 == 0, short, signed char>>,
+      conditional_t<(_M < 8 || _Np % 2 == 1 || _Np == 2), _Tp, double>>;
+    using _V = __vector_type_t<_Up, _Np * sizeof(_Tp) / sizeof(_Up)>;
+#else  // _GLIBCXX_SIMD_WORKAROUND_PR90424
+    using _V = __vector_type_t<_Tp, _Np>;
+#endif // _GLIBCXX_SIMD_WORKAROUND_PR90424
+    _V __r{};
+    static_assert(_M <= sizeof(_V));
+    if constexpr (std::is_same_v<_Fp, vector_aligned_tag>)
+      __p = __builtin_assume_aligned(__p, alignof(__vector_type_t<_Tp, _Np>));
+    else if constexpr (!std::is_same_v<_Fp, element_aligned_tag>)
+      __p = __builtin_assume_aligned(__p, _Fp::_S_alignment);
+
+    __builtin_memcpy(&__r, __p, _M);
+    return reinterpret_cast<__vector_type_t<_Tp, _Np>>(__r);
+  }
+
+  // }}}
+  // __store {{{
+  template <size_t _ReqBytes = 0, typename _Flags, typename _TV>
+  _GLIBCXX_SIMD_INTRINSIC static void __store(_TV __x, void* __addr, _Flags)
+  {
+    constexpr size_t _Bytes = _ReqBytes == 0 ? sizeof(__x) : _ReqBytes;
+    static_assert(sizeof(__x) >= _Bytes);
+
+    if constexpr (std::is_same_v<_Flags, vector_aligned_tag>)
+      __addr = __builtin_assume_aligned(__addr, alignof(_TV));
+    else if constexpr (!std::is_same_v<_Flags, element_aligned_tag>)
+      __addr = __builtin_assume_aligned(__addr, _Flags::_S_alignment);
+
+    if constexpr (__is_vector_type_v<_TV>)
+      {
+	using _Tp = typename _VectorTraits<_TV>::value_type;
+	constexpr size_t _Np = _Bytes / sizeof(_Tp);
+	static_assert(_Np * sizeof(_Tp) == _Bytes);
+
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR90424
+	using _Up = std::conditional_t<
+	  (std::is_integral_v<_Tp> || _Bytes < 4),
+	  std::conditional_t<(sizeof(__x) > sizeof(long long)), long long, _Tp>,
+	  float>;
+	const auto __v = __vector_bitcast<_Up>(__x);
+#else  // _GLIBCXX_SIMD_WORKAROUND_PR90424
+	const __vector_type_t<_Tp, _Np> __v = __x;
+#endif // _GLIBCXX_SIMD_WORKAROUND_PR90424
+
+	if constexpr ((_Bytes & (_Bytes - 1)) != 0)
+	  {
+	    constexpr size_t _MoreBytes = __next_power_of_2(_Bytes);
+	    alignas(decltype(__v)) char __tmp[_MoreBytes];
+	    __builtin_memcpy(__tmp, &__v, _MoreBytes);
+	    __builtin_memcpy(__addr, __tmp, _Bytes);
+	  }
+	else
+	  __builtin_memcpy(__addr, &__v, _Bytes);
+      }
+    else
+      __builtin_memcpy(__addr, &__x, _Bytes);
+  }
+
+  template <typename _Flags, typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static void __store(_SimdWrapper<_Tp, _Np> __x,
+					      void* __addr, _Flags)
+  {
+    __store<_Np * sizeof(_Tp)>(__x._M_data, __addr, _Flags());
+  }
+
+  // }}}
+  // __store_bool_array(_BitMask) {{{
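+  // Worked example (illustrative) of the multiply-and-mask expansion below,
+  // which turns bit k of the mask into byte k of the result (bool k in memory
+  // on a little-endian target). For 3 mask bits 0b101:
+  //   0b101 * 0x4081 == 0b101 + (0b101 << 7) + (0b101 << 14) == 0x14285
+  //   0x14285 & 0x010101 == 0x010001, i.e. the bytes {1, 0, 1}
+  // Copy j is shifted by 7 * j, so its bit k lands at 7 * j + k; for j == k
+  // that is bit 8 * k, and copies of up to 7 bits never overlap or carry.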
+  template <size_t _Np, typename _Flags, bool _Sanitized>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr void
+  __store_bool_array(_BitMask<_Np, _Sanitized> __x, bool* __mem, _Flags)
+  {
+    if constexpr (_Np == 1)
+      __mem[0] = __x[0];
+    else if constexpr (_Np == 2)
+      {
+	short __bool2 = (__x._M_to_bits() * 0x81) & 0x0101;
+	__store<_Np>(__bool2, __mem, _Flags());
+      }
+    else if constexpr (_Np == 3)
+      {
+	int __bool3 = (__x._M_to_bits() * 0x4081) & 0x010101;
+	__store<_Np>(__bool3, __mem, _Flags());
+      }
+    else
+      {
+	__execute_n_times<__div_roundup(_Np, 4)>([&](auto __i) {
+	  constexpr int __offset = __i * 4;
+	  constexpr int __remaining = _Np - __offset;
+	  if constexpr (__remaining > 4 && __remaining <= 7)
+	    {
+	      const _ULLong __bool7
+		= (__x.template _M_extract<__offset>()._M_to_bits()
+		   * 0x40810204081ULL)
+		  & 0x0101010101010101ULL;
+	      __store<__remaining>(__bool7, __mem + __offset, _Flags());
+	    }
+	  else if constexpr (__remaining >= 4)
+	    {
+	      int __bits = __x.template _M_extract<__offset>()._M_to_bits();
+	      if constexpr (__remaining > 7)
+		__bits &= 0xf;
+	      const int __bool4 = (__bits * 0x204081) & 0x01010101;
+	      __store<4>(__bool4, __mem + __offset, _Flags());
+	    }
+	});
+      }
+  }
+
+  // }}}
+  // _S_blend{{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr auto
+  _S_blend(_SimdWrapper<_Tp, _Np> __k, _SimdWrapper<_Tp, _Np> __at0,
+	   _SimdWrapper<_Tp, _Np> __at1)
+  {
+    return __vector_bitcast<__int_for_sizeof_t<_Tp>>(__k) ? __at1._M_data
+							  : __at0._M_data;
+  }
+
+  // }}}
+};
+
+// }}}
+// _SimdImplBuiltin {{{1
+template <typename _Abi> struct _SimdImplBuiltin
+{
+  // member types {{{2
+  template <typename _Tp> static constexpr size_t _S_max_store_size = 16;
+  using abi_type = _Abi;
+  template <typename _Tp> using _TypeTag = _Tp*;
+  template <typename _Tp>
+  using _SimdMember = typename _Abi::template __traits<_Tp>::_SimdMember;
+  template <typename _Tp>
+  using _MaskMember = typename _Abi::template __traits<_Tp>::_MaskMember;
+  template <typename _Tp>
+  static constexpr size_t _S_size = _Abi::template size<_Tp>;
+  template <typename _Tp>
+  static constexpr size_t _S_full_size = _Abi::template _S_full_size<_Tp>;
+  using _CommonImpl = typename _Abi::_CommonImpl;
+  using _SuperImpl = typename _Abi::_SimdImpl;
+  using _MaskImpl = typename _Abi::_MaskImpl;
+
+  // __make_simd(_SimdWrapper/__intrinsic_type_t) {{{2
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static simd<_Tp, _Abi>
+  __make_simd(_SimdWrapper<_Tp, _Np> __x)
+  {
+    return {__private_init, __x};
+  }
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static simd<_Tp, _Abi>
+  __make_simd(__intrinsic_type_t<_Tp, _Np> __x)
+  {
+    return {__private_init, __vector_bitcast<_Tp>(__x)};
+  }
+
+  // __broadcast {{{2
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdMember<_Tp>
+  __broadcast(_Tp __x) noexcept
+  {
+    return __vector_broadcast<_S_full_size<_Tp>>(__x);
+  }
+
+  // __generator {{{2
+  template <typename _Fp, typename _Tp>
+  inline static constexpr _SimdMember<_Tp> __generator(_Fp&& __gen,
+						       _TypeTag<_Tp>)
+  {
+    return __generate_vector<_Tp, _S_full_size<_Tp>>([&](auto __i) constexpr {
+      if constexpr (__i < _S_size<_Tp>)
+	return __gen(__i);
+      else
+	return 0;
+    });
+  }
+
+  // __load {{{2
+  template <typename _Tp, typename _Up, typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdMember<_Tp> __load(const _Up* __mem, _Fp,
+							 _TypeTag<_Tp>) noexcept
+  {
+    constexpr size_t _Np = _S_size<_Tp>;
+    constexpr size_t __max_load_size
+      = (sizeof(_Up) >= 4 && __have_avx512f) || __have_avx512bw
+	  ? 64
+	  : (std::is_floating_point_v<_Up> && __have_avx) || __have_avx2 ? 32
+									 : 16;
+    constexpr size_t __bytes_to_load = sizeof(_Up) * _Np;
+    if constexpr (sizeof(_Up) > 8)
+      return __generate_vector<_Tp, _SimdMember<_Tp>::_S_width>([&](
+	auto __i) constexpr {
+	return static_cast<_Tp>(__i < _Np ? __mem[__i] : 0);
+      });
+    else if constexpr (std::is_same_v<_Up, _Tp>)
+      return _CommonImpl::template _S_load<_Tp, _S_full_size<_Tp>,
+					   _Np * sizeof(_Tp)>(__mem, _Fp());
+    else if constexpr (__bytes_to_load <= __max_load_size)
+      return __convert<_SimdMember<_Tp>>(
+	_CommonImpl::template _S_load<_Up, _Np>(__mem, _Fp()));
+    else if constexpr (__bytes_to_load % __max_load_size == 0)
+      {
+	constexpr size_t __n_loads = __bytes_to_load / __max_load_size;
+	constexpr size_t __elements_per_load = _Np / __n_loads;
+	return __call_with_n_evaluations<__n_loads>(
+	  [](auto... __uncvted) {
+	    return __convert<_SimdMember<_Tp>>(__uncvted...);
+	  },
+	  [&](auto __i) {
+	    return _CommonImpl::template _S_load<_Up, __elements_per_load>(
+	      __mem + __i * __elements_per_load, _Fp());
+	  });
+      }
+    else if constexpr (__bytes_to_load % (__max_load_size / 2) == 0
+		       && __max_load_size > 16)
+      { // e.g. int[] -> <char, 12> with AVX2
+	constexpr size_t __n_loads = __bytes_to_load / (__max_load_size / 2);
+	constexpr size_t __elements_per_load = _Np / __n_loads;
+	return __call_with_n_evaluations<__n_loads>(
+	  [](auto... __uncvted) {
+	    return __convert<_SimdMember<_Tp>>(__uncvted...);
+	  },
+	  [&](auto __i) {
+	    return _CommonImpl::template _S_load<_Up, __elements_per_load>(
+	      __mem + __i * __elements_per_load, _Fp());
+	  });
+      }
+    else // e.g. int[] -> <char, 9>
+      return __call_with_subscripts(
+	__mem, make_index_sequence<_Np>(), [](auto... __args) {
+	  return __vector_type_t<_Tp, _S_full_size<_Tp>>{
+	    static_cast<_Tp>(__args)...};
+	});
+  }
+
+  // __masked_load {{{2
+  template <typename _Tp, size_t _Np, typename _Up, typename _Fp>
+  static inline _SimdWrapper<_Tp, _Np>
+  __masked_load(_SimdWrapper<_Tp, _Np> __merge, _MaskMember<_Tp> __k,
+		const _Up* __mem, _Fp) noexcept
+  {
+    _BitOps::__bit_iteration(_MaskImpl::__to_bits(__k), [&](auto __i) {
+      __merge.__set(__i, static_cast<_Tp>(__mem[__i]));
+    });
+    return __merge;
+  }
+
+  // __store {{{2
+  template <typename _Tp, typename _Up, typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static void __store(_SimdMember<_Tp> __v, _Up* __mem,
+					      _Fp, _TypeTag<_Tp>) noexcept
+  {
+    // TODO: converting int -> "smaller int" can be optimized with AVX512
+    constexpr size_t _Np = _S_size<_Tp>;
+    constexpr size_t __max_store_size
+      = _SuperImpl::template _S_max_store_size<_Up>;
+    if constexpr (sizeof(_Up) > 8)
+      __execute_n_times<_Np>([&](auto __i) constexpr {
+	__mem[__i] = __v[__i];
+      });
+    else if constexpr (std::is_same_v<_Up, _Tp>)
+      _CommonImpl::__store(__v, __mem, _Fp());
+    else if constexpr (sizeof(_Up) * _Np <= __max_store_size)
+      _CommonImpl::__store(_SimdWrapper<_Up, _Np>(__convert<_Up>(__v)), __mem,
+			   _Fp());
+    else
+      {
+	constexpr size_t __vsize = __max_store_size / sizeof(_Up);
+	// round up to convert the last partial vector as well:
+	constexpr size_t __stores = __div_roundup(_Np, __vsize);
+	constexpr size_t __full_stores = _Np / __vsize;
+	using _V = __vector_type_t<_Up, __vsize>;
+	const std::array<_V, __stores> __converted
+	  = __convert_all<_V, __stores>(__v);
+	__execute_n_times<__full_stores>([&](auto __i) constexpr {
+	  _CommonImpl::__store(__converted[__i], __mem + __i * __vsize, _Fp());
+	});
+	if constexpr (__full_stores < __stores)
+	  _CommonImpl::template __store<(_Np - __full_stores * __vsize)
+					* sizeof(_Up)>(
+	    __converted[__full_stores], __mem + __full_stores * __vsize, _Fp());
+      }
+  }
+
+  // __masked_store_nocvt {{{2
+  template <typename _Tp, std::size_t _Np, typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_store_nocvt(_SimdWrapper<_Tp, _Np> __v, _Tp* __mem, _Fp,
+		       _SimdWrapper<_Tp, _Np> __k)
+  {
+    _BitOps::__bit_iteration(
+      _MaskImpl::__to_bits(__k), [&](auto __i) constexpr {
+	__mem[__i] = __v[__i];
+      });
+  }
+
+  // __masked_store {{{2
+  template <typename _TW, typename _TVT = _VectorTraits<_TW>,
+	    typename _Tp = typename _TVT::value_type, typename _Up,
+	    typename _Fp>
+  static inline void __masked_store(const _TW __v, _Up* __mem, _Fp,
+				    const _MaskMember<_Tp> __k) noexcept
+  {
+    constexpr size_t _TV_size = _S_size<_Tp>;
+    [[maybe_unused]] const auto __vi = __to_intrin(__v);
+    constexpr size_t __max_store_size
+      = _SuperImpl::template _S_max_store_size<_Up>;
+    if constexpr (
+      std::is_same_v<_Tp, _Up>
+      || (std::is_integral_v<_Tp> && std::is_integral_v<_Up>
+	  && sizeof(_Tp) == sizeof(_Up)))
+      {
+	// bitwise or no conversion, reinterpret:
+	const auto __kk = [&]() {
+	  if constexpr (__is_bitmask_v<decltype(__k)>)
+	    return _MaskMember<_Up>(__k._M_data);
+	  else
+	    return __wrapper_bitcast<_Up>(__k);
+	}();
+	_SuperImpl::__masked_store_nocvt(__wrapper_bitcast<_Up>(__v), __mem,
+					 _Fp(), __kk);
+      }
+    else if constexpr (__vectorized_sizeof<_Up>() > sizeof(_Up)
+		       && !_CommonImpl::template __converts_via_decomposition_v<
+			  _Tp, _Up, __max_store_size>)
+      { // conversion via decomposition is better handled via the bit_iteration
+	// fallback below
+	constexpr size_t _UW_size
+	  = std::min(_TV_size, __max_store_size / sizeof(_Up));
+	static_assert(_UW_size <= _TV_size);
+	using _UW = _SimdWrapper<_Up, _UW_size>;
+	using _UV = __vector_type_t<_Up, _UW_size>;
+	using _UAbi = simd_abi::deduce_t<_Up, _UW_size>;
+	if constexpr (_UW_size == _TV_size) // one convert+store
+	  {
+	    const _UW __converted = __convert<_UW>(__v);
+	    _SuperImpl::__masked_store_nocvt(
+	      __converted, __mem, _Fp(),
+	      _UAbi::_MaskImpl::template __convert<_Up>(__k));
+	  }
+	else
+	  {
+	    static_assert(_UW_size * sizeof(_Up) == __max_store_size);
+	    constexpr size_t _NFullStores = _TV_size / _UW_size;
+	    constexpr size_t _NAllStores = __div_roundup(_TV_size, _UW_size);
+	    constexpr size_t _NParts = _S_full_size<_Tp> / _UW_size;
+	    const std::array<_UV, _NAllStores> __converted
+	      = __convert_all<_UV, _NAllStores>(__v);
+	    __execute_n_times<_NFullStores>([&](auto __i) {
+	      _SuperImpl::__masked_store_nocvt(
+		_UW(__converted[__i]), __mem + __i * _UW_size, _Fp(),
+		_UAbi::_MaskImpl::template __convert<_Up>(
+		  __extract_part<__i, _NParts>(__k.__as_full_vector())));
+	    });
+	    if constexpr (_NAllStores > _NFullStores) // one partial at the end
+	      _SuperImpl::__masked_store_nocvt(
+		_UW(__converted[_NFullStores]), __mem + _NFullStores * _UW_size,
+		_Fp(),
+		_UAbi::_MaskImpl::template __convert<_Up>(
+		  __extract_part<_NFullStores, _NParts>(
+		    __k.__as_full_vector())));
+	  }
+      }
+    else
+      _BitOps::__bit_iteration(
+	_MaskImpl::__to_bits(__k), [&](auto __i) constexpr {
+	  __mem[__i] = static_cast<_Up>(__v[__i]);
+	});
+  }
+
+  // __complement {{{2
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __complement(_SimdWrapper<_Tp, _Np> __x) noexcept
+  {
+    return ~__x._M_data;
+  }
+
+  // __unary_minus {{{2
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __unary_minus(_SimdWrapper<_Tp, _Np> __x) noexcept
+  {
+    // GCC doesn't use the psign instructions, but pxor & psub seem to be just
+    // as good a choice as pcmpeqd & psign, so plain negation is fine here.
+    return -__x._M_data;
+  }
+
+  // arithmetic operators {{{2
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __plus(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    return __x._M_data + __y._M_data;
+  }
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __minus(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    return __x._M_data - __y._M_data;
+  }
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __multiplies(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    return __x._M_data * __y._M_data;
+  }
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __divides(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    // Division by 0 is always UB, so for partial registers the padding
+    // elements of the divisor must be made nonzero before dividing.
+    if constexpr (!_Abi::_S_is_partial)
+      return __x._M_data / __y._M_data;
+    else
+      return __as_vector(__x) / _Abi::__make_padding_nonzero(__as_vector(__y));
+  }
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __modulus(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    if constexpr (!_Abi::_S_is_partial)
+      return __x._M_data % __y._M_data;
+    else
+      return __as_vector(__x) % _Abi::__make_padding_nonzero(__as_vector(__y));
+  }
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __bit_and(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    return __and(__x._M_data, __y._M_data);
+  }
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __bit_or(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    return __or(__x._M_data, __y._M_data);
+  }
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __bit_xor(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    return __xor(__x._M_data, __y._M_data);
+  }
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+  __bit_shift_left(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    return __x._M_data << __y._M_data;
+  }
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+  __bit_shift_right(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+#ifdef _GLIBCXX_SIMD_WORKAROUND_XXX_5
+    if constexpr (sizeof(_Tp) == 8)
+      return __generate_vector<__vector_type_t<_Tp, _Np>>([&](auto __i) {
+	return __x._M_data[__i.value] >> __y._M_data[__i.value];
+      });
+    else
+#endif
+      return __x._M_data >> __y._M_data;
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __bit_shift_left(_SimdWrapper<_Tp, _Np> __x, int __y)
+  {
+    // The behavior is undefined if the right operand is negative, or greater
+    // than or equal to the width of the promoted left operand.
+    if (__y < 0 || __y >= sizeof(std::declval<_Tp>() << __y) * CHAR_BIT)
+      __builtin_unreachable();
+    else if (__builtin_constant_p(__y) && __y >= sizeof(_Tp) * CHAR_BIT)
+      return {};
+    else
+      return __x._M_data << __y;
+  }
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __bit_shift_right(_SimdWrapper<_Tp, _Np> __x, int __y)
+  {
+    if (__y < 0 || __y >= sizeof(std::declval<_Tp>() >> __y) * CHAR_BIT)
+      __builtin_unreachable();
+    else if (__builtin_constant_p(__y) && __y >= sizeof(_Tp) * CHAR_BIT
+	     && is_unsigned_v<_Tp>)
+      return {};
+    else
+      return __x._M_data >> __y;
+  }
+
+  // compares {{{2
+  // __equal_to {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+  __equal_to(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    return __vector_bitcast<_Tp>(__x._M_data == __y._M_data);
+  }
+
+  // __not_equal_to {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+  __not_equal_to(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    return __vector_bitcast<_Tp>(__x._M_data != __y._M_data);
+  }
+
+  // __less {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+  __less(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    return __vector_bitcast<_Tp>(__x._M_data < __y._M_data);
+  }
+
+  // __less_equal {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+  __less_equal(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    return __vector_bitcast<_Tp>(__x._M_data <= __y._M_data);
+  }
+
+  // negation {{{2
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+  __negate(_SimdWrapper<_Tp, _Np> __x) noexcept
+  {
+    return __vector_bitcast<_Tp>(!__x._M_data);
+  }
+
+  // __min, __max, __minmax {{{2
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_NORMAL_MATH
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+    __min(_SimdWrapper<_Tp, _Np> __a, _SimdWrapper<_Tp, _Np> __b)
+  {
+    return __a._M_data < __b._M_data ? __a._M_data : __b._M_data;
+  }
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_NORMAL_MATH
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+    __max(_SimdWrapper<_Tp, _Np> __a, _SimdWrapper<_Tp, _Np> __b)
+  {
+    return __a._M_data > __b._M_data ? __a._M_data : __b._M_data;
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_NORMAL_MATH
+    _GLIBCXX_SIMD_INTRINSIC static constexpr std::pair<_SimdWrapper<_Tp, _Np>,
+						       _SimdWrapper<_Tp, _Np>>
+    __minmax(_SimdWrapper<_Tp, _Np> __a, _SimdWrapper<_Tp, _Np> __b)
+  {
+    return {__a._M_data < __b._M_data ? __a._M_data : __b._M_data,
+	    __a._M_data < __b._M_data ? __b._M_data : __a._M_data};
+  }
+
+  // reductions {{{2
+  template <size_t _Np, size_t... _Is, size_t... _Zeros, typename _Tp,
+	    typename _BinaryOperation>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp
+  __reduce_partial(std::index_sequence<_Is...>, std::index_sequence<_Zeros...>,
+		   simd<_Tp, _Abi> __x, _BinaryOperation&& __binary_op)
+  {
+    using _V = __vector_type_t<_Tp, _Np / 2>;
+    static_assert(sizeof(_V) <= sizeof(__x));
+    // _S_width is the size of the smallest native SIMD register that can
+    // store _Np/2 elements:
+    using _FullSimd = __deduced_simd<_Tp, _VectorTraits<_V>::_S_width>;
+    using _HalfSimd = __deduced_simd<_Tp, _Np / 2>;
+    const auto __xx = __as_vector(__x);
+    return _HalfSimd::abi_type::_SimdImpl::__reduce(
+      static_cast<_HalfSimd>(__as_vector(__binary_op(
+	static_cast<_FullSimd>(__intrin_bitcast<_V>(__xx)),
+	static_cast<_FullSimd>(__intrin_bitcast<_V>(
+	  __vector_permute<(_Np / 2 + _Is)..., (int(_Zeros * 0) - 1)...>(
+	    __xx)))))),
+      __binary_op);
+  }
+
+  template <typename _Tp, typename _BinaryOperation>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+  __reduce(simd<_Tp, _Abi> __x, _BinaryOperation&& __binary_op)
+  {
+    constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+    if constexpr (_Np == 1)
+      return __x[0];
+    else if constexpr (_Np == 2)
+      return __binary_op(simd<_Tp, simd_abi::scalar>(__x[0]),
+			 simd<_Tp, simd_abi::scalar>(__x[1]))[0];
+    else if constexpr (_Abi::_S_is_partial) //{{{
+      {
+	[[maybe_unused]] constexpr auto __full_size
+	  = _Abi::template _S_full_size<_Tp>;
+	if constexpr (_Np == 3)
+	  return __binary_op(__binary_op(simd<_Tp, simd_abi::scalar>(__x[0]),
+					 simd<_Tp, simd_abi::scalar>(__x[1])),
+			     simd<_Tp, simd_abi::scalar>(__x[2]))[0];
+	else if constexpr (std::is_same_v<__remove_cvref_t<_BinaryOperation>,
+					  std::plus<>>)
+	  {
+	    using _Ap = simd_abi::deduce_t<_Tp, __full_size>;
+	    return _Ap::_SimdImpl::__reduce(
+	      simd<_Tp, _Ap>(__private_init, _Abi::__masked(__as_vector(__x))),
+	      __binary_op);
+	  }
+	else if constexpr (std::is_same_v<__remove_cvref_t<_BinaryOperation>,
+					  std::multiplies<>>)
+	  {
+	    using _Ap = simd_abi::deduce_t<_Tp, __full_size>;
+	    using _TW = _SimdWrapper<_Tp, __full_size>;
+	    constexpr auto __implicit_mask_full
+	      = _Abi::template __implicit_mask<_Tp>().__as_full_vector();
+	    constexpr _TW __one = __vector_broadcast<__full_size>(_Tp(1));
+	    const _TW __x_full = __data(__x).__as_full_vector();
+	    const _TW __x_padded_with_ones
+	      = _Ap::_CommonImpl::_S_blend(__implicit_mask_full, __one,
+					   __x_full);
+	    return _Ap::_SimdImpl::__reduce(
+	      simd<_Tp, _Ap>(__private_init, __x_padded_with_ones),
+	      __binary_op);
+	  }
+	else if constexpr (_Np & 1)
+	  {
+	    using _Ap = simd_abi::deduce_t<_Tp, _Np - 1>;
+	    return __binary_op(
+	      simd<_Tp, simd_abi::scalar>(_Ap::_SimdImpl::__reduce(
+		simd<_Tp, _Ap>(__intrin_bitcast<__vector_type_t<_Tp, _Np - 1>>(
+		  __as_vector(__x))),
+		__binary_op)),
+	      simd<_Tp, simd_abi::scalar>(__x[_Np - 1]))[0];
+	  }
+	else
+	  return __reduce_partial<_Np>(
+	    std::make_index_sequence<_Np / 2>(),
+	    std::make_index_sequence<__full_size - _Np / 2>(), __x,
+	    __binary_op);
+      }                                   //}}}
+    else if constexpr (sizeof(__x) == 16) //{{{
+      {
+	if constexpr (_Np == 16)
+	  {
+	    const auto __y = __data(__x);
+	    __x = __binary_op(
+	      __make_simd<_Tp, _Np>(__vector_permute<0, 0, 1, 1, 2, 2, 3, 3, 4,
+						     4, 5, 5, 6, 6, 7, 7>(__y)),
+	      __make_simd<_Tp, _Np>(
+		__vector_permute<8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13, 14,
+				 14, 15, 15>(__y)));
+	  }
+	if constexpr (_Np >= 8)
+	  {
+	    const auto __y = __vector_bitcast<short>(__data(__x));
+	    __x
+	      = __binary_op(__make_simd<_Tp, _Np>(__vector_bitcast<_Tp>(
+			      __vector_permute<0, 0, 1, 1, 2, 2, 3, 3>(__y))),
+			    __make_simd<_Tp, _Np>(__vector_bitcast<_Tp>(
+			      __vector_permute<4, 4, 5, 5, 6, 6, 7, 7>(__y))));
+	  }
+	if constexpr (_Np >= 4)
+	  {
+	    using _Up
+	      = std::conditional_t<std::is_floating_point_v<_Tp>, float, int>;
+	    const auto __y = __vector_bitcast<_Up>(__data(__x));
+	    __x = __binary_op(__x, __make_simd<_Tp, _Np>(__vector_bitcast<_Tp>(
+				     __vector_permute<3, 2, 1, 0>(__y))));
+	  }
+	using _Up
+	  = std::conditional_t<std::is_floating_point_v<_Tp>, double, _LLong>;
+	const auto __y = __vector_bitcast<_Up>(__data(__x));
+	__x = __binary_op(__x, __make_simd<_Tp, _Np>(__vector_bitcast<_Tp>(
+				 __vector_permute<1, 1>(__y))));
+	return __x[0];
+      } //}}}
+    else
+      {
+	static_assert(sizeof(__x) > __min_vector_size<_Tp>);
+	static_assert((_Np & (_Np - 1)) == 0); // _Np must be a power of 2
+	using _Ap = simd_abi::deduce_t<_Tp, _Np / 2>;
+	using _V = std::experimental::simd<_Tp, _Ap>;
+	return _Ap::_SimdImpl::__reduce(
+	  __binary_op(_V(__private_init, __extract<0, 2>(__as_vector(__x))),
+		      _V(__private_init, __extract<1, 2>(__as_vector(__x)))),
+	  static_cast<_BinaryOperation&&>(__binary_op));
+      }
+  }
+
+  // math {{{2
+  // frexp, modf and copysign implemented in simd_math.h
+#define _GLIBCXX_SIMD_MATH_FALLBACK(__name)                                    \
+  template <typename _Tp, typename... _More>                                   \
+  static _Tp __##__name(const _Tp& __x, const _More&... __more)                \
+  {                                                                            \
+    return __generate_vector<_Tp>(                                             \
+      [&](auto __i) { return std::__name(__x[__i], __more[__i]...); });        \
+  }
+
+#define _GLIBCXX_SIMD_MATH_FALLBACK_MASKRET(__name)                            \
+  template <typename _Tp, typename... _More>                                   \
+  static                                                                       \
+    typename _Tp::mask_type __##__name(const _Tp& __x, const _More&... __more) \
+  {                                                                            \
+    return __generate_vector<_Tp>(                                             \
+      [&](auto __i) { return std::__name(__x[__i], __more[__i]...); });        \
+  }
+
+#define _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET(_RetTp, __name)                   \
+  template <typename _Tp, typename... _More>                                   \
+  static auto __##__name(const _Tp& __x, const _More&... __more)               \
+  {                                                                            \
+    return __fixed_size_storage_t<_RetTp,                                      \
+				  _VectorTraits<_Tp>::_S_partial_width>::      \
+      __generate([&](auto __meta) constexpr {                                  \
+	return __meta.__generator(                                             \
+	  [&](auto __i) {                                                      \
+	    return std::__name(__x[__meta._S_offset + __i],                    \
+			       __more[__meta._S_offset + __i]...);             \
+	  },                                                                   \
+	  static_cast<_RetTp*>(nullptr));                                      \
+      });                                                                      \
+  }
+
+  _GLIBCXX_SIMD_MATH_FALLBACK(acos)
+  _GLIBCXX_SIMD_MATH_FALLBACK(asin)
+  _GLIBCXX_SIMD_MATH_FALLBACK(atan)
+  _GLIBCXX_SIMD_MATH_FALLBACK(atan2)
+  _GLIBCXX_SIMD_MATH_FALLBACK(cos)
+  _GLIBCXX_SIMD_MATH_FALLBACK(sin)
+  _GLIBCXX_SIMD_MATH_FALLBACK(tan)
+  _GLIBCXX_SIMD_MATH_FALLBACK(acosh)
+  _GLIBCXX_SIMD_MATH_FALLBACK(asinh)
+  _GLIBCXX_SIMD_MATH_FALLBACK(atanh)
+  _GLIBCXX_SIMD_MATH_FALLBACK(cosh)
+  _GLIBCXX_SIMD_MATH_FALLBACK(sinh)
+  _GLIBCXX_SIMD_MATH_FALLBACK(tanh)
+  _GLIBCXX_SIMD_MATH_FALLBACK(exp)
+  _GLIBCXX_SIMD_MATH_FALLBACK(exp2)
+  _GLIBCXX_SIMD_MATH_FALLBACK(expm1)
+  _GLIBCXX_SIMD_MATH_FALLBACK(ldexp)
+  _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET(int, ilogb)
+  _GLIBCXX_SIMD_MATH_FALLBACK(log)
+  _GLIBCXX_SIMD_MATH_FALLBACK(log10)
+  _GLIBCXX_SIMD_MATH_FALLBACK(log1p)
+  _GLIBCXX_SIMD_MATH_FALLBACK(log2)
+  _GLIBCXX_SIMD_MATH_FALLBACK(logb)
+
+  // modf implemented in simd_math.h
+  _GLIBCXX_SIMD_MATH_FALLBACK(scalbn)
+  _GLIBCXX_SIMD_MATH_FALLBACK(scalbln)
+  _GLIBCXX_SIMD_MATH_FALLBACK(cbrt)
+  _GLIBCXX_SIMD_MATH_FALLBACK(fabs)
+  _GLIBCXX_SIMD_MATH_FALLBACK(pow)
+  _GLIBCXX_SIMD_MATH_FALLBACK(sqrt)
+  _GLIBCXX_SIMD_MATH_FALLBACK(erf)
+  _GLIBCXX_SIMD_MATH_FALLBACK(erfc)
+  _GLIBCXX_SIMD_MATH_FALLBACK(lgamma)
+  _GLIBCXX_SIMD_MATH_FALLBACK(tgamma)
+
+  _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET(long, lrint)
+  _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET(long long, llrint)
+
+  _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET(long, lround)
+  _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET(long long, llround)
+
+  _GLIBCXX_SIMD_MATH_FALLBACK(fmod)
+  _GLIBCXX_SIMD_MATH_FALLBACK(remainder)
+
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  static _Tp __remquo(const _Tp __x, const _Tp __y,
+		      __fixed_size_storage_t<int, _TVT::_S_partial_width>* __z)
+  {
+    return __generate_vector<_Tp>([&](auto __i) {
+      int __tmp;
+      auto __r = std::remquo(__x[__i], __y[__i], &__tmp);
+      __z->__set(__i, __tmp);
+      return __r;
+    });
+  }
+
+  // copysign in simd_math.h
+  _GLIBCXX_SIMD_MATH_FALLBACK(nextafter)
+  _GLIBCXX_SIMD_MATH_FALLBACK(fdim)
+  _GLIBCXX_SIMD_MATH_FALLBACK(fmax)
+  _GLIBCXX_SIMD_MATH_FALLBACK(fmin)
+  _GLIBCXX_SIMD_MATH_FALLBACK(fma)
+
+  template <typename _Tp, size_t _Np>
+  static constexpr auto __isgreater(_SimdWrapper<_Tp, _Np> __x,
+				    _SimdWrapper<_Tp, _Np> __y) noexcept
+  {
+    using _Ip = __int_for_sizeof_t<_Tp>;
+    const auto __xn = __vector_bitcast<_Ip>(__x);
+    const auto __yn = __vector_bitcast<_Ip>(__y);
+    const auto __xp = __xn < 0 ? -(__xn & numeric_limits<_Ip>::max()) : __xn;
+    const auto __yp = __yn < 0 ? -(__yn & numeric_limits<_Ip>::max()) : __yn;
+    return __and(__not(_SuperImpl::__isunordered(__x, __y)),
+		 __vector_bitcast<_Tp>(__xp > __yp));
+  }
+  template <typename _Tp, size_t _Np>
+  static constexpr auto __isgreaterequal(_SimdWrapper<_Tp, _Np> __x,
+					 _SimdWrapper<_Tp, _Np> __y) noexcept
+  {
+    using _Ip = __int_for_sizeof_t<_Tp>;
+    const auto __xn = __vector_bitcast<_Ip>(__x);
+    const auto __yn = __vector_bitcast<_Ip>(__y);
+    const auto __xp = __xn < 0 ? -(__xn & numeric_limits<_Ip>::max()) : __xn;
+    const auto __yp = __yn < 0 ? -(__yn & numeric_limits<_Ip>::max()) : __yn;
+    return __and(__not(_SuperImpl::__isunordered(__x, __y)),
+		 __vector_bitcast<_Tp>(__xp >= __yp));
+  }
+  template <typename _Tp, size_t _Np>
+  static constexpr auto __isless(_SimdWrapper<_Tp, _Np> __x,
+				 _SimdWrapper<_Tp, _Np> __y) noexcept
+  {
+    using _Ip = __int_for_sizeof_t<_Tp>;
+    const auto __xn = __vector_bitcast<_Ip>(__x);
+    const auto __yn = __vector_bitcast<_Ip>(__y);
+    const auto __xp = __xn < 0 ? -(__xn & numeric_limits<_Ip>::max()) : __xn;
+    const auto __yp = __yn < 0 ? -(__yn & numeric_limits<_Ip>::max()) : __yn;
+    return __and(__not(_SuperImpl::__isunordered(__x, __y)),
+		 __vector_bitcast<_Tp>(__xp < __yp));
+  }
+  template <typename _Tp, size_t _Np>
+  static constexpr auto __islessequal(_SimdWrapper<_Tp, _Np> __x,
+				      _SimdWrapper<_Tp, _Np> __y) noexcept
+  {
+    using _Ip = __int_for_sizeof_t<_Tp>;
+    const auto __xn = __vector_bitcast<_Ip>(__x);
+    const auto __yn = __vector_bitcast<_Ip>(__y);
+    const auto __xp = __xn < 0 ? -(__xn & numeric_limits<_Ip>::max()) : __xn;
+    const auto __yp = __yn < 0 ? -(__yn & numeric_limits<_Ip>::max()) : __yn;
+    return __and(__not(_SuperImpl::__isunordered(__x, __y)),
+		 __vector_bitcast<_Tp>(__xp <= __yp));
+  }
+  template <typename _Tp, size_t _Np>
+  static constexpr auto __islessgreater(_SimdWrapper<_Tp, _Np> __x,
+					_SimdWrapper<_Tp, _Np> __y) noexcept
+  {
+    return __and(__not(_SuperImpl::__isunordered(__x, __y)),
+		 _SuperImpl::__not_equal_to(__x, __y));
+  }
+
+#undef _GLIBCXX_SIMD_MATH_FALLBACK
+#undef _GLIBCXX_SIMD_MATH_FALLBACK_MASKRET
+#undef _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET
+  // __abs {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+  __abs(_SimdWrapper<_Tp, _Np> __x) noexcept
+  {
+    // if (__builtin_is_constant_evaluated())
+    //  {
+    //    return __x._M_data < 0 ? -__x._M_data : __x._M_data;
+    //  }
+    if constexpr (std::is_floating_point_v<_Tp>)
+      // `v < 0 ? -v : v` cannot compile to the efficient implementation of
+      // masking the signbit off because it must consider v == -0
+
+      // ~(-0.) & v would be easy, but breaks with -fno-signed-zeros
+      return __and(_S_absmask<__vector_type_t<_Tp, _Np>>, __x._M_data);
+    else
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR91533
+      if constexpr (sizeof(__x) < 16 && std::is_signed_v<_Tp>)
+      {
+	if constexpr (sizeof(_Tp) == 4)
+	  return __auto_bitcast(_mm_abs_epi32(__to_intrin(__x)));
+	else if constexpr (sizeof(_Tp) == 2)
+	  return __auto_bitcast(_mm_abs_epi16(__to_intrin(__x)));
+	else
+	  return __auto_bitcast(_mm_abs_epi8(__to_intrin(__x)));
+      }
+    else
+#endif //_GLIBCXX_SIMD_WORKAROUND_PR91533
+      return __x._M_data < 0 ? -__x._M_data : __x._M_data;
+  }
+
+  // __nearbyint {{{3
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __nearbyint(_Tp __x_) noexcept
+  {
+    using value_type = typename _TVT::value_type;
+    using _V = typename _TVT::type;
+    const _V __x = __x_;
+    const _V __absx = __and(__x, _S_absmask<_V>);
+    static_assert(CHAR_BIT * sizeof(1ull)
+		  >= std::numeric_limits<value_type>::digits);
+    constexpr _V __shifter_abs
+      = _V() + (1ull << (std::numeric_limits<value_type>::digits - 1));
+    const _V __shifter = __or(__and(_S_signmask<_V>, __x), __shifter_abs);
+    _V __shifted = __x + __shifter;
+    // How can we stop -fassociative-math from breaking this pattern?
+    // asm("" : "+X"(__shifted));
+    __shifted -= __shifter;
+    return __absx < __shifter_abs ? __shifted : __x;
+  }
+
+  // __rint {{{3
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __rint(_Tp __x) noexcept
+  {
+    return _SuperImpl::__nearbyint(__x);
+  }
+
+  // __trunc {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+  __trunc(_SimdWrapper<_Tp, _Np> __x)
+  {
+    using _V = __vector_type_t<_Tp, _Np>;
+    const _V __absx = __and(__x._M_data, _S_absmask<_V>);
+    static_assert(CHAR_BIT * sizeof(1ull) >= std::numeric_limits<_Tp>::digits);
+    constexpr _Tp __shifter = 1ull << (std::numeric_limits<_Tp>::digits - 1);
+    _V __truncated = (__absx + __shifter) - __shifter;
+    __truncated -= __truncated > __absx ? _V() + 1 : _V();
+    return __absx < __shifter ? __or(__xor(__absx, __x._M_data), __truncated)
+			      : __x._M_data;
+  }
+
+  // __round {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+  __round(_SimdWrapper<_Tp, _Np> __x)
+  {
+    using _V = __vector_type_t<_Tp, _Np>;
+    const _V __absx = __and(__x._M_data, _S_absmask<_V>);
+    static_assert(CHAR_BIT * sizeof(1ull) >= std::numeric_limits<_Tp>::digits);
+    constexpr _Tp __shifter = 1ull << (std::numeric_limits<_Tp>::digits - 1);
+    _V __truncated = (__absx + __shifter) - __shifter;
+    __truncated -= __truncated > __absx ? _V() + 1 : _V();
+    const _V __rounded
+      = __or(__xor(__absx, __x._M_data),
+	     __truncated + (__absx - __truncated >= _Tp(.5) ? _V() + 1 : _V()));
+    return __absx < __shifter ? __rounded : __x._M_data;
+  }
+
+  // __floor {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+  __floor(_SimdWrapper<_Tp, _Np> __x)
+  {
+    const auto __y = _SuperImpl::__trunc(__x)._M_data;
+    const auto __negative_input
+      = __vector_bitcast<_Tp>(__x._M_data < __vector_broadcast<_Np, _Tp>(0));
+    const auto __mask
+      = __andnot(__vector_bitcast<_Tp>(__y == __x._M_data), __negative_input);
+    return __or(__andnot(__mask, __y),
+		__and(__mask, __y - __vector_broadcast<_Np, _Tp>(1)));
+  }
+
+  // __ceil {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+  __ceil(_SimdWrapper<_Tp, _Np> __x)
+  {
+    const auto __y = _SuperImpl::__trunc(__x)._M_data;
+    const auto __negative_input
+      = __vector_bitcast<_Tp>(__x._M_data < __vector_broadcast<_Np, _Tp>(0));
+    const auto __inv_mask
+      = __or(__vector_bitcast<_Tp>(__y == __x._M_data), __negative_input);
+    return __or(__and(__inv_mask, __y),
+		__andnot(__inv_mask, __y + __vector_broadcast<_Np, _Tp>(1)));
+  }
+
+  // __isnan {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+  __isnan(_SimdWrapper<_Tp, _Np> __x)
+  {
+#if __FINITE_MATH_ONLY__
+    [](auto&&) {}(__x);
+    return {}; // false
+#elif !defined __SUPPORT_SNAN__
+    return __vector_bitcast<_Tp>(~(__x._M_data == __x._M_data));
+#elif defined __STDC_IEC_559__
+    using _Up = make_unsigned_t<__int_for_sizeof_t<_Tp>>;
+    constexpr auto __max = __vector_bitcast<_Up>(
+      __vector_broadcast<_Np>(numeric_limits<_Tp>::infinity()));
+    auto __bits = __vector_bitcast<_Up>(__x);
+    __bits &= __vector_bitcast<_Up>(_S_absmask<__vector_type_t<_Tp, _Np>>);
+    return __vector_bitcast<_Tp>(__bits > __max);
+#else
+#error "Not implemented: how to support SNaN but non-IEC559 floating-point?"
+#endif
+  }
+
+  // __isfinite {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+  __isfinite(_SimdWrapper<_Tp, _Np> __x)
+  {
+#if __FINITE_MATH_ONLY__
+    [](auto&&) {}(__x);
+    return __vector_bitcast<_Np>(_Tp()) == __vector_bitcast<_Np>(_Tp());
+#else
+    // if all exponent bits are set, __x is either inf or NaN
+    using _I = __int_for_sizeof_t<_Tp>;
+    constexpr auto __inf = __vector_bitcast<_I>(
+      __vector_broadcast<_Np>(std::numeric_limits<_Tp>::infinity()));
+    return __vector_bitcast<_Tp>(__inf > (__vector_bitcast<_I>(__x) & __inf));
+#endif
+  }
+
+  // __isunordered {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+  __isunordered(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    return __or(__isnan(__x), __isnan(__y));
+  }
+
+  // __signbit {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+  __signbit(_SimdWrapper<_Tp, _Np> __x)
+  {
+    using _I = __int_for_sizeof_t<_Tp>;
+    return __vector_bitcast<_Tp>(__vector_bitcast<_I>(__x) < 0);
+    // Arithmetic right shift (SRA) would also work (instead of compare), but
+    // 64-bit SRA isn't available on x86 before AVX512. And in general,
+    // compares are more likely to be efficient than SRA.
+  }
+
+  // __isinf {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+  __isinf(_SimdWrapper<_Tp, _Np> __x)
+  {
+#if __FINITE_MATH_ONLY__
+    [](auto&&) {}(__x);
+    return {}; // false
+#else
+    return _SuperImpl::template __equal_to<_Tp, _Np>(
+      _SuperImpl::__abs(__x),
+      __vector_broadcast<_Np>(std::numeric_limits<_Tp>::infinity()));
+    // alternative:
+    // compare to inf using the corresponding integer type
+    /*
+       return
+       __vector_bitcast<_Tp>(__vector_bitcast<__int_for_sizeof_t<_Tp>>(__abs(__x)._M_data)
+       ==
+       __vector_bitcast<__int_for_sizeof_t<_Tp>>(__vector_broadcast<_Np>(
+       std::numeric_limits<_Tp>::infinity())));
+       */
+#endif
+  }
+
+  // __isnormal {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+  __isnormal(_SimdWrapper<_Tp, _Np> __x)
+  {
+    using _I = __int_for_sizeof_t<_Tp>;
+    const auto __absn = __vector_bitcast<_I>(_SuperImpl::__abs(__x));
+    const auto __minn = __vector_bitcast<_I>(
+      __vector_broadcast<_Np>(std::numeric_limits<_Tp>::min()));
+#if __FINITE_MATH_ONLY__
+    return __auto_bitcast(__absn >= __minn);
+#else
+    const auto __infn = __vector_bitcast<_I>(
+      __vector_broadcast<_Np>(std::numeric_limits<_Tp>::infinity()));
+    return __auto_bitcast(__absn >= __minn && __absn < __infn);
+#endif
+  }
+
+  // __fpclassify {{{3
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static __fixed_size_storage_t<int, _Np>
+  __fpclassify(_SimdWrapper<_Tp, _Np> __x)
+  {
+    using _I = __int_for_sizeof_t<_Tp>;
+    const auto __xi = __to_intrin(__abs(__x));
+    const auto __xn = __vector_bitcast<_I>(__xi);
+    constexpr size_t _NI = sizeof(__xn) / sizeof(_I);
+
+    constexpr auto __fp_normal = __vector_broadcast<_NI, _I>(FP_NORMAL);
+    constexpr auto __fp_nan = __vector_broadcast<_NI, _I>(FP_NAN);
+    constexpr auto __fp_infinite = __vector_broadcast<_NI, _I>(FP_INFINITE);
+    constexpr auto __fp_subnormal = __vector_broadcast<_NI, _I>(FP_SUBNORMAL);
+    constexpr auto __fp_zero = __vector_broadcast<_NI, _I>(FP_ZERO);
+
+    __vector_type_t<_I, _NI> __tmp;
+    if constexpr (sizeof(_Tp) == 4)
+      __tmp = __xn < 0x0080'0000
+		? (__xn == 0 ? __fp_zero : __fp_subnormal)
+		: (__xn < 0x7f80'0000
+		     ? __fp_normal
+		     : (__xn == 0x7f80'0000 ? __fp_infinite : __fp_nan));
+    else if constexpr (sizeof(_Tp) == 8)
+      __tmp = __xn < 0x0010'0000'0000'0000LL
+		? (__xn == 0 ? __fp_zero : __fp_subnormal)
+		: (__xn < 0x7ff0'0000'0000'0000LL
+		     ? __fp_normal
+		     : (__xn == 0x7ff0'0000'0000'0000LL ? __fp_infinite
+							: __fp_nan));
+    else
+      __assert_unreachable<_Tp>();
+
+    if constexpr (sizeof(_I) == sizeof(int))
+      {
+	using _FixedInt = __fixed_size_storage_t<int, _Np>;
+	const auto __as_int = __vector_bitcast<int, _Np>(__tmp);
+	if constexpr (_FixedInt::_S_tuple_size == 1)
+	  return {__as_int};
+	else if constexpr (_FixedInt::_S_tuple_size == 2
+			   && std::is_same_v<
+			     typename _FixedInt::_SecondType::_FirstAbi,
+			     simd_abi::scalar>)
+	  return {__extract<0, 2>(__as_int), __as_int[_Np - 1]};
+	else if constexpr (_FixedInt::_S_tuple_size == 2)
+	  return {__extract<0, 2>(__as_int),
+		  __auto_bitcast(__extract<1, 2>(__as_int))};
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (_Np == 2 && sizeof(_I) == 8
+		       && __fixed_size_storage_t<int, _Np>::_S_tuple_size == 2)
+      {
+	const auto __aslong = __vector_bitcast<_LLong>(__tmp);
+	return {int(__aslong[0]), {int(__aslong[1])}};
+      }
+#if _GLIBCXX_SIMD_X86INTRIN
+    else if constexpr (sizeof(_Tp) == 8 && sizeof(__tmp) == 32
+		       && __fixed_size_storage_t<int, _Np>::_S_tuple_size == 1)
+      return {_mm_packs_epi32(__to_intrin(__lo128(__tmp)),
+			      __to_intrin(__hi128(__tmp)))};
+    else if constexpr (sizeof(_Tp) == 8 && sizeof(__tmp) == 64
+		       && __fixed_size_storage_t<int, _Np>::_S_tuple_size == 1)
+      return {_mm512_cvtepi64_epi32(__to_intrin(__tmp))};
+#endif // _GLIBCXX_SIMD_X86INTRIN
+    else if constexpr (__fixed_size_storage_t<int, _Np>::_S_tuple_size == 1)
+      return {__call_with_subscripts<_Np>(__vector_bitcast<_LLong>(__tmp),
+					  [](auto... __l) {
+					    return __make_wrapper<int>(__l...);
+					  })};
+    else
+      __assert_unreachable<_Tp>();
+  }
+
+  // __increment & __decrement{{{2
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static void __increment(_SimdWrapper<_Tp, _Np>& __x)
+  {
+    __x = __x._M_data + 1;
+  }
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static void __decrement(_SimdWrapper<_Tp, _Np>& __x)
+  {
+    __x = __x._M_data - 1;
+  }
+
+  // smart_reference access {{{2
+  template <typename _Tp, size_t _Np, typename _Up>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static void
+  __set(_SimdWrapper<_Tp, _Np>& __v, int __i, _Up&& __x) noexcept
+  {
+    __v.__set(__i, static_cast<_Up&&>(__x));
+  }
+
+  // __masked_assign{{{2
+  template <typename _Tp, typename _K, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_assign(_SimdWrapper<_K, _Np> __k, _SimdWrapper<_Tp, _Np>& __lhs,
+		  __id<_SimdWrapper<_Tp, _Np>> __rhs)
+  {
+    __lhs = _CommonImpl::_S_blend(__k, __lhs, __rhs);
+  }
+
+  template <typename _Tp, typename _K, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_assign(_SimdWrapper<_K, _Np> __k, _SimdWrapper<_Tp, _Np>& __lhs,
+		  __id<_Tp> __rhs)
+  {
+    if (__builtin_constant_p(__rhs) && __rhs == 0 && std::is_same_v<_K, _Tp>)
+      {
+	if constexpr (!is_same_v<bool, _K>)
+	  // the __andnot optimization only makes sense if __k._M_data is a
+	  // vector register
+	  __lhs._M_data = __andnot(__k._M_data, __lhs._M_data);
+	else
+	  // for AVX512/__mmask, a _mm512_maskz_mov is best
+	  __lhs = _CommonImpl::_S_blend(__k, __lhs, _SimdWrapper<_Tp, _Np>());
+      }
+    else
+      __lhs = _CommonImpl::_S_blend(__k, __lhs,
+				    _SimdWrapper<_Tp, _Np>(
+				      __vector_broadcast<_Np>(__rhs)));
+  }
+
+  // __masked_cassign {{{2
+  template <typename _Op, typename _Tp, typename _K, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_cassign(const _SimdWrapper<_K, _Np> __k,
+		   _SimdWrapper<_Tp, _Np>& __lhs,
+		   const __id<_SimdWrapper<_Tp, _Np>> __rhs, _Op __op)
+  {
+    __lhs = _CommonImpl::_S_blend(__k, __lhs, __op(_SuperImpl{}, __lhs, __rhs));
+  }
+
+  template <typename _Op, typename _Tp, typename _K, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_cassign(const _SimdWrapper<_K, _Np> __k,
+		   _SimdWrapper<_Tp, _Np>& __lhs, const __id<_Tp> __rhs,
+		   _Op __op)
+  {
+    __lhs = _CommonImpl::_S_blend(__k, __lhs,
+				  __op(_SuperImpl{}, __lhs,
+				       _SimdWrapper<_Tp, _Np>(
+					 __vector_broadcast<_Np>(__rhs))));
+  }
+
+  // __masked_unary {{{2
+  template <template <typename> class _Op, typename _Tp, typename _K,
+	    size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+  __masked_unary(const _SimdWrapper<_K, _Np> __k,
+		 const _SimdWrapper<_Tp, _Np> __v)
+  {
+    auto __vv = __make_simd(__v);
+    _Op<decltype(__vv)> __op;
+    return _CommonImpl::_S_blend(__k, __v, __data(__op(__vv)));
+  }
+
+  //}}}2
+};
+
+// _MaskImplBuiltinMixin {{{1
+struct _MaskImplBuiltinMixin
+{
+  template <typename _Tp> using _TypeTag = _Tp*;
+
+  // __to_maskvector {{{
+  template <typename _Up, size_t _ToN = 1>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Up, _ToN>
+  __to_maskvector(bool __x)
+  {
+    using _I = __int_for_sizeof_t<_Up>;
+    return __vector_bitcast<_Up>(__x ? __vector_type_t<_I, _ToN>{~_I()}
+				     : __vector_type_t<_I, _ToN>{});
+  }
+
+  template <typename _Up, size_t _UpN = 0, size_t _Np, bool _Sanitized,
+	    size_t _ToN = _UpN == 0 ? _Np : _UpN>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Up, _ToN>
+  __to_maskvector(_BitMask<_Np, _Sanitized> __x)
+  {
+    using _I = __int_for_sizeof_t<_Up>;
+    return __vector_bitcast<_Up>(
+      __generate_vector<__vector_type_t<_I, _ToN>>([&](auto __i) constexpr {
+	if constexpr (__i < _Np)
+	  return __x[__i] ? ~_I() : _I();
+	else
+	  return _I();
+      }));
+  }
+
+  template <typename _Up, size_t _UpN = 0, typename _Tp, size_t _Np,
+	    size_t _ToN = _UpN == 0 ? _Np : _UpN>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Up, _ToN>
+  __to_maskvector(_SimdWrapper<_Tp, _Np> __x)
+  {
+    using _TW = _SimdWrapper<_Tp, _Np>;
+    using _UW = _SimdWrapper<_Up, _ToN>;
+    if constexpr (sizeof(_Up) == sizeof(_Tp) && sizeof(_TW) == sizeof(_UW))
+      return __wrapper_bitcast<_Up, _ToN>(__x);
+    else if constexpr (is_same_v<_Tp, bool>) // bits -> vector
+      return __to_maskvector<_Up, _ToN>(std::bitset<_Np>(__x._M_data));
+    else
+      { // vector -> vector
+	/*
+	[[maybe_unused]] const auto __y = __vector_bitcast<_Up>(__x._M_data);
+	if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 4 && sizeof(__y) == 16)
+	  return __vector_permute<1, 3, -1, -1>(__y);
+	else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 2
+			   && sizeof(__y) == 16)
+	  return __vector_permute<1, 3, 5, 7, -1, -1, -1, -1>(__y);
+	else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 2
+			   && sizeof(__y) == 16)
+	  return __vector_permute<3, 7, -1, -1, -1, -1, -1, -1>(__y);
+	else if constexpr (sizeof(_Tp) == 2 && sizeof(_Up) == 1
+			   && sizeof(__y) == 16)
+	  return __vector_permute<1, 3, 5, 7, 9, 11, 13, 15, -1, -1, -1, -1, -1,
+				  -1, -1, -1>(__y);
+	else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 1
+			   && sizeof(__y) == 16)
+	  return __vector_permute<3, 7, 11, 15, -1, -1, -1, -1, -1, -1, -1, -1,
+				  -1, -1, -1, -1>(__y);
+	else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 1
+			   && sizeof(__y) == 16)
+	  return __vector_permute<7, 15, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+				  -1, -1, -1, -1>(__y);
+	else
+	*/
+	{
+	  using _I = __int_for_sizeof_t<_Up>;
+	  const auto __y
+	    = __vector_bitcast<__int_for_sizeof_t<_Tp>>(__x._M_data);
+	  return __vector_bitcast<_Up>(
+	    __generate_vector<__vector_type_t<_I, _ToN>>([&](
+	      auto __i) constexpr {
+	      if constexpr (__i < _Np)
+		return _I(__y[__i.value]);
+	      else
+		return _I();
+	    }));
+	}
+      }
+  }
+
+  // }}}
+  // __to_bits {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SanitizedBitMask<_Np>
+  __to_bits(_SimdWrapper<_Tp, _Np> __x)
+  {
+    static_assert(!is_same_v<_Tp, bool>);
+    static_assert(_Np <= CHAR_BIT * sizeof(_ULLong));
+    using _Up = make_unsigned_t<__int_for_sizeof_t<_Tp>>;
+    const auto __bools
+      = __vector_bitcast<_Up>(__x) >> (sizeof(_Up) * CHAR_BIT - 1);
+    _ULLong __r = 0;
+    __execute_n_times<_Np>(
+      [&](auto __i) { __r |= _ULLong(__bools[__i.value]) << __i; });
+    return __r;
+  }
+
+  // }}}
+};
+
+// _MaskImplBuiltin {{{1
+template <typename _Abi> struct _MaskImplBuiltin : _MaskImplBuiltinMixin
+{
+  using _MaskImplBuiltinMixin::__to_bits;
+  using _MaskImplBuiltinMixin::__to_maskvector;
+
+  // member types {{{
+  template <typename _Tp>
+  using _SimdMember = typename _Abi::template __traits<_Tp>::_SimdMember;
+  template <typename _Tp>
+  using _MaskMember = typename _Abi::template __traits<_Tp>::_MaskMember;
+  using _SuperImpl = typename _Abi::_MaskImpl;
+  using _CommonImpl = typename _Abi::_CommonImpl;
+  template <typename _Tp> static constexpr size_t size = simd_size_v<_Tp, _Abi>;
+
+  // }}}
+  // __broadcast {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+  __broadcast(bool __x)
+  {
+    return __x ? _Abi::template __implicit_mask<_Tp>() : _MaskMember<_Tp>();
+  }
+
+  // }}}
+  // __load {{{
+  template <typename _Tp, typename _Flags>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+  __load(const bool* __mem)
+  {
+    using _I = __int_for_sizeof_t<_Tp>;
+    if constexpr (sizeof(_Tp) == sizeof(bool))
+      {
+	const auto __bools
+	  = _CommonImpl::template _S_load<_I, size<_Tp>>(__mem, _Flags());
+	// bool is {0, 1}, everything else is UB
+	return __vector_bitcast<_Tp>(__bools > 0);
+      }
+    else
+      return __vector_bitcast<_Tp>(__generate_vector<_I, size<_Tp>>([&](
+	auto __i) constexpr { return __mem[__i] ? ~_I() : _I(); }));
+  }
+
+  // }}}
+  // __convert {{{
+  template <typename _Tp, size_t _Np, bool _Sanitized>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr auto
+  __convert(_BitMask<_Np, _Sanitized> __x)
+  {
+    if constexpr (__is_builtin_bitmask_abi<_Abi>())
+      return _SimdWrapper<bool, simd_size_v<_Tp, _Abi>>(__x._M_to_bits());
+    else
+      return _SuperImpl::template __to_maskvector<_Tp, size<_Tp>>(
+	__x._M_sanitized());
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr auto
+  __convert(_SimdWrapper<bool, _Np> __x)
+  {
+    if constexpr (__is_builtin_bitmask_abi<_Abi>())
+      return _SimdWrapper<bool, simd_size_v<_Tp, _Abi>>(__x._M_data);
+    else
+      return _SuperImpl::template __to_maskvector<_Tp, size<_Tp>>(
+	_BitMask<_Np>(__x._M_data)._M_sanitized());
+  }
+
+  template <typename _Tp, typename _Up, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr auto
+  __convert(_SimdWrapper<_Up, _Np> __x)
+  {
+    if constexpr (__is_builtin_bitmask_abi<_Abi>())
+      return _SimdWrapper<bool, simd_size_v<_Tp, _Abi>>(
+	_SuperImpl::__to_bits(__x));
+    else
+      return _SuperImpl::template __to_maskvector<_Tp, size<_Tp>>(__x);
+  }
+
+  template <typename _Tp, typename _Up, typename _UAbi>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr auto
+  __convert(simd_mask<_Up, _UAbi> __x)
+  {
+    if constexpr (__is_builtin_bitmask_abi<_Abi>())
+      {
+	using _R = _SimdWrapper<bool, simd_size_v<_Tp, _Abi>>;
+	if constexpr (__is_builtin_bitmask_abi<_UAbi>()) // bits -> bits
+	  return _R(__data(__x));
+	else if constexpr (__is_scalar_abi<_UAbi>()) // bool -> bits
+	  return _R(__data(__x));
+	else if constexpr (__is_fixed_size_abi_v<_UAbi>) // bitset -> bits
+	  return _R(__data(__x)._M_to_bits());
+	else // vector -> bits
+	  return _R(_UAbi::_MaskImpl::__to_bits(__data(__x))._M_to_bits());
+      }
+    else
+      return _SuperImpl::template __to_maskvector<_Tp, size<_Tp>>(__data(__x));
+  }
+
+  // }}}
+  // __masked_load {{{2
+  template <typename _Tp, size_t _Np, typename _Fp>
+  static inline _SimdWrapper<_Tp, _Np>
+  __masked_load(_SimdWrapper<_Tp, _Np> __merge, _SimdWrapper<_Tp, _Np> __mask,
+		const bool* __mem, _Fp) noexcept
+  {
+    // AVX(2) has 32/64 bit maskload, but nothing at 8 bit granularity
+    auto __tmp = __wrapper_bitcast<__int_for_sizeof_t<_Tp>>(__merge);
+    _BitOps::__bit_iteration(_SuperImpl::__to_bits(__mask),
+			     [&](auto __i) { __tmp.__set(__i, -__mem[__i]); });
+    __merge = __wrapper_bitcast<_Tp>(__tmp);
+    return __merge;
+  }
+
+  // __store {{{2
+  template <typename _Tp, size_t _Np, typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static void __store(_SimdWrapper<_Tp, _Np> __v,
+					      bool* __mem, _Fp) noexcept
+  {
+    __execute_n_times<_Np>([&](auto __i) constexpr { __mem[__i] = __v[__i]; });
+  }
+
+  // __masked_store {{{2
+  template <typename _Tp, size_t _Np, typename _Fp>
+  static inline void __masked_store(const _SimdWrapper<_Tp, _Np> __v,
+				    bool* __mem, _Fp,
+				    const _SimdWrapper<_Tp, _Np> __k) noexcept
+  {
+    _BitOps::__bit_iteration(
+      _SuperImpl::__to_bits(__k), [&](auto __i) constexpr {
+	__mem[__i] = __v[__i];
+      });
+  }
+
+  // __from_bitmask{{{2
+  template <size_t _Np, typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+  __from_bitmask(_SanitizedBitMask<_Np> __bits, _TypeTag<_Tp>)
+  {
+    return _SuperImpl::template __to_maskvector<_Tp, size<_Tp>>(__bits);
+  }
+
+  // logical and bitwise operators {{{2
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __logical_and(const _SimdWrapper<_Tp, _Np>& __x,
+		const _SimdWrapper<_Tp, _Np>& __y)
+  {
+    return __and(__x._M_data, __y._M_data);
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __logical_or(const _SimdWrapper<_Tp, _Np>& __x,
+	       const _SimdWrapper<_Tp, _Np>& __y)
+  {
+    return __or(__x._M_data, __y._M_data);
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __bit_not(const _SimdWrapper<_Tp, _Np>& __x)
+  {
+    if constexpr (_Abi::_S_is_partial)
+      return __andnot(__x._M_data, _Abi::template __implicit_mask<_Tp>());
+    else
+      return __not(__x._M_data);
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __bit_and(const _SimdWrapper<_Tp, _Np>& __x,
+	    const _SimdWrapper<_Tp, _Np>& __y)
+  {
+    return __and(__x._M_data, __y._M_data);
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __bit_or(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
+  {
+    return __or(__x._M_data, __y._M_data);
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __bit_xor(const _SimdWrapper<_Tp, _Np>& __x,
+	    const _SimdWrapper<_Tp, _Np>& __y)
+  {
+    return __xor(__x._M_data, __y._M_data);
+  }
+
+  // smart_reference access {{{2
+  template <typename _Tp, size_t _Np>
+  static constexpr void __set(_SimdWrapper<_Tp, _Np>& __k, int __i,
+			      bool __x) noexcept
+  {
+    if constexpr (std::is_same_v<_Tp, bool>)
+      __k.__set(__i, __x);
+    else
+      {
+	using _Ip = __int_for_sizeof_t<_Tp>;
+	auto __ki = __vector_bitcast<_Ip>(__k._M_data);
+	if (__builtin_is_constant_evaluated())
+	  {
+	    __k = __vector_bitcast<_Tp>(
+	      __generate_from_n_evaluations<_Np, decltype(__ki)>([&](auto __j) {
+		if (__i == __j)
+		  return _Ip(-__x);
+		else
+		  return __ki[+__j];
+	      }));
+	  }
+	else
+	  {
+	    __ki[__i] = _Ip(-__x);
+	    __k = __vector_bitcast<_Tp>(__ki);
+	  }
+      }
+  }
+
+  // __masked_assign{{{2
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_assign(_SimdWrapper<_Tp, _Np> __k, _SimdWrapper<_Tp, _Np>& __lhs,
+		  __id<_SimdWrapper<_Tp, _Np>> __rhs)
+  {
+    __lhs = _CommonImpl::_S_blend(__k, __lhs, __rhs);
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_assign(_SimdWrapper<_Tp, _Np> __k, _SimdWrapper<_Tp, _Np>& __lhs,
+		  bool __rhs)
+  {
+    if (__builtin_constant_p(__rhs))
+      {
+	if (__rhs == false)
+	  {
+	    __lhs = __andnot(__k._M_data, __lhs._M_data);
+	  }
+	else
+	  {
+	    __lhs = __or(__k._M_data, __lhs._M_data);
+	  }
+	return;
+      }
+    __lhs
+      = _CommonImpl::_S_blend(__k, __lhs, __data(simd_mask<_Tp, _Abi>(__rhs)));
+  }
+
+  //}}}2
+  // __all_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __all_of(simd_mask<_Tp, _Abi> __k)
+  {
+    return __call_with_subscripts(
+      __vector_bitcast<__int_for_sizeof_t<_Tp>>(__data(__k)),
+      make_index_sequence<size<_Tp>>(),
+      [](const auto... __ent) constexpr { return (... && !(__ent == 0)); });
+  }
+
+  // }}}
+  // __any_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __any_of(simd_mask<_Tp, _Abi> __k)
+  {
+    return __call_with_subscripts(
+      __vector_bitcast<__int_for_sizeof_t<_Tp>>(__data(__k)),
+      make_index_sequence<size<_Tp>>(),
+      [](const auto... __ent) constexpr { return (... || !(__ent == 0)); });
+  }
+
+  // }}}
+  // __none_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __none_of(simd_mask<_Tp, _Abi> __k)
+  {
+    return __call_with_subscripts(
+      __vector_bitcast<__int_for_sizeof_t<_Tp>>(__data(__k)),
+      make_index_sequence<size<_Tp>>(),
+      [](const auto... __ent) constexpr { return (... && (__ent == 0)); });
+  }
+
+  // }}}
+  // __some_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __some_of(simd_mask<_Tp, _Abi> __k)
+  {
+    const int __n_true = __popcount(__k);
+    return __n_true > 0 && __n_true < int(size<_Tp>);
+  }
+
+  // }}}
+  // __popcount {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static int __popcount(simd_mask<_Tp, _Abi> __k)
+  {
+    using _I = __int_for_sizeof_t<_Tp>;
+    if constexpr (std::is_default_constructible_v<simd<_I, _Abi>>)
+      return -reduce(
+	simd<_I, _Abi>(__private_init, __wrapper_bitcast<_I>(__data(__k))));
+    else
+      return -reduce(__bit_cast<rebind_simd_t<_I, simd<_Tp, _Abi>>>(
+	simd<_Tp, _Abi>(__private_init, __data(__k))));
+  }
+
+  // }}}
+  // __find_first_set {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static int __find_first_set(simd_mask<_Tp, _Abi> __k)
+  {
+    return _BitOps::__firstbit(_SuperImpl::__to_bits(__data(__k))._M_to_bits());
+  }
+
+  // }}}
+  // __find_last_set {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static int __find_last_set(simd_mask<_Tp, _Abi> __k)
+  {
+    return _BitOps::__lastbit(_SuperImpl::__to_bits(__data(__k))._M_to_bits());
+  }
+
+  // }}}
+};
+
+//}}}1
+_GLIBCXX_SIMD_END_NAMESPACE
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_ABIS_H_
+
+// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
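
Note on the mask representation used by _MaskImplBuiltin above: a vector mask stores each true lane as an all-ones integer (-1) and each false lane as 0. __to_bits shifts the top bit of every lane down and ORs it into an integer, and __popcount negates the sum of the lanes. The following is only a scalar sketch of those two ideas under that assumption. LaneMask, to_bits, count_true and mask_example are hypothetical names for illustration and are not part of the patch.

```cpp
#include <array>
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a vector mask: one 32-bit lane per element,
// all bits set (-1) for true and 0 for false.
template <std::size_t N>
using LaneMask = std::array<std::int32_t, N>;

// Collect the top bit of each lane into an integer bitmask (cf. __to_bits).
template <std::size_t N>
unsigned long long to_bits(const LaneMask<N>& k)
{
  static_assert(N <= CHAR_BIT * sizeof(unsigned long long),
		"one result bit per lane must fit into the return type");
  unsigned long long r = 0;
  for (std::size_t i = 0; i < N; ++i)
    {
      // Unsigned shift so that only the sign bit of the lane survives.
      const auto lane = static_cast<std::uint32_t>(k[i]);
      r |= static_cast<unsigned long long>(lane >> 31) << i;
    }
  return r;
}

// Count true lanes: every true lane contributes -1 to the sum, so the
// negated sum is the number of true lanes (cf. __popcount).
template <std::size_t N>
int count_true(const LaneMask<N>& k)
{
  int sum = 0;
  for (std::size_t i = 0; i < N; ++i)
    sum += k[i];
  return -sum;
}

// Usage example: lanes 0, 2 and 3 are true.
inline void mask_example()
{
  const LaneMask<4> k{{-1, 0, -1, -1}};
  assert(to_bits(k) == 13ull); // binary 1101
  assert(count_true(k) == 3);
}
```

The patch does the same work on whole vector registers. The loop form here only makes the lane encoding explicit.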
diff --git a/libstdc++-v3/include/experimental/bits/simd_converter.h b/libstdc++-v3/include/experimental/bits/simd_converter.h
new file mode 100644
index 00000000000..256b64023d2
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_converter.h
@@ -0,0 +1,337 @@
+// Generic simd conversions -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_CONVERTER_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_CONVERTER_H_
+
+#if __cplusplus >= 201703L
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+// _SimdConverter scalar -> scalar {{{
+template <typename _From, typename _To>
+struct _SimdConverter<_From, simd_abi::scalar, _To, simd_abi::scalar,
+		      std::enable_if_t<!std::is_same_v<_From, _To>>>
+{
+  _GLIBCXX_SIMD_INTRINSIC constexpr _To operator()(_From __a) const noexcept
+  {
+    return static_cast<_To>(__a);
+  }
+};
+
+// }}}
+// _SimdConverter "native" -> scalar {{{
+template <typename _From, typename _To, typename _Abi>
+struct _SimdConverter<_From, _Abi, _To, simd_abi::scalar,
+		      std::enable_if_t<!std::is_same_v<_Abi, simd_abi::scalar>>>
+{
+  using _Arg = typename _Abi::template __traits<_From>::_SimdMember;
+  static constexpr size_t _S_n = _Arg::_S_width;
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr std::array<_To, _S_n>
+  __all(_Arg __a) const noexcept
+  {
+    return __call_with_subscripts(
+      __a, make_index_sequence<_S_n>(),
+      [&](auto... __values) constexpr -> std::array<_To, _S_n> {
+	return {static_cast<_To>(__values)...};
+      });
+  }
+};
+
+// }}}
+// _SimdConverter scalar -> "native" {{{
+template <typename _From, typename _To, typename _Abi>
+struct _SimdConverter<_From, simd_abi::scalar, _To, _Abi,
+		      std::enable_if_t<!std::is_same_v<_Abi, simd_abi::scalar>>>
+{
+  using _Ret = typename _Abi::template __traits<_To>::_SimdMember;
+
+  template <typename... _More>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _Ret
+  operator()(_From __a, _More... __more) const noexcept
+  {
+    static_assert(sizeof...(_More) + 1 == _Abi::template size<_To>);
+    static_assert(std::conjunction_v<std::is_same<_From, _More>...>);
+    return __make_vector<_To>(__a, __more...);
+  }
+};
+
+// }}}
+// _SimdConverter "native 1" -> "native 2" {{{
+template <typename _From, typename _To, typename _AFrom, typename _ATo>
+struct _SimdConverter<
+  _From, _AFrom, _To, _ATo,
+  std::enable_if_t<!std::disjunction_v<
+    __is_fixed_size_abi<_AFrom>, __is_fixed_size_abi<_ATo>,
+    std::is_same<_AFrom, simd_abi::scalar>,
+    std::is_same<_ATo, simd_abi::scalar>,
+    std::conjunction<std::is_same<_From, _To>, std::is_same<_AFrom, _ATo>>>>>
+{
+  using _Arg = typename _AFrom::template __traits<_From>::_SimdMember;
+  using _Ret = typename _ATo::template __traits<_To>::_SimdMember;
+  using _V = __vector_type_t<_To, simd_size_v<_To, _ATo>>;
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr auto __all(_Arg __a) const noexcept
+  {
+    return __convert_all<_V>(__a);
+  }
+
+  template <typename... _More>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _Ret
+  operator()(_Arg __a, _More... __more) const noexcept
+  {
+    return __convert<_V>(__a, __more...);
+  }
+};
+
+// }}}
+// _SimdConverter scalar -> fixed_size<1> {{{1
+template <typename _From, typename _To>
+struct _SimdConverter<_From, simd_abi::scalar, _To, simd_abi::fixed_size<1>,
+		      void>
+{
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple<_To, simd_abi::scalar>
+  operator()(_From __x) const noexcept
+  {
+    return {static_cast<_To>(__x)};
+  }
+};
+
+// _SimdConverter fixed_size<1> -> scalar {{{1
+template <typename _From, typename _To>
+struct _SimdConverter<_From, simd_abi::fixed_size<1>, _To, simd_abi::scalar,
+		      void>
+{
+  _GLIBCXX_SIMD_INTRINSIC constexpr _To
+  operator()(_SimdTuple<_From, simd_abi::scalar> __x) const noexcept
+  {
+    return {static_cast<_To>(__x.first)};
+  }
+};
+
+// _SimdConverter fixed_size<_Np> -> fixed_size<_Np> {{{1
+template <typename _From, typename _To, int _Np>
+struct _SimdConverter<_From, simd_abi::fixed_size<_Np>, _To,
+		      simd_abi::fixed_size<_Np>,
+		      std::enable_if_t<!std::is_same_v<_From, _To>>>
+{
+  using _Ret = __fixed_size_storage_t<_To, _Np>;
+  using _Arg = __fixed_size_storage_t<_From, _Np>;
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr _Ret
+  operator()(const _Arg& __x) const noexcept
+  {
+    if constexpr (std::is_same_v<_From, _To>)
+      return __x;
+
+    // special case (optimize) int signedness casts
+    else if constexpr (sizeof(_From) == sizeof(_To)
+		       && std::is_integral_v<_From> && std::is_integral_v<_To>)
+      return __bit_cast<_Ret>(__x);
+
+    // special case if all ABI tags in _Ret are scalar
+    else if constexpr (__is_scalar_abi<typename _Ret::_FirstAbi>())
+      {
+	return __call_with_subscripts(
+	  __x, make_index_sequence<_Np>(),
+	  [](auto... __values) constexpr -> _Ret {
+	    return __make_simd_tuple<_To, decltype((void) __values,
+						   simd_abi::scalar())...>(
+	      static_cast<_To>(__values)...);
+	  });
+      }
+
+    // from one vector to one vector
+    else if constexpr (_Arg::_S_first_size == _Ret::_S_first_size)
+      {
+	_SimdConverter<_From, typename _Arg::_FirstAbi, _To,
+		       typename _Ret::_FirstAbi>
+	  __native_cvt;
+	if constexpr (_Arg::_S_tuple_size == 1)
+	  return {__native_cvt(__x.first)};
+	else
+	  {
+	    constexpr size_t _NRemain = _Np - _Arg::_S_first_size;
+	    _SimdConverter<_From, simd_abi::fixed_size<_NRemain>, _To,
+			   simd_abi::fixed_size<_NRemain>>
+	      __remainder_cvt;
+	    return {__native_cvt(__x.first), __remainder_cvt(__x.second)};
+	  }
+      }
+
+    // from one vector to multiple vectors
+    else if constexpr (_Arg::_S_first_size > _Ret::_S_first_size)
+      {
+	const auto __multiple_return_chunks
+	  = __convert_all<__vector_type_t<_To, _Ret::_S_first_size>>(__x.first);
+	constexpr auto __converted = __multiple_return_chunks.size()
+				     * _Ret::_FirstAbi::template size<_To>;
+	constexpr auto __remaining = _Np - __converted;
+	if constexpr (_Arg::_S_tuple_size == 1 && __remaining == 0)
+	  return __to_simd_tuple<_To, _Np>(__multiple_return_chunks);
+	else if constexpr (_Arg::_S_tuple_size == 1)
+	  { // e.g. <int, 3> -> <double, 2, 1>
+	    // or <short, 7> -> <double, 4, 2, 1>
+	    using _RetRem = __remove_cvref_t<decltype(
+	      __simd_tuple_pop_front<__multiple_return_chunks.size()>(_Ret()))>;
+	    const auto __return_chunks2
+	      = __convert_all<__vector_type_t<_To, _RetRem::_S_first_size>, 0,
+			      __converted>(__x.first);
+	    constexpr auto __converted2
+	      = __converted + __return_chunks2.size() * _RetRem::_S_first_size;
+	    if constexpr (__converted2 == _Np)
+	      return __to_simd_tuple<_To, _Np>(__multiple_return_chunks,
+					       __return_chunks2);
+	    else
+	      {
+		using _RetRem2 = __remove_cvref_t<decltype(
+		  __simd_tuple_pop_front<__return_chunks2.size()>(_RetRem()))>;
+		const auto __return_chunks3
+		  = __convert_all<__vector_type_t<_To, _RetRem2::_S_first_size>,
+				  0, __converted2>(__x.first);
+		constexpr auto __converted3
+		  = __converted2
+		    + __return_chunks3.size() * _RetRem2::_S_first_size;
+		if constexpr (__converted3 == _Np)
+		  return __to_simd_tuple<_To, _Np>(__multiple_return_chunks,
+						   __return_chunks2,
+						   __return_chunks3);
+		else
+		  {
+		    using _RetRem3 = __remove_cvref_t<decltype(
+		      __simd_tuple_pop_front<__return_chunks3.size()>(
+			_RetRem2()))>;
+		    const auto __return_chunks4 = __convert_all<
+		      __vector_type_t<_To, _RetRem3::_S_first_size>, 0,
+		      __converted3>(__x.first);
+		    constexpr auto __converted4
+		      = __converted3
+			+ __return_chunks4.size() * _RetRem3::_S_first_size;
+		    if constexpr (__converted4 == _Np)
+		      return __to_simd_tuple<_To, _Np>(__multiple_return_chunks,
+						       __return_chunks2,
+						       __return_chunks3,
+						       __return_chunks4);
+		    else
+		      __assert_unreachable<_To>();
+		  }
+	      }
+	  }
+	else
+	  {
+	    constexpr size_t _NRemain = _Np - _Arg::_S_first_size;
+	    _SimdConverter<_From, simd_abi::fixed_size<_NRemain>, _To,
+			   simd_abi::fixed_size<_NRemain>>
+	      __remainder_cvt;
+	    return __simd_tuple_concat(
+	      __to_simd_tuple<_To, _Arg::_S_first_size>(
+		__multiple_return_chunks),
+	      __remainder_cvt(__x.second));
+	  }
+      }
+
+    // from multiple vectors to one vector
+    // _Arg::_S_first_size < _Ret::_S_first_size
+    // a) heterogeneous input at the end of the tuple (possible with partial
+    //    native registers in _Ret)
+    else if constexpr (_Ret::_S_tuple_size == 1
+		       && _Np % _Arg::_S_first_size != 0)
+      {
+	static_assert(_Ret::_FirstAbi::_S_is_partial);
+	return _Ret{__generate_from_n_evaluations<
+	  _Np, typename _VectorTraits<typename _Ret::_FirstType>::type>(
+	  [&](auto __i) { return static_cast<_To>(__x[__i]); })};
+      }
+    else
+      {
+	static_assert(_Arg::_S_tuple_size > 1);
+	constexpr auto __n
+	  = __div_roundup(_Ret::_S_first_size, _Arg::_S_first_size);
+	return __call_with_n_evaluations<__n>(
+	  [&__x](auto... __uncvted) {
+	    // assuming _Arg Abi tags for all __i are _Arg::_FirstAbi
+	    _SimdConverter<_From, typename _Arg::_FirstAbi, _To,
+			   typename _Ret::_FirstAbi>
+	      __native_cvt;
+	    if constexpr (_Ret::_S_tuple_size == 1)
+	      return _Ret{__native_cvt(__uncvted...)};
+	    else
+	      return _Ret{
+		__native_cvt(__uncvted...),
+		_SimdConverter<
+		  _From, simd_abi::fixed_size<_Np - _Ret::_S_first_size>, _To,
+		  simd_abi::fixed_size<_Np - _Ret::_S_first_size>>()(
+		  __simd_tuple_pop_front<sizeof...(__uncvted)>(__x))};
+	  },
+	  [&__x](auto __i) { return __get_tuple_at<__i>(__x); });
+      }
+  }
+};
+
+// _SimdConverter "native" -> fixed_size<_Np> {{{1
+// i.e. 1 register to ? registers
+template <typename _From, typename _Ap, typename _To, int _Np>
+struct _SimdConverter<_From, _Ap, _To, simd_abi::fixed_size<_Np>,
+		      std::enable_if_t<!__is_fixed_size_abi_v<_Ap>>>
+{
+  static_assert(
+    _Np == simd_size_v<_From, _Ap>,
+    "_SimdConverter to fixed_size only works for equal element counts");
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr __fixed_size_storage_t<_To, _Np>
+  operator()(typename _SimdTraits<_From, _Ap>::_SimdMember __x) const noexcept
+  {
+    _SimdConverter<_From, simd_abi::fixed_size<_Np>, _To,
+		   simd_abi::fixed_size<_Np>>
+      __fixed_cvt;
+    return __fixed_cvt(__fixed_size_storage_t<_From, _Np>{__x});
+  }
+};
+
+// _SimdConverter fixed_size<_Np> -> "native" {{{1
+// i.e. ? registers to 1 register
+template <typename _From, int _Np, typename _To, typename _Ap>
+struct _SimdConverter<_From, simd_abi::fixed_size<_Np>, _To, _Ap,
+		      std::enable_if_t<!__is_fixed_size_abi_v<_Ap>>>
+{
+  static_assert(
+    _Np == simd_size_v<_To, _Ap>,
+    "_SimdConverter from fixed_size only works for equal element counts");
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr typename _SimdTraits<_To, _Ap>::_SimdMember
+  operator()(__fixed_size_storage_t<_From, _Np> __x) const noexcept
+  {
+    _SimdConverter<_From, simd_abi::fixed_size<_Np>, _To,
+		   simd_abi::fixed_size<_Np>>
+      __fixed_cvt;
+    return __fixed_cvt(__x).first;
+  }
+};
+
+// }}}1
+_GLIBCXX_SIMD_END_NAMESPACE
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_CONVERTER_H_
+
+// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
diff --git a/libstdc++-v3/include/experimental/bits/simd_detail.h b/libstdc++-v3/include/experimental/bits/simd_detail.h
new file mode 100644
index 00000000000..c8a40ecc3af
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_detail.h
@@ -0,0 +1,309 @@
+// Internal macros for the simd implementation -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_DETAIL_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_DETAIL_H_
+
+#if __cplusplus >= 201703L
+
+#include <cstddef>
+#include <cstdint>
+
+
+#define _GLIBCXX_SIMD_BEGIN_NAMESPACE                                          \
+  namespace std _GLIBCXX_VISIBILITY(default)                                   \
+  {                                                                            \
+    _GLIBCXX_BEGIN_NAMESPACE_VERSION                                           \
+      namespace experimental {                                                 \
+      inline namespace parallelism_v2 {
+#define _GLIBCXX_SIMD_END_NAMESPACE                                            \
+  }                                                                            \
+  }                                                                            \
+  _GLIBCXX_END_NAMESPACE_VERSION                                               \
+  }
+
+// ISA extension detection: defines all the _GLIBCXX_SIMD_HAVE_XXX macros.
+// ARM{{{
+#if defined __ARM_NEON
+#define _GLIBCXX_SIMD_HAVE_NEON 1
+#else
+#define _GLIBCXX_SIMD_HAVE_NEON 0
+#endif
+#if defined __ARM_NEON && (__ARM_ARCH >= 8 || defined __aarch64__)
+#define _GLIBCXX_SIMD_HAVE_NEON_A32 1
+#else
+#define _GLIBCXX_SIMD_HAVE_NEON_A32 0
+#endif
+#if defined __ARM_NEON && defined __aarch64__
+#define _GLIBCXX_SIMD_HAVE_NEON_A64 1
+#else
+#define _GLIBCXX_SIMD_HAVE_NEON_A64 0
+#endif
+//}}}
+// x86{{{
+#ifdef __MMX__
+#define _GLIBCXX_SIMD_HAVE_MMX 1
+#else
+#define _GLIBCXX_SIMD_HAVE_MMX 0
+#endif
+#if defined __SSE__ || defined __x86_64__
+#define _GLIBCXX_SIMD_HAVE_SSE 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSE 0
+#endif
+#if defined __SSE2__ || defined __x86_64__
+#define _GLIBCXX_SIMD_HAVE_SSE2 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSE2 0
+#endif
+#ifdef __SSE3__
+#define _GLIBCXX_SIMD_HAVE_SSE3 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSE3 0
+#endif
+#ifdef __SSSE3__
+#define _GLIBCXX_SIMD_HAVE_SSSE3 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSSE3 0
+#endif
+#ifdef __SSE4_1__
+#define _GLIBCXX_SIMD_HAVE_SSE4_1 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSE4_1 0
+#endif
+#ifdef __SSE4_2__
+#define _GLIBCXX_SIMD_HAVE_SSE4_2 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSE4_2 0
+#endif
+#ifdef __XOP__
+#define _GLIBCXX_SIMD_HAVE_XOP 1
+#else
+#define _GLIBCXX_SIMD_HAVE_XOP 0
+#endif
+#ifdef __AVX__
+#define _GLIBCXX_SIMD_HAVE_AVX 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX 0
+#endif
+#ifdef __AVX2__
+#define _GLIBCXX_SIMD_HAVE_AVX2 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX2 0
+#endif
+#ifdef __BMI__
+#define _GLIBCXX_SIMD_HAVE_BMI1 1
+#else
+#define _GLIBCXX_SIMD_HAVE_BMI1 0
+#endif
+#ifdef __BMI2__
+#define _GLIBCXX_SIMD_HAVE_BMI2 1
+#else
+#define _GLIBCXX_SIMD_HAVE_BMI2 0
+#endif
+#ifdef __LZCNT__
+#define _GLIBCXX_SIMD_HAVE_LZCNT 1
+#else
+#define _GLIBCXX_SIMD_HAVE_LZCNT 0
+#endif
+#ifdef __SSE4A__
+#define _GLIBCXX_SIMD_HAVE_SSE4A 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSE4A 0
+#endif
+#ifdef __FMA__
+#define _GLIBCXX_SIMD_HAVE_FMA 1
+#else
+#define _GLIBCXX_SIMD_HAVE_FMA 0
+#endif
+#ifdef __FMA4__
+#define _GLIBCXX_SIMD_HAVE_FMA4 1
+#else
+#define _GLIBCXX_SIMD_HAVE_FMA4 0
+#endif
+#ifdef __F16C__
+#define _GLIBCXX_SIMD_HAVE_F16C 1
+#else
+#define _GLIBCXX_SIMD_HAVE_F16C 0
+#endif
+#ifdef __POPCNT__
+#define _GLIBCXX_SIMD_HAVE_POPCNT 1
+#else
+#define _GLIBCXX_SIMD_HAVE_POPCNT 0
+#endif
+#ifdef __AVX512F__
+#define _GLIBCXX_SIMD_HAVE_AVX512F 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX512F 0
+#endif
+#ifdef __AVX512DQ__
+#define _GLIBCXX_SIMD_HAVE_AVX512DQ 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX512DQ 0
+#endif
+#ifdef __AVX512VL__
+#define _GLIBCXX_SIMD_HAVE_AVX512VL 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX512VL 0
+#endif
+#ifdef __AVX512BW__
+#define _GLIBCXX_SIMD_HAVE_AVX512BW 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX512BW 0
+#endif
+
+#if _GLIBCXX_SIMD_HAVE_SSE
+#define _GLIBCXX_SIMD_HAVE_SSE_ABI 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSE_ABI 0
+#endif
+#if _GLIBCXX_SIMD_HAVE_SSE2
+#define _GLIBCXX_SIMD_HAVE_FULL_SSE_ABI 1
+#else
+#define _GLIBCXX_SIMD_HAVE_FULL_SSE_ABI 0
+#endif
+
+#if _GLIBCXX_SIMD_HAVE_AVX
+#define _GLIBCXX_SIMD_HAVE_AVX_ABI 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX_ABI 0
+#endif
+#if _GLIBCXX_SIMD_HAVE_AVX2
+#define _GLIBCXX_SIMD_HAVE_FULL_AVX_ABI 1
+#else
+#define _GLIBCXX_SIMD_HAVE_FULL_AVX_ABI 0
+#endif
+
+#if _GLIBCXX_SIMD_HAVE_AVX512F
+#define _GLIBCXX_SIMD_HAVE_AVX512_ABI 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX512_ABI 0
+#endif
+#if _GLIBCXX_SIMD_HAVE_AVX512BW
+#define _GLIBCXX_SIMD_HAVE_FULL_AVX512_ABI 1
+#else
+#define _GLIBCXX_SIMD_HAVE_FULL_AVX512_ABI 0
+#endif
+
+#if defined __x86_64__ && !_GLIBCXX_SIMD_HAVE_SSE2
+#error "Use of SSE2 is required on AMD64"
+#endif
+//}}}
+
+#define _GLIBCXX_SIMD_NORMAL_MATH                                              \
+  [[__gnu__::__optimize__("finite-math-only,no-signed-zeros")]]
+#define _GLIBCXX_SIMD_NEVER_INLINE [[__gnu__::__noinline__]]
+#define _GLIBCXX_SIMD_INTRINSIC                                                \
+  [[__gnu__::__always_inline__, __gnu__::__artificial__]] inline
+#define _GLIBCXX_SIMD_ALWAYS_INLINE [[__gnu__::__always_inline__]] inline
+#define _GLIBCXX_SIMD_IS_UNLIKELY(__x) __builtin_expect(__x, 0)
+#define _GLIBCXX_SIMD_IS_LIKELY(__x) __builtin_expect(__x, 1)
+#if defined __STRICT_ANSI__ && __STRICT_ANSI__
+#define _GLIBCXX_SIMD_CONSTEXPR
+#else
+#define _GLIBCXX_SIMD_CONSTEXPR constexpr
+#endif
+
+#define _GLIBCXX_SIMD_LIST_BINARY(__macro) __macro(|) __macro(&) __macro(^)
+#define _GLIBCXX_SIMD_LIST_SHIFTS(__macro) __macro(<<) __macro(>>)
+#define _GLIBCXX_SIMD_LIST_ARITHMETICS(__macro)                                \
+  __macro(+) __macro(-) __macro(*) __macro(/) __macro(%)
+
+#define _GLIBCXX_SIMD_ALL_BINARY(__macro)                                      \
+  _GLIBCXX_SIMD_LIST_BINARY(__macro) static_assert(true)
+#define _GLIBCXX_SIMD_ALL_SHIFTS(__macro)                                      \
+  _GLIBCXX_SIMD_LIST_SHIFTS(__macro) static_assert(true)
+#define _GLIBCXX_SIMD_ALL_ARITHMETICS(__macro)                                 \
+  _GLIBCXX_SIMD_LIST_ARITHMETICS(__macro) static_assert(true)
+
+#ifdef _GLIBCXX_SIMD_NO_ALWAYS_INLINE
+#undef _GLIBCXX_SIMD_ALWAYS_INLINE
+#define _GLIBCXX_SIMD_ALWAYS_INLINE inline
+#undef _GLIBCXX_SIMD_INTRINSIC
+#define _GLIBCXX_SIMD_INTRINSIC inline
+#endif
+
+#if _GLIBCXX_SIMD_HAVE_SSE || _GLIBCXX_SIMD_HAVE_MMX
+#define _GLIBCXX_SIMD_X86INTRIN 1
+#else
+#define _GLIBCXX_SIMD_X86INTRIN 0
+#endif
+
+// workaround macros {{{
+// use aliasing loads to help GCC understand the data accesses better
+// This also seems to hide a miscompilation on swap(x[i], x[i + 1]) with
+// fixed_size_simd<float, 16> x.
+#define _GLIBCXX_SIMD_USE_ALIASING_LOADS 1
+
+// vector conversions on x86 not optimized:
+#if _GLIBCXX_SIMD_X86INTRIN
+#define _GLIBCXX_SIMD_WORKAROUND_PR85048 1
+#endif
+
+// Invalid instruction mov from xmm16-31
+#define _GLIBCXX_SIMD_WORKAROUND_PR89229 1
+
+// integer division not optimized
+#define _GLIBCXX_SIMD_WORKAROUND_PR90993 1
+
+// very bad codegen for extraction and concatenation of 128/256 "subregisters"
+// with sizeof(element type) < 8: https://godbolt.org/g/mqUsgM
+#if _GLIBCXX_SIMD_X86INTRIN
+#define _GLIBCXX_SIMD_WORKAROUND_XXX_1 1
+#endif
+
+// bad codegen for 8 Byte memcpy to __vector_type_t<char, 16>
+#define _GLIBCXX_SIMD_WORKAROUND_PR90424 1
+
+// bad codegen for zero-extend using simple concat(__x, 0)
+#if _GLIBCXX_SIMD_X86INTRIN
+#define _GLIBCXX_SIMD_WORKAROUND_XXX_3 1
+#endif
+
+// bad codegen for integer division
+#define _GLIBCXX_SIMD_WORKAROUND_XXX_4 1
+
+// abs pattern may generate MMX instructions without EMMS cleanup (This only
+// happens with SSSE3 because pabs[bwd] is part of SSSE3.)
+#if __GNUC__ < 10 && defined __SSSE3__ && _GLIBCXX_SIMD_X86INTRIN
+#define _GLIBCXX_SIMD_WORKAROUND_PR91533 1
+#endif
+
+#if __GNUC__ < 10 && defined __aarch64__
+#define _GLIBCXX_SIMD_WORKAROUND_XXX_5 1
+#endif
+
+// https://github.com/cplusplus/parallelism-ts/issues/65 (incorrect return type
+// of static_simd_cast)
+#define _GLIBCXX_SIMD_FIX_P2TS_ISSUE65 1
+
+// https://github.com/cplusplus/parallelism-ts/issues/66 (incorrect SFINAE
+// constraint on (static)_simd_cast)
+#define _GLIBCXX_SIMD_FIX_P2TS_ISSUE66 1
+// }}}
+
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_DETAIL_H_
+
+// vim: foldmethod=marker
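
Note on the _GLIBCXX_SIMD_LIST_* / _GLIBCXX_SIMD_ALL_* macros in simd_detail.h above: they appear to follow the usual X-macro pattern, where a caller passes a one-argument macro that is expanded once per operator token, and the trailing static_assert(true) lets the invocation end in a semicolon. The following is only a minimal sketch of that pattern, not code from the patch. The names DEMO_LIST_BINARY, DEMO_ALL_BINARY, DEMO_DEFINE_OP and Bits are hypothetical stand-ins chosen for illustration.

```cpp
// Hypothetical stand-ins for _GLIBCXX_SIMD_LIST_BINARY and
// _GLIBCXX_SIMD_ALL_BINARY; all names here are illustrative only.
#define DEMO_LIST_BINARY(macro_) macro_(|) macro_(&) macro_(^)
#define DEMO_ALL_BINARY(macro_) DEMO_LIST_BINARY(macro_) static_assert(true)

// A toy value type that receives the generated operators.
struct Bits
{
  unsigned value;
};

// Per-operator body: expands to one constexpr operator definition.
#define DEMO_DEFINE_OP(op_)                                                    \
  constexpr Bits operator op_(Bits a, Bits b)                                  \
  {                                                                            \
    return Bits{a.value op_ b.value};                                          \
  }

// Generates operator|, operator& and operator^. The static_assert(true) at
// the end of DEMO_ALL_BINARY consumes the semicolon written here.
DEMO_ALL_BINARY(DEMO_DEFINE_OP);
#undef DEMO_DEFINE_OP

// Compile-time checks that all three operators exist and behave as expected.
static_assert((Bits{0b1100u} | Bits{0b1010u}).value == 0b1110u);
static_assert((Bits{0b1100u} & Bits{0b1010u}).value == 0b1000u);
static_assert((Bits{0b1100u} ^ Bits{0b1010u}).value == 0b0110u);
```

Without the trailing static_assert(true), the semicolon after the invocation would be a stray empty declaration, which some warning levels flag. With it, the macro call reads like an ordinary declaration.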
diff --git a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
new file mode 100644
index 00000000000..2b643f28835
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
@@ -0,0 +1,2102 @@
+// Simd fixed_size ABI specific implementations -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+/*
+ * The fixed_size ABI gives the following guarantees:
+ *  - simd objects are passed via the stack
+ *  - memory layout of `simd<_Tp, _Np>` is equivalent to `std::array<_Tp, _Np>`
+ *  - alignment of `simd<_Tp, _Np>` is `_Np * sizeof(_Tp)` if _Np is a
+ *    power-of-2 value, otherwise `__next_power_of_2(_Np * sizeof(_Tp))` (Note:
+ *    if the alignment were to exceed the system/compiler maximum, it is bounded
+ *    to that maximum)
+ *  - simd_mask objects are passed like std::bitset<_Np>
+ *  - memory layout of `simd_mask<_Tp, _Np>` is equivalent to `std::bitset<_Np>`
+ *  - alignment of `simd_mask<_Tp, _Np>` is equal to the alignment of
+ *    `std::bitset<_Np>`
+ */
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_FIXED_SIZE_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_FIXED_SIZE_H_
+
+#if __cplusplus >= 201703L
+
+#include <array>
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+
+// __simd_tuple_element {{{
+template <size_t _I, typename _Tp> struct __simd_tuple_element;
+template <typename _Tp, typename _A0, typename... _As>
+struct __simd_tuple_element<0, _SimdTuple<_Tp, _A0, _As...>>
+{
+  using type = std::experimental::simd<_Tp, _A0>;
+};
+template <size_t _I, typename _Tp, typename _A0, typename... _As>
+struct __simd_tuple_element<_I, _SimdTuple<_Tp, _A0, _As...>>
+{
+  using type =
+    typename __simd_tuple_element<_I - 1, _SimdTuple<_Tp, _As...>>::type;
+};
+template <size_t _I, typename _Tp>
+using __simd_tuple_element_t = typename __simd_tuple_element<_I, _Tp>::type;
+
+// }}}
+// __simd_tuple_concat {{{
+template <typename _Tp, typename... _A0s, typename... _A1s>
+_GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple<_Tp, _A0s..., _A1s...>
+__simd_tuple_concat(const _SimdTuple<_Tp, _A0s...>& __left,
+		    const _SimdTuple<_Tp, _A1s...>& __right)
+{
+  if constexpr (sizeof...(_A0s) == 0)
+    return __right;
+  else if constexpr (sizeof...(_A1s) == 0)
+    return __left;
+  else
+    return {__left.first, __simd_tuple_concat(__left.second, __right)};
+}
+
+template <typename _Tp, typename _A10, typename... _A1s>
+_GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple<_Tp, simd_abi::scalar, _A10,
+					     _A1s...>
+__simd_tuple_concat(const _Tp& __left,
+		    const _SimdTuple<_Tp, _A10, _A1s...>& __right)
+{
+  return {__left, __right};
+}
+
+// }}}
+// __simd_tuple_pop_front {{{
+template <size_t _Np, typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr decltype(auto)
+__simd_tuple_pop_front(_Tp&& __x)
+{
+  if constexpr (_Np == 0)
+    return static_cast<_Tp&&>(__x);
+  else
+    return __simd_tuple_pop_front<_Np - 1>(__x.second);
+}
+
+// }}}
+// __get_simd_at<_Np> {{{1
+struct __as_simd
+{
+};
+struct __as_simd_tuple
+{
+};
+template <typename _Tp, typename _A0, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr simd<_Tp, _A0>
+__simd_tuple_get_impl(__as_simd, const _SimdTuple<_Tp, _A0, _Abis...>& __t,
+		      _SizeConstant<0>)
+{
+  return {__private_init, __t.first};
+}
+template <typename _Tp, typename _A0, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr const auto&
+__simd_tuple_get_impl(__as_simd_tuple,
+		      const _SimdTuple<_Tp, _A0, _Abis...>& __t,
+		      _SizeConstant<0>)
+{
+  return __t.first;
+}
+template <typename _Tp, typename _A0, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto&
+__simd_tuple_get_impl(__as_simd_tuple, _SimdTuple<_Tp, _A0, _Abis...>& __t,
+		      _SizeConstant<0>)
+{
+  return __t.first;
+}
+
+template <typename _R, size_t _Np, typename _Tp, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__simd_tuple_get_impl(_R, const _SimdTuple<_Tp, _Abis...>& __t,
+		      _SizeConstant<_Np>)
+{
+  return __simd_tuple_get_impl(_R(), __t.second, _SizeConstant<_Np - 1>());
+}
+template <size_t _Np, typename _Tp, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto&
+__simd_tuple_get_impl(__as_simd_tuple, _SimdTuple<_Tp, _Abis...>& __t,
+		      _SizeConstant<_Np>)
+{
+  return __simd_tuple_get_impl(__as_simd_tuple(), __t.second,
+			       _SizeConstant<_Np - 1>());
+}
+
+template <size_t _Np, typename _Tp, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__get_simd_at(const _SimdTuple<_Tp, _Abis...>& __t)
+{
+  return __simd_tuple_get_impl(__as_simd(), __t, _SizeConstant<_Np>());
+}
+
+// }}}
+// __get_tuple_at<_Np> {{{
+template <size_t _Np, typename _Tp, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__get_tuple_at(const _SimdTuple<_Tp, _Abis...>& __t)
+{
+  return __simd_tuple_get_impl(__as_simd_tuple(), __t, _SizeConstant<_Np>());
+}
+
+template <size_t _Np, typename _Tp, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto&
+__get_tuple_at(_SimdTuple<_Tp, _Abis...>& __t)
+{
+  return __simd_tuple_get_impl(__as_simd_tuple(), __t, _SizeConstant<_Np>());
+}
+
+// __tuple_element_meta {{{1
+template <typename _Tp, typename _Abi, size_t _Offset>
+struct __tuple_element_meta : public _Abi::_SimdImpl
+{
+  static_assert(is_same_v<typename _Abi::_SimdImpl::abi_type,
+			  _Abi>); // this fails e.g. when _SimdImpl is an alias
+				  // for _SimdImplBuiltin<_DifferentAbi>
+  using value_type = _Tp;
+  using abi_type = _Abi;
+  using _Traits = _SimdTraits<_Tp, _Abi>;
+  using _MaskImpl = typename _Abi::_MaskImpl;
+  using _MaskMember = typename _Traits::_MaskMember;
+  using simd_type = std::experimental::simd<_Tp, _Abi>;
+  static constexpr size_t _S_offset = _Offset;
+  static constexpr size_t size() { return simd_size<_Tp, _Abi>::value; }
+  static constexpr _MaskImpl _S_mask_impl = {};
+
+  template <size_t _Np, bool _Sanitized>
+  _GLIBCXX_SIMD_INTRINSIC static auto
+  __submask(_BitMask<_Np, _Sanitized> __bits)
+  {
+    return __bits.template _M_extract<_Offset, size()>();
+  }
+
+  template <size_t _Np, bool _Sanitized>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+  __make_mask(_BitMask<_Np, _Sanitized> __bits)
+  {
+    return _MaskImpl::template __convert<_Tp>(
+      __bits.template _M_extract<_Offset, size()>()._M_sanitized());
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC static _ULLong
+  __mask_to_shifted_ullong(_MaskMember __k)
+  {
+    return _MaskImpl::__to_bits(__k).to_ullong() << _Offset;
+  }
+};
+
+template <size_t _Offset, typename _Tp, typename _Abi, typename... _As>
+__tuple_element_meta<_Tp, _Abi, _Offset>
+__make_meta(const _SimdTuple<_Tp, _Abi, _As...>&)
+{
+  return {};
+}
+
+// }}}1
+// _WithOffset wrapper class {{{
+template <size_t _Offset, typename _Base> struct _WithOffset : public _Base
+{
+  static inline constexpr size_t _S_offset = _Offset;
+
+  _GLIBCXX_SIMD_INTRINSIC char* __as_charptr()
+  {
+    return reinterpret_cast<char*>(this)
+	   + _S_offset * sizeof(typename _Base::value_type);
+  }
+  _GLIBCXX_SIMD_INTRINSIC const char* __as_charptr() const
+  {
+    return reinterpret_cast<const char*>(this)
+	   + _S_offset * sizeof(typename _Base::value_type);
+  }
+};
+
+// make _WithOffset<_WithOffset> ill-formed to use:
+template <size_t _O0, size_t _O1, typename _Base>
+struct _WithOffset<_O0, _WithOffset<_O1, _Base>>
+{
+};
+
+template <size_t _Offset, typename _Tp>
+decltype(auto)
+__add_offset(_Tp& __base)
+{
+  return static_cast<_WithOffset<_Offset, __remove_cvref_t<_Tp>>&>(__base);
+}
+template <size_t _Offset, typename _Tp>
+decltype(auto)
+__add_offset(const _Tp& __base)
+{
+  return static_cast<const _WithOffset<_Offset, __remove_cvref_t<_Tp>>&>(
+    __base);
+}
+template <size_t _Offset, size_t _ExistingOffset, typename _Tp>
+decltype(auto)
+__add_offset(_WithOffset<_ExistingOffset, _Tp>& __base)
+{
+  return static_cast<_WithOffset<_Offset + _ExistingOffset, _Tp>&>(
+    static_cast<_Tp&>(__base));
+}
+template <size_t _Offset, size_t _ExistingOffset, typename _Tp>
+decltype(auto)
+__add_offset(const _WithOffset<_ExistingOffset, _Tp>& __base)
+{
+  return static_cast<const _WithOffset<_Offset + _ExistingOffset, _Tp>&>(
+    static_cast<const _Tp&>(__base));
+}
+
+template <typename _Tp> constexpr inline size_t __offset = 0;
+template <size_t _Offset, typename _Tp>
+constexpr inline size_t
+  __offset<_WithOffset<_Offset, _Tp>> = _WithOffset<_Offset, _Tp>::_S_offset;
+template <typename _Tp>
+constexpr inline size_t __offset<const _Tp> = __offset<_Tp>;
+template <typename _Tp> constexpr inline size_t __offset<_Tp&> = __offset<_Tp>;
+template <typename _Tp> constexpr inline size_t __offset<_Tp&&> = __offset<_Tp>;
+
+// }}}
+// _SimdTuple specializations {{{1
+// empty {{{2
+template <typename _Tp> struct _SimdTuple<_Tp>
+{
+  using value_type = _Tp;
+  static constexpr size_t _S_tuple_size = 0;
+  static constexpr size_t size() { return 0; }
+};
+
+// _SimdTupleData {{{2
+template <typename _FirstType, typename _SecondType> struct _SimdTupleData
+{
+  _FirstType first;
+  _SecondType second;
+
+  _GLIBCXX_SIMD_INTRINSIC
+  constexpr bool _M_is_constprop() const
+  {
+    if constexpr(is_class_v<_FirstType>)
+      return first._M_is_constprop() && second._M_is_constprop();
+    else
+      return __builtin_constant_p(first) && second._M_is_constprop();
+  }
+};
+
+template <typename _FirstType, typename _Tp>
+struct _SimdTupleData<_FirstType, _SimdTuple<_Tp>>
+{
+  _FirstType first;
+  static constexpr _SimdTuple<_Tp> second = {};
+
+  _GLIBCXX_SIMD_INTRINSIC
+  constexpr bool _M_is_constprop() const
+  {
+    if constexpr(is_class_v<_FirstType>)
+      return first._M_is_constprop();
+    else
+      return __builtin_constant_p(first);
+  }
+};
+
+// 1 or more {{{2
+template <typename _Tp, typename _Abi0, typename... _Abis>
+struct _SimdTuple<_Tp, _Abi0, _Abis...>
+  : _SimdTupleData<typename _SimdTraits<_Tp, _Abi0>::_SimdMember,
+		   _SimdTuple<_Tp, _Abis...>>
+{
+  static_assert(!__is_fixed_size_abi_v<_Abi0>);
+  using value_type = _Tp;
+  using _FirstType = typename _SimdTraits<_Tp, _Abi0>::_SimdMember;
+  using _FirstAbi = _Abi0;
+  using _SecondType = _SimdTuple<_Tp, _Abis...>;
+  static constexpr size_t _S_tuple_size = sizeof...(_Abis) + 1;
+  static constexpr size_t size()
+  {
+    return simd_size_v<_Tp, _Abi0> + _SecondType::size();
+  }
+  static constexpr size_t _S_first_size = simd_size_v<_Tp, _Abi0>;
+
+  using _Base = _SimdTupleData<typename _SimdTraits<_Tp, _Abi0>::_SimdMember,
+			       _SimdTuple<_Tp, _Abis...>>;
+  using _Base::first;
+  using _Base::second;
+
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple() = default;
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple(const _SimdTuple&) = default;
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple& operator=(const _SimdTuple&)
+    = default;
+
+  template <typename _Up>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple(_Up&& __x)
+    : _Base{static_cast<_Up&&>(__x)}
+  {}
+  template <typename _Up, typename _Up2>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple(_Up&& __x, _Up2&& __y)
+    : _Base{static_cast<_Up&&>(__x), static_cast<_Up2&&>(__y)}
+  {}
+  template <typename _Up>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple(_Up&& __x, _SimdTuple<_Tp>)
+    : _Base{static_cast<_Up&&>(__x)}
+  {}
+
+  _GLIBCXX_SIMD_INTRINSIC char* __as_charptr()
+  {
+    return reinterpret_cast<char*>(this);
+  }
+  _GLIBCXX_SIMD_INTRINSIC const char* __as_charptr() const
+  {
+    return reinterpret_cast<const char*>(this);
+  }
+
+  template <size_t _Np> _GLIBCXX_SIMD_INTRINSIC constexpr auto& __at()
+  {
+    if constexpr (_Np == 0)
+      return first;
+    else
+      return second.template __at<_Np - 1>();
+  }
+  template <size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC constexpr const auto& __at() const
+  {
+    if constexpr (_Np == 0)
+      return first;
+    else
+      return second.template __at<_Np - 1>();
+  }
+
+  template <size_t _Np> _GLIBCXX_SIMD_INTRINSIC constexpr auto __simd_at() const
+  {
+    if constexpr (_Np == 0)
+      return simd<_Tp, _Abi0>(__private_init, first);
+    else
+      return second.template __simd_at<_Np - 1>();
+  }
+
+  template <size_t _Offset = 0, typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdTuple
+  __generate(_Fp&& __gen, _SizeConstant<_Offset> = {})
+  {
+    auto&& __first = __gen(__tuple_element_meta<_Tp, _Abi0, _Offset>());
+    if constexpr (_S_tuple_size == 1)
+      return {__first};
+    else
+      return {__first, _SecondType::__generate(
+			 static_cast<_Fp&&>(__gen),
+			 _SizeConstant<_Offset + simd_size_v<_Tp, _Abi0>>())};
+  }
+
+  template <size_t _Offset = 0, typename _Fp, typename... _More>
+  _GLIBCXX_SIMD_INTRINSIC _SimdTuple
+  __apply_wrapped(_Fp&& __fun, const _More&... __more) const
+  {
+    auto&& __first = __fun(__make_meta<_Offset>(*this), first, __more.first...);
+    if constexpr (_S_tuple_size == 1)
+      return {__first};
+    else
+      return {
+	__first,
+	second.template __apply_wrapped<_Offset + simd_size_v<_Tp, _Abi0>>(
+	  static_cast<_Fp&&>(__fun), __more.second...)};
+  }
+
+  template <size_t _Size, size_t _Offset = 0,
+	    typename _R = __fixed_size_storage_t<_Tp, _Size>>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _R __extract_tuple_with_size() const
+  {
+    if constexpr (_Size == _S_first_size && _Offset == 0)
+      return {first};
+    else if constexpr (_Size > _S_first_size && _Offset == 0
+		       && _S_tuple_size > 1)
+      return {
+	first,
+	second.template __extract_tuple_with_size<_Size - _S_first_size>()};
+    else if constexpr (_Size == 1)
+      return {operator[](_SizeConstant<_Offset>())};
+    else if constexpr (_R::_S_tuple_size == 1)
+      {
+	static_assert(_Offset % _Size == 0);
+	static_assert(_S_first_size % _Size == 0);
+	return {typename _R::_FirstType(
+	  __private_init,
+	  __extract_part<_Offset / _Size, _S_first_size / _Size>(first))};
+      }
+    else
+      __assert_unreachable<_SizeConstant<_Size>>();
+  }
+
+  template <typename _Tup>
+  _GLIBCXX_SIMD_INTRINSIC constexpr decltype(auto)
+  __extract_argument(_Tup&& __tup) const
+  {
+    using _TupT = typename __remove_cvref_t<_Tup>::value_type;
+    if constexpr (is_same_v<_SimdTuple, __remove_cvref_t<_Tup>>)
+      return __tup.first;
+    else if (__builtin_is_constant_evaluated())
+      return __fixed_size_storage_t<_TupT, _S_first_size>::__generate([&](
+	auto __meta) constexpr {
+	return __meta.__generator(
+	  [&](auto __i) constexpr { return __tup[__i]; },
+	  static_cast<_TupT*>(nullptr));
+      });
+    else
+      return [&]() {
+	__fixed_size_storage_t<_TupT, _S_first_size> __r;
+	__builtin_memcpy(__r.__as_charptr(), __tup.__as_charptr(), sizeof(__r));
+	return __r;
+      }();
+  }
+
+  template <typename _Tup>
+  _GLIBCXX_SIMD_INTRINSIC constexpr auto& __skip_argument(_Tup&& __tup) const
+  {
+    static_assert(_S_tuple_size > 1);
+    using _Up = __remove_cvref_t<_Tup>;
+    constexpr size_t __off = __offset<_Up>;
+    if constexpr (_S_first_size == _Up::_S_first_size && __off == 0)
+      return __tup.second;
+    else if constexpr (_S_first_size > _Up::_S_first_size
+		       && _S_first_size % _Up::_S_first_size == 0 && __off == 0)
+      return __simd_tuple_pop_front<_S_first_size / _Up::_S_first_size>(__tup);
+    else if constexpr (_S_first_size + __off < _Up::_S_first_size)
+      return __add_offset<_S_first_size>(__tup);
+    else if constexpr (_S_first_size + __off == _Up::_S_first_size)
+      return __tup.second;
+    else
+      __assert_unreachable<_Tup>();
+  }
+
+  template <size_t _Offset, typename... _More>
+  _GLIBCXX_SIMD_INTRINSIC constexpr void
+  __assign_front(const _SimdTuple<_Tp, _Abi0, _More...>& __x) &
+  {
+    static_assert(_Offset == 0);
+    first = __x.first;
+    if constexpr (sizeof...(_More) > 0)
+      {
+	static_assert(sizeof...(_Abis) >= sizeof...(_More));
+	second.template __assign_front<0>(__x.second);
+      }
+  }
+
+  template <size_t _Offset>
+  _GLIBCXX_SIMD_INTRINSIC constexpr void __assign_front(const _FirstType& __x) &
+  {
+    static_assert(_Offset == 0);
+    first = __x;
+  }
+
+  template <size_t _Offset, typename... _As>
+  _GLIBCXX_SIMD_INTRINSIC constexpr void
+  __assign_front(const _SimdTuple<_Tp, _As...>& __x) &
+  {
+    __builtin_memcpy(__as_charptr() + _Offset * sizeof(value_type),
+		     __x.__as_charptr(),
+		     sizeof(_Tp) * _SimdTuple<_Tp, _As...>::size());
+  }
+
+  /*
+   * Iterate over the first objects in this _SimdTuple and call __fun for each
+   * of them. If additional arguments are passed via __more, chunk them into
+   * _SimdTuple or __vector_type_t objects of the same number of values.
+   */
+  template <typename _Fp, typename... _More>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple
+  __apply_per_chunk(_Fp&& __fun, _More&&... __more) const
+  {
+    if constexpr ((...
+		   || conjunction_v<
+		     is_lvalue_reference<_More>,
+		     negation<is_const<remove_reference_t<_More>>>>) )
+      {
+	// need to write back at least one of __more after calling __fun
+	auto&& __first = [&](auto... __args) constexpr
+	{
+	  auto __r
+	    = __fun(__tuple_element_meta<_Tp, _Abi0, 0>(), first, __args...);
+	  [[maybe_unused]] auto&& __ignore_me = {(
+	    [](auto&& __dst, const auto& __src) {
+	      if constexpr (is_assignable_v<decltype(__dst), decltype(__dst)>)
+		{
+		  __dst.template __assign_front<__offset<decltype(__dst)>>(
+		    __src);
+		}
+	    }(static_cast<_More&&>(__more), __args),
+	    0)...};
+	  return __r;
+	}
+	(__extract_argument(__more)...);
+	if constexpr (_S_tuple_size == 1)
+	  return {__first};
+	else
+	  return {__first,
+		  second.__apply_per_chunk(static_cast<_Fp&&>(__fun),
+					   __skip_argument(__more)...)};
+      }
+    else
+      {
+	auto&& __first = __fun(__tuple_element_meta<_Tp, _Abi0, 0>(), first,
+			       __extract_argument(__more)...);
+	if constexpr (_S_tuple_size == 1)
+	  return {__first};
+	else
+	  return {__first,
+		  second.__apply_per_chunk(static_cast<_Fp&&>(__fun),
+					   __skip_argument(__more)...)};
+      }
+  }
+
+  template <typename _R = _Tp, typename _Fp, typename... _More>
+  _GLIBCXX_SIMD_INTRINSIC auto __apply_r(_Fp&& __fun,
+					 const _More&... __more) const
+  {
+    auto&& __first
+      = __fun(__tuple_element_meta<_Tp, _Abi0, 0>(), first, __more.first...);
+    if constexpr (_S_tuple_size == 1)
+      return __first;
+    else
+      return __simd_tuple_concat<_R>(
+	__first, second.template __apply_r<_R>(static_cast<_Fp&&>(__fun),
+					       __more.second...));
+  }
+
+  template <typename _Fp, typename... _More>
+  _GLIBCXX_SIMD_INTRINSIC constexpr friend _SanitizedBitMask<size()>
+  __test(const _Fp& __fun, const _SimdTuple& __x, const _More&... __more)
+  {
+    const _SanitizedBitMask<_S_first_size> __first
+      = _Abi0::_MaskImpl::__to_bits(__fun(__tuple_element_meta<_Tp, _Abi0, 0>(),
+					  __x.first, __more.first...));
+    if constexpr (_S_tuple_size == 1)
+      return __first;
+    else
+      return __test(__fun, __x.second, __more.second...)._M_prepend(__first);
+  }
+
+  template <typename _Up, _Up _I>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+  operator[](std::integral_constant<_Up, _I>) const noexcept
+  {
+    if constexpr (_I < simd_size_v<_Tp, _Abi0>)
+      return __subscript_read(_I);
+    else
+      return second[std::integral_constant<_Up,
+					   _I - simd_size_v<_Tp, _Abi0>>()];
+  }
+
+  _Tp operator[](size_t __i) const noexcept
+  {
+    if constexpr (_S_tuple_size == 1)
+      return __subscript_read(__i);
+    else
+      {
+#ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
+	return reinterpret_cast<const __may_alias<_Tp>*>(this)[__i];
+#else
+	if constexpr (__is_scalar_abi<_Abi0>())
+	  {
+	    const _Tp* __ptr = &first;
+	    return __ptr[__i];
+	  }
+	else
+	  return __i < simd_size_v<_Tp, _Abi0>
+		   ? __subscript_read(__i)
+		   : second[__i - simd_size_v<_Tp, _Abi0>];
+#endif
+      }
+  }
+
+  void __set(size_t __i, _Tp __val) noexcept
+  {
+    if constexpr (_S_tuple_size == 1)
+      return __subscript_write(__i, __val);
+    else
+      {
+#ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
+	reinterpret_cast<__may_alias<_Tp>*>(this)[__i] = __val;
+#else
+	if (__i < simd_size_v<_Tp, _Abi0>)
+	  __subscript_write(__i, __val);
+	else
+	  second.__set(__i - simd_size_v<_Tp, _Abi0>, __val);
+#endif
+      }
+  }
+
+private:
+  // __subscript_read/_write {{{
+  _Tp __subscript_read([[maybe_unused]] size_t __i) const noexcept
+  {
+    if constexpr (__is_vectorizable_v<_FirstType>)
+      return first;
+    else
+      return first[__i];
+  }
+
+  void __subscript_write([[maybe_unused]] size_t __i, _Tp __y) noexcept
+  {
+    if constexpr (__is_vectorizable_v<_FirstType>)
+      first = __y;
+    else
+      first.__set(__i, __y);
+  }
+
+  // }}}
+};
+
+// __make_simd_tuple {{{1
+template <typename _Tp, typename _A0>
+_GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp, _A0>
+__make_simd_tuple(std::experimental::simd<_Tp, _A0> __x0)
+{
+  return {__data(__x0)};
+}
+template <typename _Tp, typename _A0, typename... _As>
+_GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp, _A0, _As...>
+__make_simd_tuple(const std::experimental::simd<_Tp, _A0>& __x0,
+		  const std::experimental::simd<_Tp, _As>&... __xs)
+{
+  return {__data(__x0), __make_simd_tuple(__xs...)};
+}
+
+template <typename _Tp, typename _A0>
+_GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp, _A0>
+__make_simd_tuple(const typename _SimdTraits<_Tp, _A0>::_SimdMember& __arg0)
+{
+  return {__arg0};
+}
+
+template <typename _Tp, typename _A0, typename _A1, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp, _A0, _A1, _Abis...>
+__make_simd_tuple(
+  const typename _SimdTraits<_Tp, _A0>::_SimdMember& __arg0,
+  const typename _SimdTraits<_Tp, _A1>::_SimdMember& __arg1,
+  const typename _SimdTraits<_Tp, _Abis>::_SimdMember&... __args)
+{
+  return {__arg0, __make_simd_tuple<_Tp, _A1, _Abis...>(__arg1, __args...)};
+}
+
+// __to_simd_tuple {{{1
+template <typename _Tp, size_t _Np, typename _V, size_t _NV, typename... _VX>
+_GLIBCXX_SIMD_INTRINSIC constexpr __fixed_size_storage_t<_Tp, _Np>
+__to_simd_tuple(const std::array<_V, _NV>& __from, const _VX... __fromX);
+
+template <typename _Tp, size_t _Np,
+	  size_t _Offset = 0, // skip this many elements in __from0
+	  typename _R = __fixed_size_storage_t<_Tp, _Np>, typename _V0,
+	  typename _V0VT = _VectorTraits<_V0>, typename... _VX>
+_GLIBCXX_SIMD_INTRINSIC _R constexpr __to_simd_tuple(const _V0 __from0,
+						     const _VX... __fromX)
+{
+  static_assert(std::is_same_v<typename _V0VT::value_type, _Tp>);
+  static_assert(_Offset < _V0VT::_S_width);
+  using _R0 = __vector_type_t<_Tp, _R::_S_first_size>;
+  if constexpr (_R::_S_tuple_size == 1)
+    {
+      if constexpr (_Np == 1)
+	return _R{__from0[_Offset]};
+      else if constexpr (_Offset == 0 && _V0VT::_S_width >= _Np)
+	return _R{__intrin_bitcast<_R0>(__from0)};
+      else if constexpr (_Offset * 2 == _V0VT::_S_width
+			 && _V0VT::_S_width / 2 >= _Np)
+	return _R{__intrin_bitcast<_R0>(__extract_part<1, 2>(__from0))};
+      else if constexpr (_Offset * 4 == _V0VT::_S_width
+			 && _V0VT::_S_width / 4 >= _Np)
+	return _R{__intrin_bitcast<_R0>(__extract_part<1, 4>(__from0))};
+      else
+	__assert_unreachable<_Tp>();
+    }
+  else
+    {
+      if constexpr (1 == _R::_S_first_size)
+	{ // extract one scalar and recurse
+	  if constexpr (_Offset + 1 < _V0VT::_S_width)
+	    return _R{__from0[_Offset],
+		      __to_simd_tuple<_Tp, _Np - 1, _Offset + 1>(__from0,
+								 __fromX...)};
+	  else
+	    return _R{__from0[_Offset],
+		      __to_simd_tuple<_Tp, _Np - 1, 0>(__fromX...)};
+	}
+
+      // place __from0 into _R::first and recurse for __fromX -> _R::second
+      else if constexpr (_V0VT::_S_width == _R::_S_first_size && _Offset == 0)
+	return _R{__from0,
+		  __to_simd_tuple<_Tp, _Np - _R::_S_first_size>(__fromX...)};
+
+      // place lower part of __from0 into _R::first and recurse with _Offset
+      else if constexpr (_V0VT::_S_width > _R::_S_first_size && _Offset == 0)
+	return _R{__intrin_bitcast<_R0>(__from0),
+		  __to_simd_tuple<_Tp, _Np - _R::_S_first_size,
+				  _R::_S_first_size>(__from0, __fromX...)};
+
+      // place lower part of second quarter of __from0 into _R::first and
+      // recurse with _Offset
+      else if constexpr (_Offset * 4 == _V0VT::_S_width
+			 && _V0VT::_S_width >= 4 * _R::_S_first_size)
+	return _R{__intrin_bitcast<_R0>(__extract_part<1, 4>(__from0)),
+		  __to_simd_tuple<_Tp, _Np - _R::_S_first_size,
+				  _Offset + _R::_S_first_size>(__from0,
+							       __fromX...)};
+
+      // place lower half of high half of __from0 into _R::first and recurse
+      // with _Offset
+      else if constexpr (_Offset * 2 == _V0VT::_S_width
+			 && _V0VT::_S_width >= 4 * _R::_S_first_size)
+	return _R{__intrin_bitcast<_R0>(__extract_part<2, 4>(__from0)),
+		  __to_simd_tuple<_Tp, _Np - _R::_S_first_size,
+				  _Offset + _R::_S_first_size>(__from0,
+							       __fromX...)};
+
+      // place high half of __from0 into _R::first and recurse with __fromX
+      else if constexpr (_Offset * 2 == _V0VT::_S_width
+			 && _V0VT::_S_width / 2 >= _R::_S_first_size)
+	return _R{__intrin_bitcast<_R0>(__extract_part<1, 2>(__from0)),
+		  __to_simd_tuple<_Tp, _Np - _R::_S_first_size, 0>(__fromX...)};
+
+      // ill-formed if some unforeseen pattern is needed
+      else
+	__assert_unreachable<_Tp>();
+    }
+}
+
+template <typename _Tp, size_t _Np, typename _V, size_t _NV, typename... _VX>
+_GLIBCXX_SIMD_INTRINSIC constexpr __fixed_size_storage_t<_Tp, _Np>
+__to_simd_tuple(const std::array<_V, _NV>& __from, const _VX... __fromX)
+{
+  if constexpr (std::is_same_v<_Tp, _V>)
+    {
+      static_assert(
+	sizeof...(_VX) == 0,
+	"An array of scalars must be the last argument to __to_simd_tuple");
+      return __call_with_subscripts(
+	__from,
+	std::make_index_sequence<_NV>(), [&](const auto... __args) constexpr {
+	  return __simd_tuple_concat(
+	    _SimdTuple<_Tp, simd_abi::scalar>{__args}..., _SimdTuple<_Tp>());
+	});
+    }
+  else
+    return __call_with_subscripts(
+      __from,
+      std::make_index_sequence<_NV>(), [&](const auto... __args) constexpr {
+	return __to_simd_tuple<_Tp, _Np>(__args..., __fromX...);
+      });
+}
+
+template <size_t, typename _Tp> using __to_tuple_helper = _Tp;
+template <typename _Tp, typename _A0, size_t _NOut, size_t _Np,
+	  size_t... _Indexes>
+_GLIBCXX_SIMD_INTRINSIC __fixed_size_storage_t<_Tp, _NOut>
+__to_simd_tuple_impl(
+  std::index_sequence<_Indexes...>,
+  const std::array<__vector_type_t<_Tp, simd_size_v<_Tp, _A0>>, _Np>& __args)
+{
+  return __make_simd_tuple<_Tp, __to_tuple_helper<_Indexes, _A0>...>(
+    __args[_Indexes]...);
+}
+
+template <typename _Tp, typename _A0, size_t _NOut, size_t _Np,
+	  typename _R = __fixed_size_storage_t<_Tp, _NOut>>
+_GLIBCXX_SIMD_INTRINSIC _R
+__to_simd_tuple_sized(
+  const std::array<__vector_type_t<_Tp, simd_size_v<_Tp, _A0>>, _Np>& __args)
+{
+  static_assert(_Np * simd_size_v<_Tp, _A0> >= _NOut);
+  return __to_simd_tuple_impl<_Tp, _A0, _NOut>(
+    std::make_index_sequence<_R::_S_tuple_size>(), __args);
+}
+
+template <typename _Tp, typename _A0, size_t _Np>
+[[deprecated]] _GLIBCXX_SIMD_INTRINSIC auto
+__to_simd_tuple(
+  const std::array<__vector_type_t<_Tp, simd_size_v<_Tp, _A0>>, _Np>& __args)
+{
+  return __to_simd_tuple<_Tp, _Np * simd_size_v<_Tp, _A0>>(__args);
+}
+
+// __optimize_simd_tuple {{{1
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp>
+__optimize_simd_tuple(const _SimdTuple<_Tp>)
+{
+  return {};
+}
+
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC const _SimdTuple<_Tp, _Ap>&
+__optimize_simd_tuple(const _SimdTuple<_Tp, _Ap>& __x)
+{
+  return __x;
+}
+
+template <typename _Tp, typename _A0, typename _A1, typename... _Abis,
+	  typename _R = __fixed_size_storage_t<
+	    _Tp, _SimdTuple<_Tp, _A0, _A1, _Abis...>::size()>>
+_GLIBCXX_SIMD_INTRINSIC _R
+__optimize_simd_tuple(const _SimdTuple<_Tp, _A0, _A1, _Abis...>& __x)
+{
+  using _Tup = _SimdTuple<_Tp, _A0, _A1, _Abis...>;
+  if constexpr (std::is_same_v<_R, _Tup>)
+    return __x;
+  else if constexpr (is_same_v<typename _R::_FirstType,
+			       typename _Tup::_FirstType>)
+    return {__x.first, __optimize_simd_tuple(__x.second)};
+  else if constexpr (__is_scalar_abi<_A0>()) // implies all entries are scalar
+    return {
+      __generate_from_n_evaluations<_R::_S_first_size, typename _R::_FirstType>(
+	[&](auto __i) { return __x[__i]; }),
+      __optimize_simd_tuple(__simd_tuple_pop_front<_R::_S_first_size>(__x))};
+  else if constexpr (_R::_S_first_size
+		       == simd_size_v<
+			    _Tp,
+			    _A0> + simd_size_v<_Tp, _A1> && is_same_v<_A0, _A1>)
+    return {__concat(__x.template __at<0>(), __x.template __at<1>()),
+	    __optimize_simd_tuple(__x.second.second)};
+  else if constexpr (
+    sizeof...(_Abis) >= 2
+    && _R::_S_first_size
+	 == 4
+	      * simd_size_v<
+		_Tp,
+		_A0> && simd_size_v<_Tp, _A0> == __simd_tuple_element_t<(sizeof...(_Abis) >= 2 ? 3 : 0), _Tup>::size())
+    return {__concat(__concat(__x.template __at<0>(), __x.template __at<1>()),
+		     __concat(__x.template __at<2>(), __x.template __at<3>())),
+	    __optimize_simd_tuple(__x.second.second.second.second)};
+  else
+    {
+      _R __r;
+      __builtin_memcpy(__r.__as_charptr(), __x.__as_charptr(),
+		       sizeof(_Tp) * _R::size());
+      return __r;
+    }
+}
+
+// __for_each(const _SimdTuple &, Fun) {{{1
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(const _SimdTuple<_Tp, _A0>& __t, _Fp&& __fun)
+{
+  static_cast<_Fp&&>(__fun)(__make_meta<_Offset>(__t), __t.first);
+}
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _A1,
+	  typename... _As, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(const _SimdTuple<_Tp, _A0, _A1, _As...>& __t, _Fp&& __fun)
+{
+  __fun(__make_meta<_Offset>(__t), __t.first);
+  __for_each<_Offset + simd_size<_Tp, _A0>::value>(__t.second,
+						   static_cast<_Fp&&>(__fun));
+}
+
+// __for_each(_SimdTuple &, Fun) {{{1
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(_SimdTuple<_Tp, _A0>& __t, _Fp&& __fun)
+{
+  static_cast<_Fp&&>(__fun)(__make_meta<_Offset>(__t), __t.first);
+}
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _A1,
+	  typename... _As, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(_SimdTuple<_Tp, _A0, _A1, _As...>& __t, _Fp&& __fun)
+{
+  __fun(__make_meta<_Offset>(__t), __t.first);
+  __for_each<_Offset + simd_size<_Tp, _A0>::value>(__t.second,
+						   static_cast<_Fp&&>(__fun));
+}
+
+// __for_each(_SimdTuple &, const _SimdTuple &, Fun) {{{1
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(_SimdTuple<_Tp, _A0>& __a, const _SimdTuple<_Tp, _A0>& __b,
+	   _Fp&& __fun)
+{
+  static_cast<_Fp&&>(__fun)(__make_meta<_Offset>(__a), __a.first, __b.first);
+}
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _A1,
+	  typename... _As, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(_SimdTuple<_Tp, _A0, _A1, _As...>& __a,
+	   const _SimdTuple<_Tp, _A0, _A1, _As...>& __b, _Fp&& __fun)
+{
+  __fun(__make_meta<_Offset>(__a), __a.first, __b.first);
+  __for_each<_Offset + simd_size<_Tp, _A0>::value>(__a.second, __b.second,
+						   static_cast<_Fp&&>(__fun));
+}
+
+// __for_each(const _SimdTuple &, const _SimdTuple &, Fun) {{{1
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(const _SimdTuple<_Tp, _A0>& __a, const _SimdTuple<_Tp, _A0>& __b,
+	   _Fp&& __fun)
+{
+  static_cast<_Fp&&>(__fun)(__make_meta<_Offset>(__a), __a.first, __b.first);
+}
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _A1,
+	  typename... _As, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(const _SimdTuple<_Tp, _A0, _A1, _As...>& __a,
+	   const _SimdTuple<_Tp, _A0, _A1, _As...>& __b, _Fp&& __fun)
+{
+  __fun(__make_meta<_Offset>(__a), __a.first, __b.first);
+  __for_each<_Offset + simd_size<_Tp, _A0>::value>(__a.second, __b.second,
+						   static_cast<_Fp&&>(__fun));
+}
+
+// }}}1
+// __extract_part(_SimdTuple) {{{
+template <int _Index, int _Total, int _Combine, typename _Tp, typename _A0,
+	  typename... _As>
+_GLIBCXX_SIMD_INTRINSIC auto // __vector_type_t or _SimdTuple
+__extract_part(const _SimdTuple<_Tp, _A0, _As...>& __x)
+{
+  // worst cases:
+  // (a) 4, 4, 4 => 3, 3, 3, 3 (_Total = 4)
+  // (b) 2, 2, 2 => 3, 3       (_Total = 2)
+  // (c) 4, 2 => 2, 2, 2       (_Total = 3)
+  using _Tuple = _SimdTuple<_Tp, _A0, _As...>;
+  static_assert(_Index + _Combine <= _Total && _Index >= 0 && _Total >= 1);
+  constexpr size_t _Np = _Tuple::size();
+  static_assert(_Np >= _Total && _Np % _Total == 0);
+  constexpr size_t __values_per_part = _Np / _Total;
+  [[maybe_unused]] constexpr size_t __values_to_skip
+    = _Index * __values_per_part;
+  constexpr size_t __return_size = __values_per_part * _Combine;
+  using _RetAbi = simd_abi::deduce_t<_Tp, __return_size>;
+
+  // handle (optimize) the simple cases
+  if constexpr (_Index == 0 && _Tuple::_S_first_size == __return_size)
+    return __x.first._M_data;
+  else if constexpr (_Index == 0 && _Total == _Combine)
+    return __x;
+  else if constexpr (_Index == 0 && _Tuple::_S_first_size >= __return_size)
+    return __intrin_bitcast<__vector_type_t<_Tp, __return_size>>(
+      __as_vector(__x.first));
+
+  // recurse to skip unused data members at the beginning of _SimdTuple
+  else if constexpr (__values_to_skip >= _Tuple::_S_first_size)
+    { // recurse
+      if constexpr (_Tuple::_S_first_size % __values_per_part == 0)
+	{
+	  constexpr int __parts_in_first
+	    = _Tuple::_S_first_size / __values_per_part;
+	  return __extract_part<_Index - __parts_in_first,
+				_Total - __parts_in_first, _Combine>(
+	    __x.second);
+	}
+      else
+	return __extract_part<__values_to_skip - _Tuple::_S_first_size,
+			      _Np - _Tuple::_S_first_size, __return_size>(
+	  __x.second);
+    }
+
+  // extract from multiple _SimdTuple data members
+  else if constexpr (__return_size > _Tuple::_S_first_size - __values_to_skip)
+    {
+#ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
+      const __may_alias<_Tp>* const __element_ptr
+	= reinterpret_cast<const __may_alias<_Tp>*>(&__x) + __values_to_skip;
+      return __as_vector(simd<_Tp, _RetAbi>(__element_ptr, element_aligned));
+#else
+      [[maybe_unused]] constexpr size_t __offset = __values_to_skip;
+      return __as_vector(simd<_Tp, _RetAbi>([&](auto __i) constexpr {
+	constexpr _SizeConstant<__i + __offset> __k;
+	return __x[__k];
+      }));
+#endif
+    }
+
+  // all of the return values are in __x.first
+  else if constexpr (_Tuple::_S_first_size % __values_per_part == 0)
+    return __extract_part<_Index, _Tuple::_S_first_size / __values_per_part,
+			  _Combine>(__x.first);
+  else
+    return __extract_part<__values_to_skip, _Tuple::_S_first_size,
+			  _Combine * __values_per_part>(__x.first);
+}
+
+// }}}
+// __fixed_size_storage_t<_Tp, _Np>{{{
+template <typename _Tp, int _Np, typename _Tuple,
+	  typename _Next = simd<_Tp, _AllNativeAbis::_BestAbi<_Tp, _Np>>,
+	  int _Remain = _Np - int(_Next::size())>
+struct __fixed_size_storage_builder;
+
+template <typename _Tp, int _Np>
+struct __fixed_size_storage
+  : public __fixed_size_storage_builder<_Tp, _Np, _SimdTuple<_Tp>>
+{
+};
+
+template <typename _Tp, int _Np, typename... _As, typename _Next>
+struct __fixed_size_storage_builder<_Tp, _Np, _SimdTuple<_Tp, _As...>, _Next, 0>
+{
+  using type = _SimdTuple<_Tp, _As..., typename _Next::abi_type>;
+};
+
+template <typename _Tp, int _Np, typename... _As, typename _Next, int _Remain>
+struct __fixed_size_storage_builder<_Tp, _Np, _SimdTuple<_Tp, _As...>, _Next,
+				    _Remain>
+{
+  using type = typename __fixed_size_storage_builder<
+    _Tp, _Remain, _SimdTuple<_Tp, _As..., typename _Next::abi_type>>::type;
+};
+
+// }}}
+// _AbisInSimdTuple {{{
+template <typename _Tp> struct _SeqOp;
+template <size_t _I0, size_t... _Is>
+struct _SeqOp<std::index_sequence<_I0, _Is...>>
+{
+  using _FirstPlusOne = std::index_sequence<_I0 + 1, _Is...>;
+  using _NotFirstPlusOne = std::index_sequence<_I0, (_Is + 1)...>;
+  template <size_t _First, size_t _Add>
+  using _Prepend = std::index_sequence<_First, _I0 + _Add, (_Is + _Add)...>;
+};
+
+template <typename _Tp> struct _AbisInSimdTuple;
+template <typename _Tp> struct _AbisInSimdTuple<_SimdTuple<_Tp>>
+{
+  using _Counts = std::index_sequence<0>;
+  using _Begins = std::index_sequence<0>;
+};
+template <typename _Tp, typename _Ap>
+struct _AbisInSimdTuple<_SimdTuple<_Tp, _Ap>>
+{
+  using _Counts = std::index_sequence<1>;
+  using _Begins = std::index_sequence<0>;
+};
+template <typename _Tp, typename _A0, typename... _As>
+struct _AbisInSimdTuple<_SimdTuple<_Tp, _A0, _A0, _As...>>
+{
+  using _Counts = typename _SeqOp<typename _AbisInSimdTuple<
+    _SimdTuple<_Tp, _A0, _As...>>::_Counts>::_FirstPlusOne;
+  using _Begins = typename _SeqOp<typename _AbisInSimdTuple<
+    _SimdTuple<_Tp, _A0, _As...>>::_Begins>::_NotFirstPlusOne;
+};
+template <typename _Tp, typename _A0, typename _A1, typename... _As>
+struct _AbisInSimdTuple<_SimdTuple<_Tp, _A0, _A1, _As...>>
+{
+  using _Counts = typename _SeqOp<typename _AbisInSimdTuple<
+    _SimdTuple<_Tp, _A1, _As...>>::_Counts>::template _Prepend<1, 0>;
+  using _Begins = typename _SeqOp<typename _AbisInSimdTuple<
+    _SimdTuple<_Tp, _A1, _As...>>::_Begins>::template _Prepend<0, 1>;
+};
+
+// }}}
+// __autocvt_to_simd {{{
+template <typename _Tp, bool = std::is_arithmetic_v<__remove_cvref_t<_Tp>>>
+struct __autocvt_to_simd
+{
+  _Tp _M_data;
+  using _TT = __remove_cvref_t<_Tp>;
+  operator _TT() { return _M_data; }
+  operator _TT&()
+  {
+    static_assert(std::is_lvalue_reference<_Tp>::value, "");
+    static_assert(!std::is_const<_Tp>::value, "");
+    return _M_data;
+  }
+  operator _TT*()
+  {
+    static_assert(std::is_lvalue_reference<_Tp>::value, "");
+    static_assert(!std::is_const<_Tp>::value, "");
+    return &_M_data;
+  }
+
+  constexpr inline __autocvt_to_simd(_Tp __dd) : _M_data(__dd) {}
+
+  template <typename _Abi> operator simd<typename _TT::value_type, _Abi>()
+  {
+    return {__private_init, _M_data};
+  }
+
+  template <typename _Abi> operator simd<typename _TT::value_type, _Abi> &()
+  {
+    return *reinterpret_cast<simd<typename _TT::value_type, _Abi>*>(&_M_data);
+  }
+
+  template <typename _Abi> operator simd<typename _TT::value_type, _Abi> *()
+  {
+    return reinterpret_cast<simd<typename _TT::value_type, _Abi>*>(&_M_data);
+  }
+};
+template <typename _Tp> __autocvt_to_simd(_Tp &&) -> __autocvt_to_simd<_Tp>;
+
+template <typename _Tp> struct __autocvt_to_simd<_Tp, true>
+{
+  using _TT = __remove_cvref_t<_Tp>;
+  _Tp _M_data;
+  fixed_size_simd<_TT, 1> _M_fd;
+
+  constexpr inline __autocvt_to_simd(_Tp dd) : _M_data(dd), _M_fd(_M_data) {}
+  ~__autocvt_to_simd() { _M_data = __data(_M_fd).first; }
+
+  operator fixed_size_simd<_TT, 1>() { return _M_fd; }
+  operator fixed_size_simd<_TT, 1> &()
+  {
+    static_assert(std::is_lvalue_reference<_Tp>::value, "");
+    static_assert(!std::is_const<_Tp>::value, "");
+    return _M_fd;
+  }
+  operator fixed_size_simd<_TT, 1> *()
+  {
+    static_assert(std::is_lvalue_reference<_Tp>::value, "");
+    static_assert(!std::is_const<_Tp>::value, "");
+    return &_M_fd;
+  }
+};
+
+// }}}
+
+struct _CommonImplFixedSize;
+template <int _Np> struct _SimdImplFixedSize;
+template <int _Np> struct _MaskImplFixedSize;
+// simd_abi::_Fixed {{{
+template <int _Np> struct simd_abi::_Fixed
+{
+  template <typename _Tp> static constexpr size_t size = _Np;
+  template <typename _Tp> static constexpr size_t _S_full_size = _Np;
+  // validity traits {{{
+  struct _IsValidAbiTag : public __bool_constant<(_Np > 0)>
+  {
+  };
+  template <typename _Tp>
+  struct _IsValidSizeFor
+    : __bool_constant<(_Np <= simd_abi::max_fixed_size<_Tp>)>
+  {
+  };
+  template <typename _Tp>
+  struct _IsValid
+    : conjunction<_IsValidAbiTag, __is_vectorizable<_Tp>, _IsValidSizeFor<_Tp>>
+  {
+  };
+  template <typename _Tp>
+  static constexpr bool _S_is_valid_v = _IsValid<_Tp>::value;
+
+  // }}}
+  // __masked {{{
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SanitizedBitMask<_Np>
+  __masked(_BitMask<_Np> __x)
+  {
+    return __x._M_sanitized();
+  }
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SanitizedBitMask<_Np>
+  __masked(_SanitizedBitMask<_Np> __x)
+  {
+    return __x;
+  }
+
+  // }}}
+  // _*Impl {{{
+  using _CommonImpl = _CommonImplFixedSize;
+  using _SimdImpl = _SimdImplFixedSize<_Np>;
+  using _MaskImpl = _MaskImplFixedSize<_Np>;
+
+  // }}}
+  // __traits {{{
+  template <typename _Tp, bool = _S_is_valid_v<_Tp>>
+  struct __traits : _InvalidTraits
+  {
+  };
+
+  template <typename _Tp> struct __traits<_Tp, true>
+  {
+    using _IsValid = true_type;
+    using _SimdImpl = _SimdImplFixedSize<_Np>;
+    using _MaskImpl = _MaskImplFixedSize<_Np>;
+
+    // simd and simd_mask member types {{{
+    using _SimdMember = __fixed_size_storage_t<_Tp, _Np>;
+    using _MaskMember = _SanitizedBitMask<_Np>;
+    static constexpr size_t _S_simd_align
+      = __next_power_of_2(_Np * sizeof(_Tp));
+    static constexpr size_t _S_mask_align = alignof(_MaskMember);
+
+    // }}}
+    // _SimdBase / base class for simd, providing extra conversions {{{
+    struct _SimdBase
+    {
+      // The following ensures that function arguments are passed via the stack.
+      // This is important for ABI compatibility across TU boundaries.
+      _SimdBase(const _SimdBase&) {}
+      _SimdBase() = default;
+
+      explicit operator const _SimdMember &() const
+      {
+	return static_cast<const simd<_Tp, _Fixed>*>(this)->_M_data;
+      }
+      explicit operator std::array<_Tp, _Np>() const
+      {
+	std::array<_Tp, _Np> __r;
+	// _SimdMember can be larger because of higher alignment
+	static_assert(sizeof(__r) <= sizeof(_SimdMember), "");
+	__builtin_memcpy(__r.data(), &static_cast<const _SimdMember&>(*this),
+			 sizeof(__r));
+	return __r;
+      }
+    };
+
+    // }}}
+    // _MaskBase {{{
+    // empty. The std::bitset interface suffices
+    struct _MaskBase
+    {
+    };
+
+    // }}}
+    // _SimdCastType {{{
+    struct _SimdCastType
+    {
+      _SimdCastType(const std::array<_Tp, _Np>&);
+      _SimdCastType(const _SimdMember& __dd) : _M_data(__dd) {}
+      explicit operator const _SimdMember &() const { return _M_data; }
+
+    private:
+      const _SimdMember& _M_data;
+    };
+
+    // }}}
+    // _MaskCastType {{{
+    class _MaskCastType
+    {
+      _MaskCastType() = delete;
+    };
+    // }}}
+  };
+  // }}}
+};
+
+// }}}
+// _CommonImplFixedSize {{{
+struct _CommonImplFixedSize
+{
+  // __store {{{
+  template <typename _Flags, typename _Tp, typename... _As>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __store(const _SimdTuple<_Tp, _As...>& __x, void* __addr, _Flags)
+  {
+    constexpr size_t _Np = _SimdTuple<_Tp, _As...>::size();
+    if constexpr (std::is_same_v<_Flags, vector_aligned_tag>)
+      __addr = __builtin_assume_aligned(
+	__addr, memory_alignment_v<fixed_size_simd<_Tp, _Np>, _Tp>);
+    else if constexpr (!std::is_same_v<_Flags, element_aligned_tag>)
+      __addr = __builtin_assume_aligned(__addr, _Flags::_S_alignment);
+    __builtin_memcpy(__addr, &__x, _Np * sizeof(_Tp));
+  }
+
+  // }}}
+};
+
+// }}}
+// _SimdImplFixedSize {{{1
+// fixed_size should not inherit from _SimdMathFallback in order for
+// specializations in the used _SimdTuple Abis to get used
+template <int _Np> struct _SimdImplFixedSize
+{
+  // member types {{{2
+  using _MaskMember = _SanitizedBitMask<_Np>;
+  template <typename _Tp> using _SimdMember = __fixed_size_storage_t<_Tp, _Np>;
+  template <typename _Tp>
+  static constexpr std::size_t _S_tuple_size = _SimdMember<_Tp>::_S_tuple_size;
+  template <typename _Tp>
+  using _Simd = std::experimental::simd<_Tp, simd_abi::fixed_size<_Np>>;
+  template <typename _Tp> using _TypeTag = _Tp*;
+
+  // broadcast {{{2
+  template <typename _Tp>
+  static constexpr inline _SimdMember<_Tp> __broadcast(_Tp __x) noexcept
+  {
+    return _SimdMember<_Tp>::__generate([&](auto __meta) constexpr {
+      return __meta.__broadcast(__x);
+    });
+  }
+
+  // __generator {{{2
+  template <typename _Fp, typename _Tp>
+  static constexpr inline _SimdMember<_Tp> __generator(_Fp&& __gen,
+						       _TypeTag<_Tp>)
+  {
+    return _SimdMember<_Tp>::__generate([&__gen](auto __meta) constexpr {
+      return __meta.__generator(
+	[&](auto __i) constexpr {
+	  return __i < _Np ? __gen(_SizeConstant<__meta._S_offset + __i>()) : 0;
+	},
+	_TypeTag<_Tp>());
+    });
+  }
+
+  // __load {{{2
+  template <typename _Tp, typename _Up, typename _Fp>
+  static inline _SimdMember<_Tp> __load(const _Up* __mem, _Fp __f,
+					_TypeTag<_Tp>) noexcept
+  {
+    return _SimdMember<_Tp>::__generate([&](auto __meta) {
+      return __meta.__load(&__mem[__meta._S_offset], __f, _TypeTag<_Tp>());
+    });
+  }
+
+  // __masked_load {{{2
+  template <typename _Tp, typename... _As, typename _Up, typename _Fp>
+  static inline _SimdTuple<_Tp, _As...>
+  __masked_load(const _SimdTuple<_Tp, _As...>& __old, const _MaskMember __bits,
+		const _Up* __mem, _Fp __f) noexcept
+  {
+    auto __merge = __old;
+    __for_each(__merge, [&](auto __meta, auto& __native) {
+      if (__meta.__submask(__bits).any())
+#pragma GCC diagnostic push
+      // __mem + __meta._S_offset could be UB ([expr.add]/4.3), but that punts
+      // the responsibility for avoiding UB to the caller of the masked load via
+      // the mask. Consequently, the compiler may assume this branch is
+      // unreachable if the pointer arithmetic is UB.
+#pragma GCC diagnostic ignored "-Warray-bounds"
+	__native = __meta.__masked_load(__native, __meta.__make_mask(__bits),
+					__mem + __meta._S_offset, __f);
+#pragma GCC diagnostic pop
+    });
+    return __merge;
+  }
+
+  // __store {{{2
+  template <typename _Tp, typename _Up, typename _Fp>
+  static inline void __store(const _SimdMember<_Tp>& __v, _Up* __mem, _Fp __f,
+			     _TypeTag<_Tp>) noexcept
+  {
+    __for_each(__v, [&](auto __meta, auto __native) {
+      __meta.__store(__native, &__mem[__meta._S_offset], __f, _TypeTag<_Tp>());
+    });
+  }
+
+  // __masked_store {{{2
+  template <typename _Tp, typename... _As, typename _Up, typename _Fp>
+  static inline void __masked_store(const _SimdTuple<_Tp, _As...>& __v,
+				    _Up* __mem, _Fp __f,
+				    const _MaskMember __bits) noexcept
+  {
+    __for_each(__v, [&](auto __meta, auto __native) {
+      if (__meta.__submask(__bits).any())
+#pragma GCC diagnostic push
+      // __mem + __meta._S_offset could be UB ([expr.add]/4.3), but that punts
+      // the responsibility for avoiding UB to the caller of the masked store
+      // via the mask. Consequently, the compiler may assume this branch is
+      // unreachable if the pointer arithmetic is UB.
+#pragma GCC diagnostic ignored "-Warray-bounds"
+	__meta.__masked_store(__native, __mem + __meta._S_offset, __f,
+			      __meta.__make_mask(__bits));
+#pragma GCC diagnostic pop
+    });
+  }
+
+  // negation {{{2
+  template <typename _Tp, typename... _As>
+  static inline _MaskMember
+  __negate(const _SimdTuple<_Tp, _As...>& __x) noexcept
+  {
+    _MaskMember __bits = 0;
+    __for_each(
+      __x, [&__bits](auto __meta, auto __native) constexpr {
+	__bits |= __meta.__mask_to_shifted_ullong(__meta.__negate(__native));
+      });
+    return __bits;
+  }
+
+  // reductions {{{2
+  template <typename _Tp, typename _BinaryOperation>
+  static constexpr inline _Tp __reduce(const _Simd<_Tp>& __x,
+				       const _BinaryOperation& __binary_op)
+  {
+    using _Tup = _SimdMember<_Tp>;
+    const _Tup& __tup = __data(__x);
+    if constexpr (_Tup::_S_tuple_size == 1)
+      return _Tup::_FirstAbi::_SimdImpl::__reduce(__tup.template __simd_at<0>(),
+						  __binary_op);
+    else if constexpr (_Tup::_S_tuple_size == 2
+		       && _Tup::size() > 2
+		       && _Tup::_SecondType::size() == 1)
+      {
+	return __binary_op(simd<_Tp, simd_abi::scalar>(
+			     reduce(__tup.template __simd_at<0>(),
+				    __binary_op)),
+			   __tup.template __simd_at<1>())[0];
+      }
+    else if constexpr (_Tup::_S_tuple_size == 2
+		       && _Tup::size() > 4
+		       && _Tup::_SecondType::size() == 2)
+      {
+	return __binary_op(
+	  simd<_Tp, simd_abi::scalar>(
+	    reduce(__tup.template __simd_at<0>(), __binary_op)),
+	  simd<_Tp, simd_abi::scalar>(
+	    reduce(__tup.template __simd_at<1>(), __binary_op)))[0];
+      }
+    else
+      {
+	const auto& __x2
+	  = __call_with_n_evaluations<__div_roundup(_Tup::_S_tuple_size, 2)>(
+	    [](auto __first_simd, auto... __remaining) {
+	      if constexpr (sizeof...(__remaining) == 0)
+		return __first_simd;
+	      else
+		{
+		  using _Tup2
+		    = _SimdTuple<_Tp, typename decltype(__first_simd)::abi_type,
+				 typename decltype(__remaining)::abi_type...>;
+		  return fixed_size_simd<_Tp, _Tup2::size()>(
+		    __private_init,
+		    __make_simd_tuple(__first_simd, __remaining...));
+		}
+	    },
+	    [&](auto __i) {
+	      auto __left = __tup.template __simd_at<2 * __i>();
+	      if constexpr (2 * __i + 1 == _Tup::_S_tuple_size)
+		return __left;
+	      else
+		{
+		  auto __right = __tup.template __simd_at<2 * __i + 1>();
+		  using _LT = decltype(__left);
+		  using _RT = decltype(__right);
+		  if constexpr (_LT::size() == _RT::size())
+		    return __binary_op(__left, __right);
+		  else
+		    {
+		      _GLIBCXX_SIMD_CONSTEXPR typename _LT::mask_type __k(
+			  __private_init, [](auto __j) constexpr {
+			    return __j < _RT::size();
+			  });
+		      _LT __ext_right = __left;
+		      where(__k, __ext_right)
+			= __proposed::resizing_simd_cast<_LT>(__right);
+		      where(__k, __left) = __binary_op(__left, __ext_right);
+		      return __left;
+		    }
+		}
+	    });
+	return reduce(__x2, __binary_op);
+      }
+  }
+
+  // __min, __max {{{2
+  template <typename _Tp, typename... _As>
+  static inline constexpr _SimdTuple<_Tp, _As...>
+  __min(const _SimdTuple<_Tp, _As...>& __a, const _SimdTuple<_Tp, _As...>& __b)
+  {
+    return __a.__apply_per_chunk(
+      [](auto __impl, auto __aa, auto __bb) constexpr {
+	return __impl.__min(__aa, __bb);
+      },
+      __b);
+  }
+
+  template <typename _Tp, typename... _As>
+  static inline constexpr _SimdTuple<_Tp, _As...>
+  __max(const _SimdTuple<_Tp, _As...>& __a, const _SimdTuple<_Tp, _As...>& __b)
+  {
+    return __a.__apply_per_chunk(
+      [](auto __impl, auto __aa, auto __bb) constexpr {
+	return __impl.__max(__aa, __bb);
+      },
+      __b);
+  }
+
+  // __complement {{{2
+  template <typename _Tp, typename... _As>
+  static inline constexpr _SimdTuple<_Tp, _As...>
+  __complement(const _SimdTuple<_Tp, _As...>& __x) noexcept
+  {
+    return __x.__apply_per_chunk([](auto __impl, auto __xx) constexpr {
+      return __impl.__complement(__xx);
+    });
+  }
+
+  // __unary_minus {{{2
+  template <typename _Tp, typename... _As>
+  static inline constexpr _SimdTuple<_Tp, _As...>
+  __unary_minus(const _SimdTuple<_Tp, _As...>& __x) noexcept
+  {
+    return __x.__apply_per_chunk([](auto __impl, auto __xx) constexpr {
+      return __impl.__unary_minus(__xx);
+    });
+  }
+
+  // arithmetic operators {{{2
+
+#define _GLIBCXX_SIMD_FIXED_OP(name_, op_)                                     \
+  template <typename _Tp, typename... _As>                                     \
+  static inline constexpr _SimdTuple<_Tp, _As...> name_(                       \
+    const _SimdTuple<_Tp, _As...> __x, const _SimdTuple<_Tp, _As...> __y)      \
+  {                                                                            \
+    return __x.__apply_per_chunk(                                              \
+      [](auto __impl, auto __xx, auto __yy) constexpr {                        \
+	return __impl.name_(__xx, __yy);                                       \
+      },                                                                       \
+      __y);                                                                    \
+  }
+
+  _GLIBCXX_SIMD_FIXED_OP(__plus, +)
+  _GLIBCXX_SIMD_FIXED_OP(__minus, -)
+  _GLIBCXX_SIMD_FIXED_OP(__multiplies, *)
+  _GLIBCXX_SIMD_FIXED_OP(__divides, /)
+  _GLIBCXX_SIMD_FIXED_OP(__modulus, %)
+  _GLIBCXX_SIMD_FIXED_OP(__bit_and, &)
+  _GLIBCXX_SIMD_FIXED_OP(__bit_or, |)
+  _GLIBCXX_SIMD_FIXED_OP(__bit_xor, ^)
+  _GLIBCXX_SIMD_FIXED_OP(__bit_shift_left, <<)
+  _GLIBCXX_SIMD_FIXED_OP(__bit_shift_right, >>)
+#undef _GLIBCXX_SIMD_FIXED_OP
+
+  template <typename _Tp, typename... _As>
+  static inline constexpr _SimdTuple<_Tp, _As...>
+  __bit_shift_left(const _SimdTuple<_Tp, _As...>& __x, int __y)
+  {
+    return __x.__apply_per_chunk([__y](auto __impl, auto __xx) constexpr {
+      return __impl.__bit_shift_left(__xx, __y);
+    });
+  }
+
+  template <typename _Tp, typename... _As>
+  static inline constexpr _SimdTuple<_Tp, _As...>
+  __bit_shift_right(const _SimdTuple<_Tp, _As...>& __x, int __y)
+  {
+    return __x.__apply_per_chunk([__y](auto __impl, auto __xx) constexpr {
+      return __impl.__bit_shift_right(__xx, __y);
+    });
+  }
+
+  // math {{{2
+#define _GLIBCXX_SIMD_APPLY_ON_TUPLE(_RetTp, __name)                           \
+  template <typename _Tp, typename... _As, typename... _More>                  \
+  static inline __fixed_size_storage_t<_RetTp,                                 \
+				       _SimdTuple<_Tp, _As...>::size()>        \
+    __##__name(const _SimdTuple<_Tp, _As...>& __x, const _More&... __more)     \
+  {                                                                            \
+    if constexpr (sizeof...(_More) == 0)                                       \
+      {                                                                        \
+	if constexpr (is_same_v<_Tp, _RetTp>)                                  \
+	  return __x.__apply_per_chunk([](auto __impl, auto __xx) constexpr {  \
+	    using _V = typename decltype(__impl)::simd_type;                   \
+	    return __data(__name(_V(__private_init, __xx)));                   \
+	  });                                                                  \
+	else                                                                   \
+	  return __optimize_simd_tuple(__x.template __apply_r<_RetTp>(         \
+	    [](auto __impl, auto __xx) { return __impl.__##__name(__xx); }));  \
+      }                                                                        \
+    else if constexpr (                                                        \
+      is_same_v<                                                               \
+	_Tp,                                                                   \
+	_RetTp> && (... && std::is_same_v<_SimdTuple<_Tp, _As...>, _More>) )   \
+      return __x.__apply_per_chunk(                                            \
+	[](auto __impl, auto __xx, auto... __pack) constexpr {                 \
+	  using _V = typename decltype(__impl)::simd_type;                     \
+	  return __data(                                                       \
+	    __name(_V(__private_init, __xx), _V(__private_init, __pack)...));  \
+	},                                                                     \
+	__more...);                                                            \
+    else if constexpr (is_same_v<_Tp, _RetTp>)                                 \
+      return __x.__apply_per_chunk(                                            \
+	[](auto __impl, auto __xx, auto... __pack) constexpr {                 \
+	  using _V = typename decltype(__impl)::simd_type;                     \
+	  return __data(                                                       \
+	    __name(_V(__private_init, __xx), __autocvt_to_simd(__pack)...));   \
+	},                                                                     \
+	__more...);                                                            \
+    else                                                                       \
+      __assert_unreachable<_Tp>();                                             \
+  }
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, acos)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, asin)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, atan)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, atan2)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, cos)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, sin)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, tan)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, acosh)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, asinh)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, atanh)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, cosh)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, sinh)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, tanh)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, exp)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, exp2)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, expm1)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(int, ilogb)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, log)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, log10)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, log1p)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, log2)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, logb)
+  // modf implemented in simd_math.h
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, scalbn) // double scalbn(double x, int exp);
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, scalbln)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, cbrt)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, abs)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fabs)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, pow)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, sqrt)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, erf)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, erfc)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, lgamma)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, tgamma)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, trunc)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, ceil)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, floor)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, nearbyint)
+
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, rint)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(long, lrint)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(long long, llrint)
+
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, round)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(long, lround)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(long long, llround)
+
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, ldexp)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fmod)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, remainder)
+  // copysign in simd_math.h
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, nextafter)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fdim)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fmax)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fmin)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fma)
+  _GLIBCXX_SIMD_APPLY_ON_TUPLE(int, fpclassify)
+#undef _GLIBCXX_SIMD_APPLY_ON_TUPLE
+
+  template <typename _Tp, typename... _Abis>
+  static _SimdTuple<_Tp, _Abis...>
+  __remquo(const _SimdTuple<_Tp, _Abis...>& __x,
+	   const _SimdTuple<_Tp, _Abis...>& __y,
+	   __fixed_size_storage_t<int, _SimdTuple<_Tp, _Abis...>::size()>* __z)
+  {
+    return __x.__apply_per_chunk(
+      [](auto __impl, const auto __xx, const auto __yy, auto& __zz) {
+	return __impl.__remquo(__xx, __yy, &__zz);
+      },
+      __y, *__z);
+  }
+
+  template <typename _Tp, typename... _As>
+  static inline _SimdTuple<_Tp, _As...>
+  __frexp(const _SimdTuple<_Tp, _As...>& __x,
+	  __fixed_size_storage_t<int, _Np>& __exp) noexcept
+  {
+    return __x.__apply_per_chunk(
+      [](auto __impl, const auto& __a, auto& __b) {
+	return __data(
+	  frexp(typename decltype(__impl)::simd_type(__private_init, __a),
+		__autocvt_to_simd(__b)));
+      },
+      __exp);
+  }
+
+  template <typename _Tp, typename... _As>
+  static inline __fixed_size_storage_t<int, _Np>
+  __fpclassify(const _SimdTuple<_Tp, _As...>& __x) noexcept
+  {
+    return __optimize_simd_tuple(__x.template __apply_r<int>(
+      [](auto __impl, auto __xx) { return __impl.__fpclassify(__xx); }));
+  }
+
+#define _GLIBCXX_SIMD_TEST_ON_TUPLE_(name_)                                    \
+  template <typename _Tp, typename... _As>                                     \
+  static inline _MaskMember __##name_(                                         \
+    const _SimdTuple<_Tp, _As...>& __x) noexcept                               \
+  {                                                                            \
+    return __test([](auto __impl,                                              \
+		     auto __xx) { return __impl.__##name_(__xx); },            \
+		  __x);                                                        \
+  }
+  _GLIBCXX_SIMD_TEST_ON_TUPLE_(isinf)
+  _GLIBCXX_SIMD_TEST_ON_TUPLE_(isfinite)
+  _GLIBCXX_SIMD_TEST_ON_TUPLE_(isnan)
+  _GLIBCXX_SIMD_TEST_ON_TUPLE_(isnormal)
+  _GLIBCXX_SIMD_TEST_ON_TUPLE_(signbit)
+#undef _GLIBCXX_SIMD_TEST_ON_TUPLE_
+
+  // __increment & __decrement {{{2
+  template <typename... _Ts>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr void
+  __increment(_SimdTuple<_Ts...>& __x)
+  {
+    __for_each(__x, [](auto __meta, auto& __native) constexpr {
+      __meta.__increment(__native);
+    });
+  }
+
+  template <typename... _Ts>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr void
+  __decrement(_SimdTuple<_Ts...>& __x)
+  {
+    __for_each(__x, [](auto __meta, auto& __native) constexpr {
+      __meta.__decrement(__native);
+    });
+  }
+
+  // compares {{{2
+#define _GLIBCXX_SIMD_CMP_OPERATIONS(__cmp)                                    \
+  template <typename _Tp, typename... _As>                                     \
+  _GLIBCXX_SIMD_INTRINSIC constexpr static _MaskMember __cmp(                  \
+    const _SimdTuple<_Tp, _As...>& __x, const _SimdTuple<_Tp, _As...>& __y)    \
+  {                                                                            \
+    return __test(                                                             \
+      [](auto __impl, auto __xx, auto __yy) constexpr {                        \
+	return __impl.__cmp(__xx, __yy);                                       \
+      },                                                                       \
+      __x, __y);                                                               \
+  }
+  _GLIBCXX_SIMD_CMP_OPERATIONS(__equal_to)
+  _GLIBCXX_SIMD_CMP_OPERATIONS(__not_equal_to)
+  _GLIBCXX_SIMD_CMP_OPERATIONS(__less)
+  _GLIBCXX_SIMD_CMP_OPERATIONS(__less_equal)
+  _GLIBCXX_SIMD_CMP_OPERATIONS(__isless)
+  _GLIBCXX_SIMD_CMP_OPERATIONS(__islessequal)
+  _GLIBCXX_SIMD_CMP_OPERATIONS(__isgreater)
+  _GLIBCXX_SIMD_CMP_OPERATIONS(__isgreaterequal)
+  _GLIBCXX_SIMD_CMP_OPERATIONS(__islessgreater)
+  _GLIBCXX_SIMD_CMP_OPERATIONS(__isunordered)
+#undef _GLIBCXX_SIMD_CMP_OPERATIONS
+
+  // smart_reference access {{{2
+  template <typename _Tp, typename... _As, typename _Up>
+  _GLIBCXX_SIMD_INTRINSIC static void __set(_SimdTuple<_Tp, _As...>& __v,
+					    int __i, _Up&& __x) noexcept
+  {
+    __v.__set(__i, static_cast<_Up&&>(__x));
+  }
+
+  // __masked_assign {{{2
+  template <typename _Tp, typename... _As>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_assign(const _MaskMember __bits, _SimdTuple<_Tp, _As...>& __lhs,
+		  const __id<_SimdTuple<_Tp, _As...>>& __rhs)
+  {
+    __for_each(
+      __lhs,
+      __rhs, [&](auto __meta, auto& __native_lhs, auto __native_rhs) constexpr {
+	__meta.__masked_assign(__meta.__make_mask(__bits), __native_lhs,
+			       __native_rhs);
+      });
+  }
+
+  // Optimization for the case where the RHS is a scalar. No need to broadcast
+  // the scalar to a simd first.
+  template <typename _Tp, typename... _As>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_assign(const _MaskMember __bits, _SimdTuple<_Tp, _As...>& __lhs,
+		  const __id<_Tp> __rhs)
+  {
+    __for_each(
+      __lhs, [&](auto __meta, auto& __native_lhs) constexpr {
+	__meta.__masked_assign(__meta.__make_mask(__bits), __native_lhs, __rhs);
+      });
+  }
+
+  // __masked_cassign {{{2
+  template <typename _Op, typename _Tp, typename... _As>
+  static inline void
+  __masked_cassign(const _MaskMember __bits, _SimdTuple<_Tp, _As...>& __lhs,
+		   const _SimdTuple<_Tp, _As...>& __rhs, _Op __op)
+  {
+    __for_each(
+      __lhs,
+      __rhs, [&](auto __meta, auto& __native_lhs, auto __native_rhs) constexpr {
+	__meta.template __masked_cassign(__meta.__make_mask(__bits),
+					 __native_lhs, __native_rhs, __op);
+      });
+  }
+
+  // Optimization for the case where the RHS is a scalar. No need to broadcast
+  // the scalar to a simd first.
+  template <typename _Op, typename _Tp, typename... _As>
+  static inline void __masked_cassign(const _MaskMember __bits,
+				      _SimdTuple<_Tp, _As...>& __lhs,
+				      const _Tp& __rhs, _Op __op)
+  {
+    __for_each(
+      __lhs, [&](auto __meta, auto& __native_lhs) constexpr {
+	__meta.template __masked_cassign(__meta.__make_mask(__bits),
+					 __native_lhs, __rhs, __op);
+      });
+  }
+
+  // __masked_unary {{{2
+  template <template <typename> class _Op, typename _Tp, typename... _As>
+  static inline _SimdTuple<_Tp, _As...>
+  __masked_unary(const _MaskMember __bits,
+		 const _SimdTuple<_Tp, _As...> __v) // TODO: const-ref __v?
+  {
+    return __v.__apply_wrapped([&__bits](auto __meta, auto __native) constexpr {
+      return __meta.template __masked_unary<_Op>(__meta.__make_mask(__bits),
+						 __native);
+    });
+  }
+
+  // }}}2
+};
+
+// _MaskImplFixedSize {{{1
+template <int _Np> struct _MaskImplFixedSize
+{
+  static_assert(sizeof(_ULLong) * CHAR_BIT >= _Np,
+		"The fixed_size implementation relies on one "
+		"_ULLong being able to store all boolean "
+		"elements."); // required in load & store
+
+  // member types {{{
+  using _Abi = simd_abi::fixed_size<_Np>;
+  template <typename _Tp>
+  using _FirstAbi = typename __fixed_size_storage_t<_Tp, _Np>::_FirstAbi;
+  using _MaskMember = _SanitizedBitMask<_Np>;
+  template <typename _Tp> using _TypeTag = _Tp*;
+
+  // }}}
+  // __broadcast {{{
+  template <typename>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember __broadcast(bool __x)
+  {
+    return __x ? ~_MaskMember() : _MaskMember();
+  }
+
+  // }}}
+  // __load {{{
+  template <typename, typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember __load(const bool* __mem)
+  {
+    using _Up = make_unsigned_t<__int_for_sizeof_t<bool>>;
+    const simd<_Up, _Abi> __bools(reinterpret_cast<const __may_alias<_Up>*>(
+				    __mem),
+				  _Fp());
+    return __data(__bools != 0);
+  }
+
+  // }}}
+  // __to_bits {{{
+  template <bool _Sanitized>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SanitizedBitMask<_Np>
+  __to_bits(_BitMask<_Np, _Sanitized> __x)
+  {
+    if constexpr (_Sanitized)
+      return __x;
+    else
+      return __x._M_sanitized();
+  }
+
+  // }}}
+  // __convert {{{
+  template <typename _Tp, typename _Up, typename _UAbi>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember
+  __convert(simd_mask<_Up, _UAbi> __x)
+  {
+    return _UAbi::_MaskImpl::__to_bits(__data(__x))
+      .template _M_extract<0, _Np>();
+  }
+
+  // }}}
+  // __from_bitmask {{{2
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+  __from_bitmask(_MaskMember __bits, _TypeTag<_Tp>) noexcept
+  {
+    return __bits;
+  }
+
+  // __load {{{2
+  template <typename _Fp>
+  static inline _MaskMember __load(const bool* __mem, _Fp __f) noexcept
+  {
+    // TODO: _UChar is not necessarily the best type to use here. For smaller
+    // _Np _UShort, _UInt, _ULLong, float, and double can be more efficient.
+    _ULLong __r = 0;
+    using _Vs = __fixed_size_storage_t<_UChar, _Np>;
+    __for_each(_Vs{}, [&](auto __meta, auto) {
+      __r |= __meta.__mask_to_shifted_ullong(
+	__meta._S_mask_impl.__load(&__mem[__meta._S_offset], __f,
+				   _SizeConstant<__meta.size()>()));
+    });
+    return __r;
+  }
+
+  // __masked_load {{{2
+  template <typename _Fp>
+  static inline _MaskMember __masked_load(_MaskMember __merge,
+					  _MaskMember __mask, const bool* __mem,
+					  _Fp) noexcept
+  {
+    _BitOps::__bit_iteration(__mask.to_ullong(),
+			     [&](auto __i) { __merge.set(__i, __mem[__i]); });
+    return __merge;
+  }
+
+  // __store {{{2
+  template <typename _Fp>
+  static inline void __store(const _MaskMember __bitmask, bool* __mem,
+			     _Fp) noexcept
+  {
+    if constexpr (_Np == 1)
+      __mem[0] = __bitmask[0];
+    else
+      _FirstAbi<_UChar>::_CommonImpl::__store_bool_array(__bitmask, __mem,
+							 _Fp());
+  }
+
+  // __masked_store {{{2
+  template <typename _Fp>
+  static inline void __masked_store(const _MaskMember __v, bool* __mem, _Fp,
+				    const _MaskMember __k) noexcept
+  {
+    _BitOps::__bit_iteration(__k, [&](auto __i) { __mem[__i] = __v[__i]; });
+  }
+
+  // logical and bitwise operators {{{2
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+  __logical_and(const _MaskMember& __x, const _MaskMember& __y) noexcept
+  {
+    return __x & __y;
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+  __logical_or(const _MaskMember& __x, const _MaskMember& __y) noexcept
+  {
+    return __x | __y;
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember
+  __bit_not(const _MaskMember& __x) noexcept
+  {
+    return ~__x;
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+  __bit_and(const _MaskMember& __x, const _MaskMember& __y) noexcept
+  {
+    return __x & __y;
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+  __bit_or(const _MaskMember& __x, const _MaskMember& __y) noexcept
+  {
+    return __x | __y;
+  }
+
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+  __bit_xor(const _MaskMember& __x, const _MaskMember& __y) noexcept
+  {
+    return __x ^ __y;
+  }
+
+  // smart_reference access {{{2
+  _GLIBCXX_SIMD_INTRINSIC static void __set(_MaskMember& __k, int __i,
+					    bool __x) noexcept
+  {
+    __k.set(__i, __x);
+  }
+
+  // __masked_assign {{{2
+  _GLIBCXX_SIMD_INTRINSIC static void __masked_assign(const _MaskMember __k,
+						      _MaskMember& __lhs,
+						      const _MaskMember __rhs)
+  {
+    __lhs = (__lhs & ~__k) | (__rhs & __k);
+  }
+
+  // Optimization for the case where the RHS is a scalar.
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_assign(const _MaskMember __k, _MaskMember& __lhs, const bool __rhs)
+  {
+    if (__rhs)
+      {
+	__lhs |= __k;
+      }
+    else
+      {
+	__lhs &= ~__k;
+      }
+  }
+
+  // }}}2
+  // __all_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __all_of(simd_mask<_Tp, _Abi> __k)
+  {
+    return __data(__k).all();
+  }
+
+  // }}}
+  // __any_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __any_of(simd_mask<_Tp, _Abi> __k)
+  {
+    return __data(__k).any();
+  }
+
+  // }}}
+  // __none_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __none_of(simd_mask<_Tp, _Abi> __k)
+  {
+    return __data(__k).none();
+  }
+
+  // }}}
+  // __some_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool
+  __some_of([[maybe_unused]] simd_mask<_Tp, _Abi> __k)
+  {
+    if constexpr (_Np == 1)
+      return false;
+    else
+      return __data(__k).any() && !__data(__k).all();
+  }
+
+  // }}}
+  // __popcount {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static int __popcount(simd_mask<_Tp, _Abi> __k)
+  {
+    return __data(__k).count();
+  }
+
+  // }}}
+  // __find_first_set {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static int __find_first_set(simd_mask<_Tp, _Abi> __k)
+  {
+    return _BitOps::__firstbit(__data(__k).to_ullong());
+  }
+
+  // }}}
+  // __find_last_set {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static int __find_last_set(simd_mask<_Tp, _Abi> __k)
+  {
+    return _BitOps::__lastbit(__data(__k).to_ullong());
+  }
+
+  // }}}
+};
+// }}}1
+
+_GLIBCXX_SIMD_END_NAMESPACE
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_FIXED_SIZE_H_
+
+// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h
new file mode 100644
index 00000000000..4185a3bcaa1
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_math.h
@@ -0,0 +1,1451 @@
+// Math overloads for simd -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_MATH_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_MATH_H_
+
+#if __cplusplus >= 201703L
+
+#include <utility>
+#include <iomanip>
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+template <typename _Tp, typename _V>
+using __samesize = fixed_size_simd<_Tp, _V::size()>;
+// __math_return_type {{{
+template <typename _DoubleR, typename _Tp, typename _Abi>
+struct __math_return_type;
+template <typename _DoubleR, typename _Tp, typename _Abi>
+using __math_return_type_t =
+  typename __math_return_type<_DoubleR, _Tp, _Abi>::type;
+
+template <typename _Tp, typename _Abi>
+struct __math_return_type<double, _Tp, _Abi>
+{
+  using type = std::experimental::simd<_Tp, _Abi>;
+};
+template <typename _Tp, typename _Abi>
+struct __math_return_type<bool, _Tp, _Abi>
+{
+  using type = std::experimental::simd_mask<_Tp, _Abi>;
+};
+template <typename _DoubleR, typename _Tp, typename _Abi>
+struct __math_return_type
+{
+  using type
+    = std::experimental::fixed_size_simd<_DoubleR, simd_size_v<_Tp, _Abi>>;
+};
+//}}}
+// _GLIBCXX_SIMD_MATH_CALL_ {{{
+#define _GLIBCXX_SIMD_MATH_CALL_(__name)                                       \
+  template <typename _Tp, typename _Abi, typename...,                          \
+	    typename _R = std::experimental::__math_return_type_t<             \
+	      decltype(std::__name(std::declval<double>())), _Tp, _Abi>>       \
+  enable_if_t<std::is_floating_point_v<_Tp>, _R> __name(                       \
+    std::experimental::simd<_Tp, _Abi> __x)                                    \
+  {                                                                            \
+    return {std::experimental::__private_init,                                 \
+	    _Abi::_SimdImpl::__##__name(std::experimental::__data(__x))};      \
+  }
+
+// }}}
+//__extra_argument_type{{{
+template <typename _Up, typename _Tp, typename _Abi>
+struct __extra_argument_type;
+
+template <typename _Tp, typename _Abi>
+struct __extra_argument_type<_Tp*, _Tp, _Abi>
+{
+  using type = std::experimental::simd<_Tp, _Abi>*;
+  static constexpr double* declval();
+  _GLIBCXX_SIMD_INTRINSIC static constexpr auto __data(type __x)
+  {
+    return &std::experimental::__data(*__x);
+  }
+  static constexpr bool __needs_temporary_scalar = true;
+};
+template <typename _Up, typename _Tp, typename _Abi>
+struct __extra_argument_type<_Up*, _Tp, _Abi>
+{
+  static_assert(std::is_integral_v<_Up>);
+  using type = std::experimental::fixed_size_simd<
+    _Up, std::experimental::simd_size_v<_Tp, _Abi>>*;
+  static constexpr _Up* declval();
+  _GLIBCXX_SIMD_INTRINSIC static constexpr auto __data(type __x)
+  {
+    return &std::experimental::__data(*__x);
+  }
+  static constexpr bool __needs_temporary_scalar = true;
+};
+template <typename _Tp, typename _Abi>
+struct __extra_argument_type<_Tp, _Tp, _Abi>
+{
+  using type = std::experimental::simd<_Tp, _Abi>;
+  static constexpr double declval();
+  _GLIBCXX_SIMD_INTRINSIC static constexpr decltype(auto)
+  __data(const type& __x)
+  {
+    return std::experimental::__data(__x);
+  }
+  static constexpr bool __needs_temporary_scalar = false;
+};
+template <typename _Up, typename _Tp, typename _Abi>
+struct __extra_argument_type
+{
+  static_assert(std::is_integral_v<_Up>);
+  using type = std::experimental::fixed_size_simd<
+    _Up, std::experimental::simd_size_v<_Tp, _Abi>>;
+  static constexpr _Up declval();
+  _GLIBCXX_SIMD_INTRINSIC static constexpr decltype(auto)
+  __data(const type& __x)
+  {
+    return std::experimental::__data(__x);
+  }
+  static constexpr bool __needs_temporary_scalar = false;
+};
+//}}}
+// _GLIBCXX_SIMD_MATH_CALL2_ {{{
+#define _GLIBCXX_SIMD_MATH_CALL2_(__name, arg2_)                               \
+  template <typename _Tp, typename _Abi, typename...,                          \
+	    typename _Arg2                                                     \
+	    = std::experimental::__extra_argument_type<arg2_, _Tp, _Abi>,      \
+	    typename _R = std::experimental::__math_return_type_t<             \
+	      decltype(std::__name(std::declval<double>(), _Arg2::declval())), \
+	      _Tp, _Abi>>                                                      \
+  enable_if_t<std::is_floating_point_v<_Tp>, _R> __name(                       \
+    const std::experimental::simd<_Tp, _Abi>& __x,                             \
+    const typename _Arg2::type& __y)                                           \
+  {                                                                            \
+    return {std::experimental::__private_init,                                 \
+	    _Abi::_SimdImpl::__##__name(std::experimental::__data(__x),        \
+					_Arg2::__data(__y))};                  \
+  }                                                                            \
+  template <typename _Up, typename _Tp, typename _Abi>                         \
+  _GLIBCXX_SIMD_INTRINSIC std::experimental::__math_return_type_t<             \
+    decltype(std::__name(                                                      \
+      std::declval<double>(),                                                  \
+      std::declval<enable_if_t<                                                \
+	std::conjunction_v<                                                    \
+	  std::is_same<arg2_, _Tp>,                                            \
+	  std::negation<std::is_same<__remove_cvref_t<_Up>,                    \
+				     std::experimental::simd<_Tp, _Abi>>>,     \
+	  std::is_convertible<_Up, std::experimental::simd<_Tp, _Abi>>,        \
+	  std::is_floating_point<_Tp>>,                                        \
+	double>>())),                                                          \
+    _Tp, _Abi>                                                                 \
+  __name(_Up&& __xx, const std::experimental::simd<_Tp, _Abi>& __yy)           \
+  {                                                                            \
+    return std::experimental::__name(std::experimental::simd<_Tp, _Abi>(       \
+				       static_cast<_Up&&>(__xx)),              \
+				     __yy);                                    \
+  }
+
+// }}}
+// _GLIBCXX_SIMD_MATH_CALL3_ {{{
+#define _GLIBCXX_SIMD_MATH_CALL3_(__name, arg2_, arg3_)                        \
+  template <typename _Tp, typename _Abi, typename...,                          \
+	    typename _Arg2                                                     \
+	    = std::experimental::__extra_argument_type<arg2_, _Tp, _Abi>,      \
+	    typename _Arg3                                                     \
+	    = std::experimental::__extra_argument_type<arg3_, _Tp, _Abi>,      \
+	    typename _R = std::experimental::__math_return_type_t<             \
+	      decltype(std::__name(std::declval<double>(), _Arg2::declval(),   \
+				   _Arg3::declval())),                         \
+	      _Tp, _Abi>>                                                      \
+  enable_if_t<std::is_floating_point_v<_Tp>, _R> __name(                       \
+    std::experimental::simd<_Tp, _Abi> __x, typename _Arg2::type __y,          \
+    typename _Arg3::type __z)                                                  \
+  {                                                                            \
+    return {std::experimental::__private_init,                                 \
+	    _Abi::_SimdImpl::__##__name(std::experimental::__data(__x),        \
+					_Arg2::__data(__y),                    \
+					_Arg3::__data(__z))};                  \
+  }                                                                            \
+  template <typename _Tp, typename _Up, typename _V, typename...,              \
+	    typename _TT = __remove_cvref_t<_Tp>,                              \
+	    typename _UU = __remove_cvref_t<_Up>,                              \
+	    typename _VV = __remove_cvref_t<_V>,                               \
+	    typename _Simd                                                     \
+	    = std::conditional_t<std::experimental::is_simd_v<_UU>, _UU, _VV>> \
+  _GLIBCXX_SIMD_INTRINSIC decltype(                                            \
+    std::experimental::__name(_Simd(std::declval<_Tp>()),                      \
+			      _Simd(std::declval<_Up>()),                      \
+			      _Simd(std::declval<_V>())))                      \
+  __name(_Tp&& __xx, _Up&& __yy, _V&& __zz)                                    \
+  {                                                                            \
+    return std::experimental::__name(_Simd(static_cast<_Tp&&>(__xx)),          \
+				     _Simd(static_cast<_Up&&>(__yy)),          \
+				     _Simd(static_cast<_V&&>(__zz)));          \
+  }
+
+// }}}
+// __cosSeries {{{
+template <typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE static simd<float, _Abi>
+__cosSeries(const simd<float, _Abi>& __x)
+{
+  const simd<float, _Abi> __x2 = __x * __x;
+  simd<float, _Abi> __y;
+  __y = 0x1.ap-16f;                  //  1/8!
+  __y = __y * __x2 - 0x1.6c1p-10f;   // -1/6!
+  __y = __y * __x2 + 0x1.555556p-5f; //  1/4!
+  return __y * (__x2 * __x2) - .5f * __x2 + 1.f;
+}
+template <typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE static simd<double, _Abi>
+__cosSeries(const simd<double, _Abi>& __x)
+{
+  const simd<double, _Abi> __x2 = __x * __x;
+  simd<double, _Abi> __y;
+  __y = 0x1.AC00000000000p-45;              //  1/16!
+  __y = __y * __x2 - 0x1.9394000000000p-37; // -1/14!
+  __y = __y * __x2 + 0x1.1EED8C0000000p-29; //  1/12!
+  __y = __y * __x2 - 0x1.27E4FB7400000p-22; // -1/10!
+  __y = __y * __x2 + 0x1.A01A01A018000p-16; //  1/8!
+  __y = __y * __x2 - 0x1.6C16C16C16C00p-10; // -1/6!
+  __y = __y * __x2 + 0x1.5555555555554p-5;  //  1/4!
+  return (__y * __x2 - .5f) * __x2 + 1.f;
+}
+
+// }}}
+// __sinSeries {{{
+template <typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE static simd<float, _Abi>
+__sinSeries(const simd<float, _Abi>& __x)
+{
+  const simd<float, _Abi> __x2 = __x * __x;
+  simd<float, _Abi> __y;
+  __y = -0x1.9CC000p-13f;            // -1/7!
+  __y = __y * __x2 + 0x1.111100p-7f; //  1/5!
+  __y = __y * __x2 - 0x1.555556p-3f; // -1/3!
+  return __y * (__x2 * __x) + __x;
+}
+
+template <typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE static simd<double, _Abi>
+__sinSeries(const simd<double, _Abi>& __x)
+{
+  // __x  = [0, 0.7854 = pi/4]
+  // __x² = [0, 0.6169 = pi²/16]
+  const simd<double, _Abi> __x2 = __x * __x;
+  simd<double, _Abi> __y;
+  __y = -0x1.ACF0000000000p-41;             // -1/15!
+  __y = __y * __x2 + 0x1.6124400000000p-33; //  1/13!
+  __y = __y * __x2 - 0x1.AE64567000000p-26; // -1/11!
+  __y = __y * __x2 + 0x1.71DE3A5540000p-19; //  1/9!
+  __y = __y * __x2 - 0x1.A01A01A01A000p-13; // -1/7!
+  __y = __y * __x2 + 0x1.1111111111110p-7;  //  1/5!
+  __y = __y * __x2 - 0x1.5555555555555p-3;  // -1/3!
+  return __y * (__x2 * __x) + __x;
+}
+
+// }}}
+// __zero_low_bits {{{
+template <int _Bits, typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi>
+__zero_low_bits(simd<_Tp, _Abi> __x)
+{
+  const simd<_Tp, _Abi> __bitmask = __bit_cast<_Tp>(
+    ~std::make_unsigned_t<__int_for_sizeof_t<_Tp>>() << _Bits);
+  return {__private_init,
+	  _Abi::_SimdImpl::__bit_and(__data(__x), __data(__bitmask))};
+}
+
+// }}}
+// __fold_input {{{
+
+/**\internal
+ * Fold \p x into [-¼π, ¼π] and remember the quadrant it came from:
+ * quadrant 0: [-¼π,  ¼π]
+ * quadrant 1: [ ¼π,  ¾π]
+ * quadrant 2: [ ¾π, 1¼π]
+ * quadrant 3: [1¼π, 1¾π]
+ *
+ * The algorithm determines `y` such that `x - y * ¼π` lies in [-¼π, ¼π]. Using
+ * a bitmask, `y` is reduced to `quadrant`. `y` can be calculated as
+ * ```
+ * y = trunc(x / ¼π);
+ * y += fmod(y, 2);
+ * ```
+ * This can be simplified by moving the (implicit) division by 2 into the
+ * truncation expression. The `+= fmod` effect can then be achieved by using
+ * rounding instead of truncation: `y = round(x / ½π) * 2`. If precision allows,
+ * `2/π * x` is better (faster).
+ */
+template <typename _Tp, typename _Abi> struct __folded
+{
+  simd<_Tp, _Abi> _M_x;
+  rebind_simd_t<int, simd<_Tp, _Abi>> _M_quadrant;
+};
+
+namespace __math_float {
+inline constexpr float __pi_over_4 = 0x1.921FB6p-1f; // π/4
+inline constexpr float __2_over_pi = 0x1.45F306p-1f; // 2/π
+inline constexpr float __pi_2_5bits0
+  = 0x1.921fc0p0f; // π/2, 5 0-bits (least significant)
+inline constexpr float __pi_2_5bits0_rem
+  = -0x1.5777a6p-21f; // π/2 - __pi_2_5bits0
+} // namespace __math_float
+namespace __math_double {
+inline constexpr double __pi_over_4 = 0x1.921fb54442d18p-1; // π/4
+inline constexpr double __2_over_pi = 0x1.45F306DC9C883p-1; // 2/π
+inline constexpr double __pi_2 = 0x1.921fb54442d18p0;       // π/2
+} // namespace __math_double
+
+template <typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE __folded<float, _Abi>
+__fold_input(const simd<float, _Abi>& __x)
+{
+  using _V = simd<float, _Abi>;
+  using _IV = rebind_simd_t<int, _V>;
+  using namespace __math_float;
+  __folded<float, _Abi> __r;
+  __r._M_x = abs(__x);
+#if 0
+  // zero most mantissa bits:
+  constexpr float __1_over_pi = 0x1.45F306p-2f; // 1/π
+  const auto __y = (__r._M_x * __1_over_pi + 0x1.8p23f) - 0x1.8p23f;
+  // split π into 4 parts, the first three with 13 trailing zeros (to make the following
+  // multiplications precise):
+  constexpr float __pi0 = 0x1.920000p1f;
+  constexpr float __pi1 = 0x1.fb4000p-11f;
+  constexpr float __pi2 = 0x1.444000p-23f;
+  constexpr float __pi3 = 0x1.68c234p-38f;
+  __r._M_x - __y*__pi0 - __y*__pi1 - __y*__pi2 - __y*__pi3
+#else
+  if (_GLIBCXX_SIMD_IS_UNLIKELY(all_of(__r._M_x < __pi_over_4)))
+    __r._M_quadrant = 0;
+  else if (_GLIBCXX_SIMD_IS_LIKELY(all_of(__r._M_x < 6 * __pi_over_4)))
+    {
+      const _V __y = nearbyint(__r._M_x * __2_over_pi);
+      __r._M_quadrant = static_simd_cast<_IV>(__y) & 3; // __y mod 4
+      __r._M_x -= __y * __pi_2_5bits0;
+      __r._M_x -= __y * __pi_2_5bits0_rem;
+    }
+  else
+    {
+      using __math_double::__2_over_pi;
+      using __math_double::__pi_2;
+      using _VD = rebind_simd_t<double, _V>;
+      _VD __xd = static_simd_cast<_VD>(__r._M_x);
+      _VD __y = nearbyint(__xd * __2_over_pi);
+      __r._M_quadrant = static_simd_cast<_IV>(__y) & 3; // = __y mod 4
+      __r._M_x = static_simd_cast<_V>(__xd - __y * __pi_2);
+    }
+#endif
+  return __r;
+}
+
+template <typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE __folded<double, _Abi>
+__fold_input(const simd<double, _Abi>& __x)
+{
+  using _V = simd<double, _Abi>;
+  using _IV = rebind_simd_t<int, _V>;
+  using namespace __math_double;
+
+  __folded<double, _Abi> __r;
+  __r._M_x = abs(__x);
+  if (_GLIBCXX_SIMD_IS_UNLIKELY(all_of(__r._M_x < __pi_over_4)))
+    {
+      __r._M_quadrant = 0;
+      return __r;
+    }
+  const _V __y = nearbyint(__r._M_x / (2 * __pi_over_4));
+  __r._M_quadrant = static_simd_cast<_IV>(__y) & 3;
+
+  if (_GLIBCXX_SIMD_IS_LIKELY(all_of(__r._M_x < 1025 * __pi_over_4)))
+    {
+      // x - y * pi/2, y uses no more than 11 mantissa bits
+      __r._M_x -= __y * 0x1.921FB54443000p0;
+      __r._M_x -= __y * -0x1.73DCB3B39A000p-43;
+      __r._M_x -= __y * 0x1.45C06E0E68948p-86;
+    }
+  else if (_GLIBCXX_SIMD_IS_LIKELY(all_of(__y <= 0x1.0p30)))
+    {
+      // x - y * pi/2, y uses no more than 29 mantissa bits
+      __r._M_x -= __y * 0x1.921FB40000000p0;
+      __r._M_x -= __y * 0x1.4442D00000000p-24;
+      __r._M_x -= __y * 0x1.8469898CC5170p-48;
+    }
+  else
+    {
+      // x - y * pi/2, y may require all mantissa bits
+      const _V __y_hi = __zero_low_bits<26>(__y);
+      const _V __y_lo = __y - __y_hi;
+      const auto __pi_2_1 = 0x1.921FB50000000p0;
+      const auto __pi_2_2 = 0x1.110B460000000p-26;
+      const auto __pi_2_3 = 0x1.1A62630000000p-54;
+      const auto __pi_2_4 = 0x1.8A2E03707344Ap-81;
+      __r._M_x = __r._M_x - __y_hi * __pi_2_1
+		 - max(__y_hi * __pi_2_2, __y_lo * __pi_2_1)
+		 - min(__y_hi * __pi_2_2, __y_lo * __pi_2_1)
+		 - max(__y_hi * __pi_2_3, __y_lo * __pi_2_2)
+		 - min(__y_hi * __pi_2_3, __y_lo * __pi_2_2)
+		 - max(__y * __pi_2_4, __y_lo * __pi_2_3)
+		 - min(__y * __pi_2_4, __y_lo * __pi_2_3);
+    }
+  return __r;
+}
+
+// }}}
+// __extract_exponent_bits {{{
+template <typename _Abi>
+rebind_simd_t<int, simd<float, _Abi>>
+__extract_exponent_bits(const simd<float, _Abi>& __v)
+{
+  using namespace std::experimental::__proposed;
+  using namespace std::experimental::__proposed::float_bitwise_operators;
+  _GLIBCXX_SIMD_CONSTEXPR simd<float, _Abi> __exponent_mask
+    = std::numeric_limits<float>::infinity(); // 0x7f800000
+  return __bit_cast<rebind_simd_t<int, simd<float, _Abi>>>(__v
+							   & __exponent_mask);
+}
+
+template <typename _Abi>
+rebind_simd_t<int, simd<double, _Abi>>
+__extract_exponent_bits(const simd<double, _Abi>& __v)
+{
+  using namespace std::experimental::_P0918;
+  using namespace std::experimental::__proposed::float_bitwise_operators;
+  const simd<double, _Abi> __exponent_mask
+    = std::numeric_limits<double>::infinity(); // 0x7ff0000000000000
+  constexpr auto _Np = simd_size_v<double, _Abi> * 2;
+  constexpr auto _Max = simd_abi::max_fixed_size<int>;
+  if constexpr (_Np > _Max)
+    {
+      const auto __tup
+	= split<_Max / 2, (_Np - _Max) / 2>(__v & __exponent_mask);
+      return concat(
+	shuffle<strided<2, 1>>(
+	  __bit_cast<simd<int, simd_abi::deduce_t<int, _Max>>>(
+	    std::get<0>(__tup))),
+	shuffle<strided<2, 1>>(
+	  __bit_cast<simd<int, simd_abi::deduce_t<int, _Np - _Max>>>(
+	    std::get<1>(__tup))));
+    }
+  else
+    return shuffle<strided<2, 1>>(
+      __bit_cast<simd<int, simd_abi::deduce_t<int, _Np>>>(__v
+							  & __exponent_mask));
+}
+
+// }}}
+// __impl_or_fallback {{{
+template <typename _ImplFun, typename _FallbackFun, typename... _Args>
+_GLIBCXX_SIMD_INTRINSIC auto
+__impl_or_fallback_dispatch(int, _ImplFun&& __impl_fun, _FallbackFun&&,
+			    _Args&&... __args)
+  -> decltype(__impl_fun(static_cast<_Args&&>(__args)...))
+{
+  return __impl_fun(static_cast<_Args&&>(__args)...);
+}
+
+template <typename _ImplFun, typename _FallbackFun, typename... _Args>
+inline auto
+__impl_or_fallback_dispatch(float, _ImplFun&&, _FallbackFun&& __fallback_fun,
+			    _Args&&... __args)
+  -> decltype(__fallback_fun(static_cast<_Args&&>(__args)...))
+{
+  return __fallback_fun(static_cast<_Args&&>(__args)...);
+}
+
+template <typename... _Args>
+_GLIBCXX_SIMD_INTRINSIC auto
+__impl_or_fallback(_Args&&... __args)
+{
+  return __impl_or_fallback_dispatch(int(), static_cast<_Args&&>(__args)...);
+} //}}}
+
+// trigonometric functions {{{
+_GLIBCXX_SIMD_MATH_CALL_(acos)
+_GLIBCXX_SIMD_MATH_CALL_(asin)
+_GLIBCXX_SIMD_MATH_CALL_(atan)
+_GLIBCXX_SIMD_MATH_CALL2_(atan2, _Tp)
+
+/*
+ * algorithm for sine and cosine:
+ *
+ * The result can be calculated with sine or cosine depending on the π/4 section
+ * the input is in: sine ≈ __x + __x³, cosine ≈ 1 - __x².
+ *
+ * sine:
+ * Map -__x to __x and invert the output
+ * Extend precision of __x - n * π/4 by calculating
+ * ((__x - n * p1) - n * p2) - n * p3 (p1 + p2 + p3 = π/4)
+ *
+ * Calculate Taylor series with tuned coefficients.
+ * Fix sign.
+ */
+// cos{{{
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+cos(const simd<_Tp, _Abi>& __x)
+{
+  using _V = simd<_Tp, _Abi>;
+  if constexpr (__is_scalar_abi<_Abi>() || __is_fixed_size_abi_v<_Abi>)
+    return {__private_init, _Abi::_SimdImpl::__cos(__data(__x))};
+  else
+    {
+      if constexpr (is_same_v<_Tp, float>)
+	if (_GLIBCXX_SIMD_IS_UNLIKELY(any_of(abs(__x) >= 393382)))
+	  return static_simd_cast<_V>(
+	    cos(static_simd_cast<rebind_simd_t<double, _V>>(__x)));
+
+      const auto __f = __fold_input(__x);
+      // quadrant | effect
+      //        0 | cosSeries, +
+      //        1 | sinSeries, -
+      //        2 | cosSeries, -
+      //        3 | sinSeries, +
+      using namespace std::experimental::__proposed::float_bitwise_operators;
+      const _V __sign_flip
+	= _V(-0.f) & static_simd_cast<_V>((1 + __f._M_quadrant) << 30);
+
+      const auto __need_cos = (__f._M_quadrant & 1) == 0;
+      if (_GLIBCXX_SIMD_IS_UNLIKELY(all_of(__need_cos)))
+	return __sign_flip ^ __cosSeries(__f._M_x);
+      else if (_GLIBCXX_SIMD_IS_UNLIKELY(none_of(__need_cos)))
+	return __sign_flip ^ __sinSeries(__f._M_x);
+      else // some_of(__need_cos)
+	{
+	  _V __r = __sinSeries(__f._M_x);
+	  where(__need_cos.__cvt(), __r) = __cosSeries(__f._M_x);
+	  return __r ^ __sign_flip;
+	}
+    }
+}
+
+template <typename _Tp>
+_GLIBCXX_SIMD_ALWAYS_INLINE
+  enable_if_t<std::is_floating_point<_Tp>::value, simd<_Tp, simd_abi::scalar>>
+  cos(simd<_Tp, simd_abi::scalar> __x)
+{
+  return std::cos(__data(__x));
+}
+//}}}
+// sin{{{
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+sin(const simd<_Tp, _Abi>& __x)
+{
+  using _V = simd<_Tp, _Abi>;
+  if constexpr (__is_scalar_abi<_Abi>() || __is_fixed_size_abi_v<_Abi>)
+    return {__private_init, _Abi::_SimdImpl::__sin(__data(__x))};
+  else
+    {
+      if constexpr (is_same_v<_Tp, float>)
+	if (_GLIBCXX_SIMD_IS_UNLIKELY(any_of(abs(__x) >= 527449)))
+	  return static_simd_cast<_V>(
+	    sin(static_simd_cast<rebind_simd_t<double, _V>>(__x)));
+
+      const auto __f = __fold_input(__x);
+      // quadrant | effect
+      //        0 | sinSeries
+      //        1 | cosSeries
+      //        2 | sinSeries, sign flip
+      //        3 | cosSeries, sign flip
+      using namespace std::experimental::__proposed::float_bitwise_operators;
+      const auto __sign_flip
+	= (__x ^ static_simd_cast<_V>(1 - __f._M_quadrant)) & _V(_Tp(-0.));
+
+      const auto __need_sin = (__f._M_quadrant & 1) == 0;
+      if (_GLIBCXX_SIMD_IS_UNLIKELY(all_of(__need_sin)))
+	return __sign_flip ^ __sinSeries(__f._M_x);
+      else if (_GLIBCXX_SIMD_IS_UNLIKELY(none_of(__need_sin)))
+	return __sign_flip ^ __cosSeries(__f._M_x);
+      else // some_of(__need_sin)
+	{
+	  _V __r = __cosSeries(__f._M_x);
+	  where(__need_sin.__cvt(), __r) = __sinSeries(__f._M_x);
+	  return __sign_flip ^ __r;
+	}
+    }
+}
+
+template <typename _Tp>
+_GLIBCXX_SIMD_ALWAYS_INLINE
+  enable_if_t<std::is_floating_point<_Tp>::value, simd<_Tp, simd_abi::scalar>>
+  sin(simd<_Tp, simd_abi::scalar> __x)
+{
+  return std::sin(__data(__x));
+}
+//}}}
+
+_GLIBCXX_SIMD_MATH_CALL_(tan)
+_GLIBCXX_SIMD_MATH_CALL_(acosh)
+_GLIBCXX_SIMD_MATH_CALL_(asinh)
+_GLIBCXX_SIMD_MATH_CALL_(atanh)
+_GLIBCXX_SIMD_MATH_CALL_(cosh)
+_GLIBCXX_SIMD_MATH_CALL_(sinh)
+_GLIBCXX_SIMD_MATH_CALL_(tanh)
+// }}}
+// exponential functions {{{
+_GLIBCXX_SIMD_MATH_CALL_(exp)
+_GLIBCXX_SIMD_MATH_CALL_(exp2)
+_GLIBCXX_SIMD_MATH_CALL_(expm1)
+// }}}
+// frexp {{{
+#if _GLIBCXX_SIMD_X86INTRIN
+template <typename _Tp, size_t _Np>
+_SimdWrapper<_Tp, _Np>
+__getexp(_SimdWrapper<_Tp, _Np> __x)
+{
+  if constexpr (__have_avx512vl && __is_sse_ps<_Tp, _Np>())
+    return __auto_bitcast(_mm_getexp_ps(__to_intrin(__x)));
+  else if constexpr (__have_avx512f && __is_sse_ps<_Tp, _Np>())
+    return __auto_bitcast(_mm512_getexp_ps(__auto_bitcast(__to_intrin(__x))));
+  else if constexpr (__have_avx512vl && __is_sse_pd<_Tp, _Np>())
+    return _mm_getexp_pd(__x);
+  else if constexpr (__have_avx512f && __is_sse_pd<_Tp, _Np>())
+    return __lo128(_mm512_getexp_pd(__auto_bitcast(__x)));
+  else if constexpr (__have_avx512vl && __is_avx_ps<_Tp, _Np>())
+    return _mm256_getexp_ps(__x);
+  else if constexpr (__have_avx512f && __is_avx_ps<_Tp, _Np>())
+    return __lo256(_mm512_getexp_ps(__auto_bitcast(__x)));
+  else if constexpr (__have_avx512vl && __is_avx_pd<_Tp, _Np>())
+    return _mm256_getexp_pd(__x);
+  else if constexpr (__have_avx512f && __is_avx_pd<_Tp, _Np>())
+    return __lo256(_mm512_getexp_pd(__auto_bitcast(__x)));
+  else if constexpr (__is_avx512_ps<_Tp, _Np>())
+    return _mm512_getexp_ps(__x);
+  else if constexpr (__is_avx512_pd<_Tp, _Np>())
+    return _mm512_getexp_pd(__x);
+  else
+    __assert_unreachable<_Tp>();
+}
+
+template <typename _Tp, size_t _Np>
+_SimdWrapper<_Tp, _Np>
+__getmant_avx512(_SimdWrapper<_Tp, _Np> __x)
+{
+  if constexpr (__have_avx512vl && __is_sse_ps<_Tp, _Np>())
+    return __auto_bitcast(
+      _mm_getmant_ps(__to_intrin(__x), _MM_MANT_NORM_p5_1, _MM_MANT_SIGN_src));
+  else if constexpr (__have_avx512f && __is_sse_ps<_Tp, _Np>())
+    return __auto_bitcast(_mm512_getmant_ps(__auto_bitcast(__to_intrin(__x)),
+					    _MM_MANT_NORM_p5_1,
+					    _MM_MANT_SIGN_src));
+  else if constexpr (__have_avx512vl && __is_sse_pd<_Tp, _Np>())
+    return _mm_getmant_pd(__x, _MM_MANT_NORM_p5_1, _MM_MANT_SIGN_src);
+  else if constexpr (__have_avx512f && __is_sse_pd<_Tp, _Np>())
+    return __lo128(_mm512_getmant_pd(__auto_bitcast(__x), _MM_MANT_NORM_p5_1,
+				     _MM_MANT_SIGN_src));
+  else if constexpr (__have_avx512vl && __is_avx_ps<_Tp, _Np>())
+    return _mm256_getmant_ps(__x, _MM_MANT_NORM_p5_1, _MM_MANT_SIGN_src);
+  else if constexpr (__have_avx512f && __is_avx_ps<_Tp, _Np>())
+    return __lo256(_mm512_getmant_ps(__auto_bitcast(__x), _MM_MANT_NORM_p5_1,
+				     _MM_MANT_SIGN_src));
+  else if constexpr (__have_avx512vl && __is_avx_pd<_Tp, _Np>())
+    return _mm256_getmant_pd(__x, _MM_MANT_NORM_p5_1, _MM_MANT_SIGN_src);
+  else if constexpr (__have_avx512f && __is_avx_pd<_Tp, _Np>())
+    return __lo256(_mm512_getmant_pd(__auto_bitcast(__x), _MM_MANT_NORM_p5_1,
+				     _MM_MANT_SIGN_src));
+  else if constexpr (__is_avx512_ps<_Tp, _Np>())
+    return _mm512_getmant_ps(__x, _MM_MANT_NORM_p5_1, _MM_MANT_SIGN_src);
+  else if constexpr (__is_avx512_pd<_Tp, _Np>())
+    return _mm512_getmant_pd(__x, _MM_MANT_NORM_p5_1, _MM_MANT_SIGN_src);
+  else
+    __assert_unreachable<_Tp>();
+}
+#endif // _GLIBCXX_SIMD_X86INTRIN
+
+/**
+ * Splits \p __x into exponent and mantissa; the sign is kept with the mantissa.
+ *
+ * The magnitude of the return value will be in the range [0.5, 1.0[
+ * The \p __exp value will be an integer defining the power-of-two exponent
+ */
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+frexp(const simd<_Tp, _Abi>& __x, __samesize<int, simd<_Tp, _Abi>>* __exp)
+{
+  if constexpr (simd_size_v<_Tp, _Abi> == 1)
+    {
+      int __tmp;
+      const auto __r = std::frexp(__x[0], &__tmp);
+      (*__exp)[0] = __tmp;
+      return __r;
+    }
+  else if constexpr (__is_fixed_size_abi_v<_Abi>)
+    {
+      return {__private_init,
+	      _Abi::_SimdImpl::__frexp(__data(__x), __data(*__exp))};
+#if _GLIBCXX_SIMD_X86INTRIN
+    }
+  else if constexpr (__have_avx512f)
+    {
+      using _IV = __samesize<int, simd<_Tp, _Abi>>;
+      constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+      constexpr size_t _NI = _Np < 4 ? 4 : _Np;
+      const auto __v = __data(__x);
+      const auto __isnonzero
+	= _Abi::_SimdImpl::__isnonzerovalue_mask(__v._M_data);
+      const _SimdWrapper<int, _NI> __exp_plus1
+	= 1 + __convert<_SimdWrapper<int, _NI>>(__getexp(__v))._M_data;
+      const _SimdWrapper<int, _Np> __e = __wrapper_bitcast<int, _Np>(
+	_Abi::_CommonImpl::_S_blend(_SimdWrapper<bool, _NI>(__isnonzero),
+				    _SimdWrapper<int, _NI>(), __exp_plus1));
+      simd_abi::deduce_t<int, _Np>::_CommonImpl::__store(
+	__e, __exp, overaligned<alignof(_IV)>);
+      return {__private_init,
+	      _Abi::_CommonImpl::_S_blend(_SimdWrapper<bool, _Np>(__isnonzero),
+					  __v, __getmant_avx512(__v))};
+#endif // _GLIBCXX_SIMD_X86INTRIN
+    }
+  else
+    {
+      // fallback implementation
+      static_assert(sizeof(_Tp) == 4 || sizeof(_Tp) == 8);
+      using _V = simd<_Tp, _Abi>;
+      using _IV = rebind_simd_t<int, _V>;
+      using _Limits = std::numeric_limits<_Tp>;
+      using namespace std::experimental::__proposed;
+      using namespace std::experimental::__proposed::float_bitwise_operators;
+
+      constexpr int __exp_shift = sizeof(_Tp) == 4 ? 23 : 20;
+      constexpr int __exp_adjust = sizeof(_Tp) == 4 ? 0x7e : 0x3fe;
+      constexpr int __exp_offset = sizeof(_Tp) == 4 ? 0x70 : 0x200;
+      constexpr _Tp __subnorm_scale = sizeof(_Tp) == 4 ? 0x1p112 : 0x1p512;
+      _GLIBCXX_SIMD_CONSTEXPR _V __exponent_mask
+	= _Limits::infinity(); // 0x7f800000 or 0x7ff0000000000000
+      _GLIBCXX_SIMD_CONSTEXPR _V __p5_1_exponent
+	= _Tp(sizeof(_Tp) == 4 ? -0x1.fffffep-1 : -0x1.fffffffffffffp-1);
+
+      _V __mant = __p5_1_exponent & (__exponent_mask | __x);
+      const _IV __exponent_bits = __extract_exponent_bits(__x);
+      if (_GLIBCXX_SIMD_IS_LIKELY(all_of(isnormal(__x))))
+	{
+	  *__exp = simd_cast<__samesize<int, _V>>(
+	    (__exponent_bits >> __exp_shift) - __exp_adjust);
+	  return __mant;
+	}
+
+      // can't use isunordered(x*inf, x*0) because inf*0 raises invalid
+      const auto __as_int
+	= __bit_cast<rebind_simd_t<__int_for_sizeof_t<_Tp>, _V>>(abs(__x));
+      const auto __inf = __bit_cast<rebind_simd_t<__int_for_sizeof_t<_Tp>, _V>>(
+	_V(std::numeric_limits<_Tp>::infinity()));
+      const auto __iszero_inf_nan = static_simd_cast<typename _V::mask_type>(
+	__as_int == 0 || __as_int >= __inf);
+
+      const _V __scaled_subnormal = __x * __subnorm_scale;
+      const _V __mant_subnormal
+	= __p5_1_exponent & (__exponent_mask | __scaled_subnormal);
+      where(!isnormal(__x), __mant) = __mant_subnormal;
+      where(__iszero_inf_nan, __mant) = __x;
+      _IV __e = __extract_exponent_bits(__scaled_subnormal);
+      using _MaskType = typename std::conditional_t<
+	sizeof(typename _V::mask_type) == sizeof(_IV), _V, _IV>::mask_type;
+      const _MaskType __value_isnormal = isnormal(__x).__cvt();
+      where(__value_isnormal.__cvt(), __e) = __exponent_bits;
+      static_assert(sizeof(_IV) == sizeof(__value_isnormal));
+      const _IV __offset
+	= (__bit_cast<_IV>(__value_isnormal) & _IV(__exp_adjust))
+	  | (__bit_cast<_IV>(static_simd_cast<_MaskType>(__exponent_bits == 0)
+			     & static_simd_cast<_MaskType>(__x != 0))
+	     & _IV(__exp_adjust + __exp_offset));
+      *__exp = simd_cast<__samesize<int, _V>>((__e >> __exp_shift) - __offset);
+      return __mant;
+    }
+}
+// }}}
+_GLIBCXX_SIMD_MATH_CALL2_(ldexp, int)
+_GLIBCXX_SIMD_MATH_CALL_(ilogb)
+
+// logarithms {{{
+_GLIBCXX_SIMD_MATH_CALL_(log)
+_GLIBCXX_SIMD_MATH_CALL_(log10)
+_GLIBCXX_SIMD_MATH_CALL_(log1p)
+_GLIBCXX_SIMD_MATH_CALL_(log2)
+//}}}
+// logb{{{
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point<_Tp>::value, simd<_Tp, _Abi>>
+logb(const simd<_Tp, _Abi>& __x)
+{
+  constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+  if constexpr (_Np == 1)
+    return std::logb(__x[0]);
+  else if constexpr (__is_fixed_size_abi_v<_Abi>)
+    {
+      return {__private_init,
+	      __data(__x).__apply_per_chunk([](auto __impl, auto __xx) {
+		using _V = typename decltype(__impl)::simd_type;
+		return __data(
+		  std::experimental::logb(_V(__private_init, __xx)));
+	      })};
+    }
+#if _GLIBCXX_SIMD_X86INTRIN // {{{
+  else if constexpr (__have_avx512vl && __is_sse_ps<_Tp, _Np>())
+    return {__private_init,
+	    __auto_bitcast(_mm_getexp_ps(__to_intrin(__as_vector(__x))))};
+  else if constexpr (__have_avx512vl && __is_sse_pd<_Tp, _Np>())
+    return {__private_init, _mm_getexp_pd(__data(__x))};
+  else if constexpr (__have_avx512vl && __is_avx_ps<_Tp, _Np>())
+    return {__private_init, _mm256_getexp_ps(__data(__x))};
+  else if constexpr (__have_avx512vl && __is_avx_pd<_Tp, _Np>())
+    return {__private_init, _mm256_getexp_pd(__data(__x))};
+  else if constexpr (__have_avx512f && __is_avx_ps<_Tp, _Np>())
+    return {__private_init,
+	    __lo256(_mm512_getexp_ps(__auto_bitcast(__data(__x))))};
+  else if constexpr (__have_avx512f && __is_avx_pd<_Tp, _Np>())
+    return {__private_init,
+	    __lo256(_mm512_getexp_pd(__auto_bitcast(__data(__x))))};
+  else if constexpr (__is_avx512_ps<_Tp, _Np>())
+    return {__private_init, _mm512_getexp_ps(__data(__x))};
+  else if constexpr (__is_avx512_pd<_Tp, _Np>())
+    return {__private_init, _mm512_getexp_pd(__data(__x))};
+#endif // _GLIBCXX_SIMD_X86INTRIN }}}
+  else
+    {
+      using _V = simd<_Tp, _Abi>;
+      using namespace std::experimental::__proposed;
+      auto __is_normal = isnormal(__x);
+
+      // work on abs(__x) to reflect the return value on Linux for negative
+      // inputs (domain-error => implementation-defined value is returned)
+      const _V __abs_x = abs(__x);
+
+      // __exponent(__x) returns the exponent value (bias removed) as simd<_Up>
+      // with integral _Up
+      auto&& __exponent = [](const _V& __v) {
+	using namespace std::experimental::__proposed;
+	using _IV = rebind_simd_t<
+	  std::conditional_t<sizeof(_Tp) == sizeof(_LLong), _LLong, int>, _V>;
+	return (__bit_cast<_IV>(__v) >> (std::numeric_limits<_Tp>::digits - 1))
+	       - (std::numeric_limits<_Tp>::max_exponent - 1);
+      };
+      _V __r = static_simd_cast<_V>(__exponent(__abs_x));
+      if (_GLIBCXX_SIMD_IS_LIKELY(all_of(__is_normal)))
+	// without corner cases (nan, inf, subnormal, zero) we have our
+	// answer:
+	return __r;
+      const auto __is_zero = __x == 0;
+      const auto __is_nan = isnan(__x);
+      const auto __is_inf = isinf(__x);
+      where(__is_zero, __r) = -std::numeric_limits<_Tp>::infinity();
+      where(__is_nan, __r) = __x;
+      where(__is_inf, __r) = std::numeric_limits<_Tp>::infinity();
+      __is_normal |= __is_zero || __is_nan || __is_inf;
+      if (all_of(__is_normal))
+	// at this point everything but subnormals is handled
+	return __r;
+      // for subnormals, repeat the exponent extraction after multiplying the
+      // input by a floating-point value that has 112 (0x70) in its exponent
+      // (not too big for sp and large enough for dp)
+      const _V __scaled = __abs_x * _Tp(0x1p112);
+      _V __scaled_exp = static_simd_cast<_V>(__exponent(__scaled) - 112);
+      where(__is_normal, __scaled_exp) = __r;
+      return __scaled_exp;
+    }
+}
+//}}}
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+modf(const simd<_Tp, _Abi>& __x, simd<_Tp, _Abi>* __iptr)
+{
+  const auto __integral = trunc(__x);
+  *__iptr = __integral;
+  auto __r = __x - __integral;
+  where(isinf(__x), __r) = _Tp();
+  return copysign(__r, __x);
+}
+
+_GLIBCXX_SIMD_MATH_CALL2_(scalbn, int)
+_GLIBCXX_SIMD_MATH_CALL2_(scalbln, long)
+
+_GLIBCXX_SIMD_MATH_CALL_(cbrt)
+
+_GLIBCXX_SIMD_MATH_CALL_(abs)
+_GLIBCXX_SIMD_MATH_CALL_(fabs)
+
+// [parallel.simd.math] only asks for is_floating_point_v<_Tp> and forgot to
+// allow signed integral _Tp
+template <typename _Tp, typename _Abi>
+enable_if_t<!std::is_floating_point_v<_Tp> && std::is_signed_v<_Tp>,
+	    simd<_Tp, _Abi>>
+abs(const simd<_Tp, _Abi>& __x)
+{
+  return {__private_init, _Abi::_SimdImpl::__abs(__data(__x))};
+}
+template <typename _Tp, typename _Abi>
+enable_if_t<!std::is_floating_point_v<_Tp> && std::is_signed_v<_Tp>,
+	    simd<_Tp, _Abi>>
+fabs(const simd<_Tp, _Abi>& __x)
+{
+  return {__private_init, _Abi::_SimdImpl::__abs(__data(__x))};
+}
+
+// the following are overloads for functions in <cstdlib> and not covered by
+// [parallel.simd.math]. I don't see much value in making them work, though
+/*
+template <typename _Abi> simd<long, _Abi> labs(const simd<long, _Abi> &__x)
+{
+    return {__private_init, _Abi::_SimdImpl::abs(__data(__x))};
+}
+template <typename _Abi> simd<long long, _Abi> llabs(const simd<long long, _Abi>
+&__x)
+{
+    return {__private_init, _Abi::_SimdImpl::abs(__data(__x))};
+}
+*/
+
+#define _GLIBCXX_SIMD_CVTING2(_NAME)                                           \
+  template <typename _Tp, typename _Abi>                                       \
+  _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME(                               \
+    const simd<_Tp, _Abi>& __x, const __id<simd<_Tp, _Abi>>& __y)              \
+  {                                                                            \
+    return _NAME(__x, __y);                                                    \
+  }                                                                            \
+  template <typename _Tp, typename _Abi>                                       \
+  _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME(                               \
+    const __id<simd<_Tp, _Abi>>& __x, const simd<_Tp, _Abi>& __y)              \
+  {                                                                            \
+    return _NAME(__x, __y);                                                    \
+  }
+
+#define _GLIBCXX_SIMD_CVTING3(_NAME)                                           \
+  template <typename _Tp, typename _Abi>                                       \
+  _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME(                               \
+    const __id<simd<_Tp, _Abi>>& __x, const simd<_Tp, _Abi>& __y,              \
+    const simd<_Tp, _Abi>& __z)                                                \
+  {                                                                            \
+    return _NAME(__x, __y, __z);                                               \
+  }                                                                            \
+  template <typename _Tp, typename _Abi>                                       \
+  _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME(                               \
+    const simd<_Tp, _Abi>& __x, const __id<simd<_Tp, _Abi>>& __y,              \
+    const simd<_Tp, _Abi>& __z)                                                \
+  {                                                                            \
+    return _NAME(__x, __y, __z);                                               \
+  }                                                                            \
+  template <typename _Tp, typename _Abi>                                       \
+  _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME(                               \
+    const simd<_Tp, _Abi>& __x, const simd<_Tp, _Abi>& __y,                    \
+    const __id<simd<_Tp, _Abi>>& __z)                                          \
+  {                                                                            \
+    return _NAME(__x, __y, __z);                                               \
+  }                                                                            \
+  template <typename _Tp, typename _Abi>                                       \
+  _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME(                               \
+    const simd<_Tp, _Abi>& __x, const __id<simd<_Tp, _Abi>>& __y,              \
+    const __id<simd<_Tp, _Abi>>& __z)                                          \
+  {                                                                            \
+    return _NAME(__x, __y, __z);                                               \
+  }                                                                            \
+  template <typename _Tp, typename _Abi>                                       \
+  _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME(                               \
+    const __id<simd<_Tp, _Abi>>& __x, const simd<_Tp, _Abi>& __y,              \
+    const __id<simd<_Tp, _Abi>>& __z)                                          \
+  {                                                                            \
+    return _NAME(__x, __y, __z);                                               \
+  }                                                                            \
+  template <typename _Tp, typename _Abi>                                       \
+  _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME(                               \
+    const __id<simd<_Tp, _Abi>>& __x, const __id<simd<_Tp, _Abi>>& __y,        \
+    const simd<_Tp, _Abi>& __z)                                                \
+  {                                                                            \
+    return _NAME(__x, __y, __z);                                               \
+  }
+
+template <typename _R, typename _ToApply, typename _Tp, typename... _Tps>
+_GLIBCXX_SIMD_INTRINSIC _R
+__fixed_size_apply(_ToApply&& __apply, const _Tp& __arg0, const _Tps&... __args)
+{
+  return {__private_init,
+	  __data(__arg0).__apply_per_chunk(
+	    [&](auto __impl, const auto&... __inner) {
+	      using _V = typename decltype(__impl)::simd_type;
+	      return __data(__apply(_V(__private_init, __inner)...));
+	    },
+	    __data(__args)...)};
+}
+
+template <typename _VV>
+__remove_cvref_t<_VV>
+__hypot(_VV __x, _VV __y)
+{
+  using _V = __remove_cvref_t<_VV>;
+  using _Tp = typename _V::value_type;
+  if constexpr (_V::size() == 1)
+    return std::hypot(_Tp(__x[0]), _Tp(__y[0]));
+  else if constexpr (__is_fixed_size_abi_v<typename _V::abi_type>)
+    {
+      return __fixed_size_apply<_V>([](auto __a,
+				       auto __b) { return hypot(__a, __b); },
+				    __x, __y);
+    }
+  else
+    {
+      // A simple solution for _Tp == float would be to cast to double and
+      // simply calculate sqrt(x²+y²) as it can't over-/underflow anymore with
+      // dp. It still needs the Annex F fixups though and isn't faster on
+      // Skylake-AVX512 (not even for SSE and AVX vectors, and really bad for
+      // AVX-512).
+      using namespace __proposed::float_bitwise_operators;
+      using _Limits = std::numeric_limits<_Tp>;
+      _V __absx = abs(__x);          // no error
+      _V __absy = abs(__y);          // no error
+      _V __hi = max(__absx, __absy); // no error
+      _V __lo = min(__absy, __absx); // no error
+
+      // round __hi down to the next power-of-2:
+      _GLIBCXX_SIMD_CONSTEXPR _V __inf(_Limits::infinity());
+
+      if (_GLIBCXX_SIMD_IS_LIKELY(all_of(isnormal(__x))
+				  && all_of(isnormal(__y))))
+	{
+	  const _V __hi_exp = __hi & __inf;
+	  //((__hi + __hi) & __inf) ^ __inf almost works for computing __scale,
+	  // except when (__hi + __hi) & __inf == __inf, in which case __scale
+	  // becomes 0 (should be min/2 instead) and thus loses the information
+	  // from __lo.
+	  const _V __scale = (__hi_exp ^ __inf) * _Tp(.5);
+	  _GLIBCXX_SIMD_CONSTEXPR _V __mant_mask
+	    = _Limits::min() - _Limits::denorm_min();
+	  const _V __h1 = (__hi & __mant_mask) | _V(1);
+	  const _V __l1 = __lo * __scale;
+	  return __hi_exp * sqrt(__h1 * __h1 + __l1 * __l1);
+	}
+      else
+	{
+	  // slower path to support subnormals
+	  // if __hi is subnormal, avoid scaling by inf & final mul by 0 (which
+	  // yields NaN) by using min()
+	  _V __scale = _V(1 / _Limits::min());
+	  // invert exponent w/o error and w/o using the slow divider unit:
+	  // xor inverts the exponent but off by 1. Multiplication with .5
+	  // adjusts for the discrepancy.
+	  where(__hi >= _Limits::min(), __scale)
+	    = ((__hi & __inf) ^ __inf) * _Tp(.5);
+	  // adjust final exponent for subnormal inputs
+	  _V __hi_exp = _Limits::min();
+	  where(__hi >= _Limits::min(), __hi_exp) = __hi & __inf; // no error
+	  _V __h1 = __hi * __scale;                               // no error
+	  _V __l1 = __lo * __scale;                               // no error
+
+	  // sqrt(x²+y²) = e*sqrt((x/e)²+(y/e)²):
+	  // this ensures no overflow in the argument to sqrt
+	  _V __r = __hi_exp * sqrt(__h1 * __h1 + __l1 * __l1);
+#ifdef __STDC_IEC_559__
+	  // fixup for Annex F requirements
+	  // the naive fixup goes like this:
+	  //
+	  // where(__l1 == 0, __r)                      = __hi;
+	  // where(isunordered(__x, __y), __r)          = _Limits::quiet_NaN();
+	  // where(isinf(__absx) || isinf(__absy), __r) = __inf;
+	  //
+	  // The fixup can be prepared in parallel with the sqrt, requiring a
+	  // single blend step after hi_exp * sqrt, reducing latency and
+	  // improving throughput:
+	  _V __fixup = __hi; // __lo == 0
+	  where(isunordered(__x, __y), __fixup) = _Limits::quiet_NaN();
+	  where(isinf(__absx) || isinf(__absy), __fixup) = __inf;
+	  where(!(__lo == 0 || isunordered(__x, __y)
+		  || (isinf(__absx) || isinf(__absy))),
+		__fixup)
+	    = __r;
+	  __r = __fixup;
+#endif
+	  return __r;
+	}
+    }
+}
+
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi>
+hypot(const simd<_Tp, _Abi>& __x, const simd<_Tp, _Abi>& __y)
+{
+  return __hypot<conditional_t<__is_fixed_size_abi_v<_Abi>,
+			       const simd<_Tp, _Abi>&, simd<_Tp, _Abi>>>(__x,
+									 __y);
+}
+_GLIBCXX_SIMD_CVTING2(hypot)
+
+template <typename _VV>
+__remove_cvref_t<_VV>
+__hypot(_VV __x, _VV __y, _VV __z)
+{
+  using _V = __remove_cvref_t<_VV>;
+  using _Abi = typename _V::abi_type;
+  using _Tp = typename _V::value_type;
+  /* FIXME: enable after PR77776 is resolved
+  if constexpr (_V::size() == 1)
+    return std::hypot(_Tp(__x[0]), _Tp(__y[0]), _Tp(__z[0]));
+  else
+  */
+  if constexpr (__is_fixed_size_abi_v<_Abi> && _V::size() > 1)
+    {
+      return __fixed_size_apply<simd<_Tp, _Abi>>(
+	[](auto __a, auto __b, auto __c) { return hypot(__a, __b, __c); }, __x,
+	__y, __z);
+    }
+  else
+    {
+      using namespace __proposed::float_bitwise_operators;
+      using _Limits = std::numeric_limits<_Tp>;
+      const _V __absx = abs(__x);                 // no error
+      const _V __absy = abs(__y);                 // no error
+      const _V __absz = abs(__z);                 // no error
+      _V __hi = max(max(__absx, __absy), __absz); // no error
+      _V __l0 = min(__absz, max(__absx, __absy)); // no error
+      _V __l1 = min(__absy, __absx);              // no error
+      if constexpr (numeric_limits<_Tp>::digits == 64
+		    && numeric_limits<_Tp>::max_exponent == 0x4000
+		    && numeric_limits<_Tp>::min_exponent == -0x3FFD
+		    && _V::size() == 1)
+	{ // Seems like x87 fp80, where bit 63 is always 1 unless subnormal or
+	  // zero. In this case the bit-tricks don't work; they require IEC559
+	  // binary32 or binary64 format.
+#ifdef __STDC_IEC_559__
+	  // fixup for Annex F requirements
+	  if (isinf(__absx[0]) || isinf(__absy[0]) || isinf(__absz[0]))
+	    return _Limits::infinity();
+	  else if (isunordered(__absx[0], __absy[0] + __absz[0]))
+	    return _Limits::quiet_NaN();
+	  else if (__l0[0] == 0 && __l1[0] == 0)
+	    return __hi;
+#endif
+	  _V __hi_exp = __hi;
+	  const _ULLong __tmp = 0x8000'0000'0000'0000ull;
+	  __builtin_memcpy(&__hi_exp, &__tmp, 8);
+	  const _V __scale = 1 / __hi_exp;
+	  __hi *= __scale;
+	  __l0 *= __scale;
+	  __l1 *= __scale;
+	  return __hi_exp * sqrt((__l0 * __l0 + __l1 * __l1) + __hi * __hi);
+	}
+      else
+	{
+	  // round __hi down to the next power-of-2:
+	  _GLIBCXX_SIMD_CONSTEXPR _V __inf(_Limits::infinity());
+
+	  if (_GLIBCXX_SIMD_IS_LIKELY(all_of(isnormal(__x))
+				      && all_of(isnormal(__y))
+				      && all_of(isnormal(__z))))
+	    {
+	      const _V __hi_exp = __hi & __inf;
+	      // ((__hi + __hi) & __inf) ^ __inf almost works for computing
+	      // __scale, except when (__hi + __hi) & __inf == __inf, in which
+	      // case __scale becomes 0 (should be min/2 instead) and thus
+	      // loses the information from the two smaller inputs (__l0 and
+	      // __l1).
+	      const _V __scale = (__hi_exp ^ __inf) * _Tp(.5);
+	      _GLIBCXX_SIMD_CONSTEXPR _V __mant_mask
+		= _Limits::min() - _Limits::denorm_min();
+	      const _V __h1 = (__hi & __mant_mask) | _V(1);
+	      __l0 *= __scale;
+	      __l1 *= __scale;
+	      const _V __lo
+		= __l0 * __l0 + __l1 * __l1; // add the two smaller values first
+	      return __hi_exp * sqrt(__lo + __h1 * __h1);
+	    }
+	  else
+	    {
+	      // slower path to support subnormals
+	      // if __hi is subnormal, avoid scaling by inf & final mul by 0
+	      // (which yields NaN) by using min()
+	      _V __scale = _V(1 / _Limits::min());
+	      // invert exponent w/o error and w/o using the slow divider unit:
+	      // xor inverts the exponent but off by 1. Multiplication with .5
+	      // adjusts for the discrepancy.
+	      where(__hi >= _Limits::min(), __scale)
+		= ((__hi & __inf) ^ __inf) * _Tp(.5);
+	      // adjust final exponent for subnormal inputs
+	      _V __hi_exp = _Limits::min();
+	      where(__hi >= _Limits::min(), __hi_exp)
+		= __hi & __inf;         // no error
+	      _V __h1 = __hi * __scale; // no error
+	      __l0 *= __scale;          // no error
+	      __l1 *= __scale;          // no error
+	      _V __lo
+		= __l0 * __l0 + __l1 * __l1; // add the two smaller values first
+	      _V __r = __hi_exp * sqrt(__lo + __h1 * __h1);
+#ifdef __STDC_IEC_559__
+	      // fixup for Annex F requirements
+	      _V __fixup = __hi; // __lo == 0
+	      // where(__lo == 0, __fixup)                   = __hi;
+	      where(isunordered(__x, __y + __z), __fixup)
+		= _Limits::quiet_NaN();
+	      where(isinf(__absx) || isinf(__absy) || isinf(__absz), __fixup)
+		= __inf;
+	      // Instead of __lo == 0, the following could depend on __h1² ==
+	      // __h1² + __lo (i.e. __hi is so much larger than the other two
+	      // inputs that the result is exactly __hi). While this may improve
+	      // precision, it is likely to reduce efficiency if the ISA has
+	      // FMAs (because __h1² + __lo is an FMA, but the intermediate
+	      // __h1² must be kept)
+	      where(!(__lo == 0 || isunordered(__x, __y + __z) || isinf(__absx)
+		      || isinf(__absy) || isinf(__absz)),
+		    __fixup)
+		= __r;
+	      __r = __fixup;
+#endif
+	      return __r;
+	    }
+	}
+    }
+}
+
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi>
+hypot(const simd<_Tp, _Abi>& __x, const simd<_Tp, _Abi>& __y,
+      const simd<_Tp, _Abi>& __z)
+{
+  return __hypot<conditional_t<__is_fixed_size_abi_v<_Abi>,
+			       const simd<_Tp, _Abi>&, simd<_Tp, _Abi>>>(__x,
+									 __y,
+									 __z);
+}
+_GLIBCXX_SIMD_CVTING3(hypot)
+
+_GLIBCXX_SIMD_MATH_CALL2_(pow, _Tp)
+
+_GLIBCXX_SIMD_MATH_CALL_(sqrt)
+_GLIBCXX_SIMD_MATH_CALL_(erf)
+_GLIBCXX_SIMD_MATH_CALL_(erfc)
+_GLIBCXX_SIMD_MATH_CALL_(lgamma)
+_GLIBCXX_SIMD_MATH_CALL_(tgamma)
+_GLIBCXX_SIMD_MATH_CALL_(ceil)
+_GLIBCXX_SIMD_MATH_CALL_(floor)
+_GLIBCXX_SIMD_MATH_CALL_(nearbyint)
+_GLIBCXX_SIMD_MATH_CALL_(rint)
+_GLIBCXX_SIMD_MATH_CALL_(lrint)
+_GLIBCXX_SIMD_MATH_CALL_(llrint)
+
+_GLIBCXX_SIMD_MATH_CALL_(round)
+_GLIBCXX_SIMD_MATH_CALL_(lround)
+_GLIBCXX_SIMD_MATH_CALL_(llround)
+
+_GLIBCXX_SIMD_MATH_CALL_(trunc)
+
+_GLIBCXX_SIMD_MATH_CALL2_(fmod, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(remainder, _Tp)
+_GLIBCXX_SIMD_MATH_CALL3_(remquo, _Tp, int*)
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+copysign(const simd<_Tp, _Abi>& __x, const simd<_Tp, _Abi>& __y)
+{
+  using namespace std::experimental::__proposed::float_bitwise_operators;
+  const auto __signmask = -simd<_Tp, _Abi>();
+  return (__x & (__x ^ __signmask)) | (__y & __signmask);
+}
+
+_GLIBCXX_SIMD_MATH_CALL2_(nextafter, _Tp)
+// not covered in [parallel.simd.math]:
+// _GLIBCXX_SIMD_MATH_CALL2_(nexttoward, long double)
+_GLIBCXX_SIMD_MATH_CALL2_(fdim, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(fmax, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(fmin, _Tp)
+
+_GLIBCXX_SIMD_MATH_CALL3_(fma, _Tp, _Tp)
+_GLIBCXX_SIMD_MATH_CALL_(fpclassify)
+_GLIBCXX_SIMD_MATH_CALL_(isfinite)
+
+// isnan and isinf require special treatment because old glibc may declare
+// `int std::isinf(double)`.
+template <typename _Tp, typename _Abi, typename...,
+	  typename _R
+	  = std::experimental::__math_return_type_t<bool, _Tp, _Abi>>
+enable_if_t<std::is_floating_point_v<_Tp>, _R>
+isinf(std::experimental::simd<_Tp, _Abi> __x)
+{
+  return {std::experimental::__private_init,
+	  _Abi::_SimdImpl::__isinf(std::experimental::__data(__x))};
+}
+template <typename _Tp, typename _Abi, typename...,
+	  typename _R
+	  = std::experimental::__math_return_type_t<bool, _Tp, _Abi>>
+enable_if_t<std::is_floating_point_v<_Tp>, _R>
+isnan(std::experimental::simd<_Tp, _Abi> __x)
+{
+  return {std::experimental::__private_init,
+	  _Abi::_SimdImpl::__isnan(std::experimental::__data(__x))};
+}
+_GLIBCXX_SIMD_MATH_CALL_(isnormal)
+
+template <typename..., typename _Tp, typename _Abi>
+std::experimental::simd_mask<_Tp, _Abi>
+signbit(std::experimental::simd<_Tp, _Abi> __x)
+{
+  if constexpr (std::is_integral_v<_Tp>)
+    {
+      if constexpr (std::is_unsigned_v<_Tp>)
+	return std::experimental::simd_mask<_Tp, _Abi>{}; // false
+      else
+	return __x < 0;
+    }
+  else
+    return {std::experimental::__private_init,
+	    _Abi::_SimdImpl::__signbit(std::experimental::__data(__x))};
+}
+
+_GLIBCXX_SIMD_MATH_CALL2_(isgreater, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(isgreaterequal, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(isless, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(islessequal, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(islessgreater, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(isunordered, _Tp)
+
+/* not covered in [parallel.simd.math]
+template <typename _Abi> __doublev<_Abi> nan(const char* tagp);
+template <typename _Abi> __floatv<_Abi> nanf(const char* tagp);
+template <typename _Abi> __ldoublev<_Abi> nanl(const char* tagp);
+
+template <typename _V> struct simd_div_t {
+    _V quot, rem;
+};
+template <typename _Abi>
+simd_div_t<_SCharv<_Abi>> div(_SCharv<_Abi> numer,
+					 _SCharv<_Abi> denom);
+template <typename _Abi>
+simd_div_t<__shortv<_Abi>> div(__shortv<_Abi> numer,
+					 __shortv<_Abi> denom);
+template <typename _Abi>
+simd_div_t<__intv<_Abi>> div(__intv<_Abi> numer, __intv<_Abi> denom);
+template <typename _Abi>
+simd_div_t<__longv<_Abi>> div(__longv<_Abi> numer,
+					__longv<_Abi> denom);
+template <typename _Abi>
+simd_div_t<__llongv<_Abi>> div(__llongv<_Abi> numer,
+					 __llongv<_Abi> denom);
+*/
+
+// special math {{{
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+assoc_laguerre(const std::experimental::fixed_size_simd<
+		 unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __n,
+	       const std::experimental::fixed_size_simd<
+		 unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __m,
+	       const std::experimental::simd<_Tp, _Abi>& __x)
+{
+  return std::experimental::simd<_Tp, _Abi>([&](auto __i) {
+    return std::assoc_laguerre(__n[__i], __m[__i], __x[__i]);
+  });
+}
+
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+assoc_legendre(const std::experimental::fixed_size_simd<
+		 unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __n,
+	       const std::experimental::fixed_size_simd<
+		 unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __m,
+	       const std::experimental::simd<_Tp, _Abi>& __x)
+{
+  return std::experimental::simd<_Tp, _Abi>([&](auto __i) {
+    return std::assoc_legendre(__n[__i], __m[__i], __x[__i]);
+  });
+}
+
+_GLIBCXX_SIMD_MATH_CALL2_(beta, _Tp)
+_GLIBCXX_SIMD_MATH_CALL_(comp_ellint_1)
+_GLIBCXX_SIMD_MATH_CALL_(comp_ellint_2)
+_GLIBCXX_SIMD_MATH_CALL2_(comp_ellint_3, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(cyl_bessel_i, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(cyl_bessel_j, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(cyl_bessel_k, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(cyl_neumann, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(ellint_1, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(ellint_2, _Tp)
+_GLIBCXX_SIMD_MATH_CALL3_(ellint_3, _Tp, _Tp)
+_GLIBCXX_SIMD_MATH_CALL_(expint)
+
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+hermite(const std::experimental::fixed_size_simd<
+	  unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __n,
+	const std::experimental::simd<_Tp, _Abi>& __x)
+{
+  return std::experimental::simd<_Tp, _Abi>(
+    [&](auto __i) { return std::hermite(__n[__i], __x[__i]); });
+}
+
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+laguerre(const std::experimental::fixed_size_simd<
+	   unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __n,
+	 const std::experimental::simd<_Tp, _Abi>& __x)
+{
+  return std::experimental::simd<_Tp, _Abi>(
+    [&](auto __i) { return std::laguerre(__n[__i], __x[__i]); });
+}
+
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+legendre(const std::experimental::fixed_size_simd<
+	   unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __n,
+	 const std::experimental::simd<_Tp, _Abi>& __x)
+{
+  return std::experimental::simd<_Tp, _Abi>(
+    [&](auto __i) { return std::legendre(__n[__i], __x[__i]); });
+}
+
+_GLIBCXX_SIMD_MATH_CALL_(riemann_zeta)
+
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+sph_bessel(const std::experimental::fixed_size_simd<
+	     unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __n,
+	   const std::experimental::simd<_Tp, _Abi>& __x)
+{
+  return std::experimental::simd<_Tp, _Abi>(
+    [&](auto __i) { return std::sph_bessel(__n[__i], __x[__i]); });
+}
+
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+sph_legendre(const std::experimental::fixed_size_simd<
+	       unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __l,
+	     const std::experimental::fixed_size_simd<
+	       unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __m,
+	     const std::experimental::simd<_Tp, _Abi>& __theta)
+{
+  return std::experimental::simd<_Tp, _Abi>([&](auto __i) {
+    return std::sph_legendre(__l[__i], __m[__i], __theta[__i]);
+  });
+}
+
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+sph_neumann(const std::experimental::fixed_size_simd<
+	      unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __n,
+	    const std::experimental::simd<_Tp, _Abi>& __x)
+{
+  return std::experimental::simd<_Tp, _Abi>(
+    [&](auto __i) { return std::sph_neumann(__n[__i], __x[__i]); });
+}
+// }}}
+
+#undef _GLIBCXX_SIMD_MATH_CALL_
+#undef _GLIBCXX_SIMD_MATH_CALL2_
+#undef _GLIBCXX_SIMD_MATH_CALL3_
+
+_GLIBCXX_SIMD_END_NAMESPACE
+
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_MATH_H_
+
+// vim: foldmethod=marker sw=2 ts=8 noet sts=2
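Note on the two-argument __hypot fast path in simd_math.h: the exponent
scaling can be sketched in scalar form as below. This is only an
illustration under stated assumptions (normal, finite binary64 inputs); the
name hypot_scaled is hypothetical and not part of the patch, memcpy stands in
for the float_bitwise_operators applied to simd objects, and h1 is computed
as hi * scale instead of by masking the mantissa.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical scalar helper (not in the patch). Assumes x and y are normal,
// finite IEC 559 binary64 values; subnormals, zero, inf and NaN are not handled.
inline double hypot_scaled(double x, double y)
{
  const double hi = std::fmax(std::fabs(x), std::fabs(y));
  const double lo = std::fmin(std::fabs(x), std::fabs(y));
  const double inf = std::numeric_limits<double>::infinity();

  std::uint64_t hi_bits, inf_bits;
  std::memcpy(&hi_bits, &hi, sizeof hi);
  std::memcpy(&inf_bits, &inf, sizeof inf);

  // hi & inf keeps only the exponent field: hi rounded down to a power of 2.
  const std::uint64_t exp_bits = hi_bits & inf_bits;
  // xor with inf inverts the exponent but is off by one; * .5 corrects that,
  // giving exactly 1 / hi_exp without using the divider.
  const std::uint64_t inv_bits = exp_bits ^ inf_bits;

  double hi_exp, scale;
  std::memcpy(&hi_exp, &exp_bits, sizeof hi_exp);
  std::memcpy(&scale, &inv_bits, sizeof scale);
  scale *= 0.5;

  const double h1 = hi * scale; // exact, in [1, 2)
  const double l1 = lo * scale; // exact unless it underflows, in [0, 2)
  // sqrt(x^2 + y^2) = e * sqrt((x/e)^2 + (y/e)^2); the sqrt argument is < 8.
  return hi_exp * std::sqrt(h1 * h1 + l1 * l1);
}
```

With inputs near 1e200 the squared terms stay below 8, so the argument to
sqrt cannot overflow even though x * x would.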
diff --git a/libstdc++-v3/include/experimental/bits/simd_neon.h b/libstdc++-v3/include/experimental/bits/simd_neon.h
new file mode 100644
index 00000000000..efff0150b8a
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_neon.h
@@ -0,0 +1,466 @@
+// Simd NEON specific implementations -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_NEON_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_NEON_H_
+
+#if __cplusplus >= 201703L
+
+#if !_GLIBCXX_SIMD_HAVE_NEON
+#error "simd_neon.h may only be included when NEON on ARM is available"
+#endif
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+
+// _CommonImplNeon {{{
+struct _CommonImplNeon : _CommonImplBuiltin
+{
+  // __store {{{
+  using _CommonImplBuiltin::__store;
+
+  // }}}
+};
+
+// }}}
+// _SimdImplNeon {{{
+template <typename _Abi> struct _SimdImplNeon : _SimdImplBuiltin<_Abi>
+{
+  using _Base = _SimdImplBuiltin<_Abi>;
+  template <typename _Tp> static constexpr size_t _S_max_store_size = 16;
+
+  // __masked_load {{{
+  template <typename _Tp, size_t _Np, typename _Up, typename _Fp>
+  static inline _SimdWrapper<_Tp, _Np>
+  __masked_load(_SimdWrapper<_Tp, _Np> __merge, _SimdWrapper<_Tp, _Np> __k,
+		const _Up* __mem, _Fp) noexcept
+  {
+    __execute_n_times<_Np>([&](auto __i) {
+      if (__k[__i] != 0)
+	__merge.__set(__i, static_cast<_Tp>(__mem[__i]));
+    });
+    return __merge;
+  }
+
+  // }}}
+  // __masked_store_nocvt {{{
+  template <typename _Tp, std::size_t _Np, typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_store_nocvt(_SimdWrapper<_Tp, _Np> __v, _Tp* __mem, _Fp,
+		       _SimdWrapper<_Tp, _Np> __k)
+  {
+    __execute_n_times<_Np>([&](auto __i) {
+      if (__k[__i] != 0)
+	__mem[__i] = __v[__i];
+    });
+  }
+
+  // }}}
+  // __reduce {{{
+  template <typename _Tp, typename _BinaryOperation>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __reduce(simd<_Tp, _Abi> __x,
+					      _BinaryOperation&& __binary_op)
+  {
+    constexpr size_t _Np = __x.size();
+    if constexpr (sizeof(__x) == 16 && _Np >= 4 && !_Abi::_S_is_partial)
+      {
+	const auto __halves = split<simd<_Tp, simd_abi::_Neon<8>>>(__x);
+	const auto __y = __binary_op(__halves[0], __halves[1]);
+	return _SimdImplNeon<simd_abi::_Neon<8>>::__reduce(
+	  __y, static_cast<_BinaryOperation&&>(__binary_op));
+      }
+    else if constexpr (_Np == 8)
+      {
+	__x = __binary_op(__x, _Base::template __make_simd<_Tp, _Np>(
+				 __vector_permute<1, 0, 3, 2, 5, 4, 7, 6>(
+				   __x._M_data)));
+	__x = __binary_op(__x, _Base::template __make_simd<_Tp, _Np>(
+				 __vector_permute<3, 2, 1, 0, 7, 6, 5, 4>(
+				   __x._M_data)));
+	__x = __binary_op(__x, _Base::template __make_simd<_Tp, _Np>(
+				 __vector_permute<7, 6, 5, 4, 3, 2, 1, 0>(
+				   __x._M_data)));
+	return __x[0];
+      }
+    else if constexpr (_Np == 4)
+      {
+	__x = __binary_op(__x, _Base::template __make_simd<_Tp, _Np>(
+				 __vector_permute<1, 0, 3, 2>(__x._M_data)));
+	__x = __binary_op(__x, _Base::template __make_simd<_Tp, _Np>(
+				 __vector_permute<3, 2, 1, 0>(__x._M_data)));
+	return __x[0];
+      }
+    else if constexpr (_Np == 2)
+      {
+	__x = __binary_op(__x, _Base::template __make_simd<_Tp, _Np>(
+				 __vector_permute<1, 0>(__x._M_data)));
+	return __x[0];
+      }
+    else
+      return _Base::__reduce(__x, static_cast<_BinaryOperation&&>(__binary_op));
+  }
+
+  // }}}
+  // math {{{
+  // __sqrt {{{
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __sqrt(_Tp __x)
+  {
+    if constexpr (__have_neon_a64)
+      {
+	const auto __intrin = __to_intrin(__x);
+	if constexpr (_TVT::template __is<float, 2>)
+	  return vsqrt_f32(__intrin);
+	else if constexpr (_TVT::template __is<float, 4>)
+	  return vsqrtq_f32(__intrin);
+	else if constexpr (_TVT::template __is<double, 1>)
+	  return vsqrt_f64(__intrin);
+	else if constexpr (_TVT::template __is<double, 2>)
+	  return vsqrtq_f64(__intrin);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      return _Base::__sqrt(__x);
+  } // }}}
+  // __trunc {{{
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __trunc(_Tp __x)
+  {
+    if constexpr (__have_neon_a32)
+      {
+	const auto __intrin = __to_intrin(__x);
+	if constexpr (_TVT::template __is<float, 2>)
+	  return vrnd_f32(__intrin);
+	else if constexpr (_TVT::template __is<float, 4>)
+	  return vrndq_f32(__intrin);
+	else if constexpr (_TVT::template __is<double, 1>)
+	  return vrnd_f64(__intrin);
+	else if constexpr (_TVT::template __is<double, 2>)
+	  return vrndq_f64(__intrin);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      return _Base::__trunc(__x);
+  } // }}}
+  // __floor {{{
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __floor(_Tp __x)
+  {
+    if constexpr (__have_neon_a32)
+      {
+	const auto __intrin = __to_intrin(__x);
+	if constexpr (_TVT::template __is<float, 2>)
+	  return vrndm_f32(__intrin);
+	else if constexpr (_TVT::template __is<float, 4>)
+	  return vrndmq_f32(__intrin);
+	else if constexpr (_TVT::template __is<double, 1>)
+	  return vrndm_f64(__intrin);
+	else if constexpr (_TVT::template __is<double, 2>)
+	  return vrndmq_f64(__intrin);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      return _Base::__floor(__x);
+  } // }}}
+  // __ceil {{{
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __ceil(_Tp __x)
+  {
+    if constexpr (__have_neon_a32)
+      {
+	const auto __intrin = __to_intrin(__x);
+	if constexpr (_TVT::template __is<float, 2>)
+	  return vrndp_f32(__intrin);
+	else if constexpr (_TVT::template __is<float, 4>)
+	  return vrndpq_f32(__intrin);
+	else if constexpr (_TVT::template __is<double, 1>)
+	  return vrndp_f64(__intrin);
+	else if constexpr (_TVT::template __is<double, 2>)
+	  return vrndpq_f64(__intrin);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      return _Base::__ceil(__x);
+  } //}}}
+  //}}}
+}; // }}}
+// _MaskImplNeonMixin {{{
+struct _MaskImplNeonMixin
+{
+  using _Base = _MaskImplBuiltinMixin;
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SanitizedBitMask<_Np>
+  __to_bits(_SimdWrapper<_Tp, _Np> __x)
+  {
+    if (__builtin_is_constant_evaluated())
+      return _Base::__to_bits(__x);
+
+    using _I = __int_for_sizeof_t<_Tp>;
+    if constexpr (sizeof(__x) == 16)
+      {
+	auto __asint = __vector_bitcast<_I>(__x);
+#ifdef __aarch64__
+	[[maybe_unused]] constexpr auto __zero = decltype(__asint)();
+#else
+	[[maybe_unused]] constexpr auto __zero = decltype(__lo64(__asint))();
+#endif
+	if constexpr (sizeof(_Tp) == 1)
+	  {
+	    constexpr auto __bitsel
+	      = __generate_from_n_evaluations<16, __vector_type_t<_I, 16>>(
+		[&](auto __i) {
+		  return static_cast<_I>(
+		    __i < _Np ? (__i < 8 ? 1 << __i : 1 << (__i - 8)) : 0);
+		});
+	    __asint &= __bitsel;
+#ifdef __aarch64__
+	    return __vector_bitcast<_UShort>(
+	      vpaddq_s8(vpaddq_s8(vpaddq_s8(__asint, __zero), __zero),
+			__zero))[0];
+#else
+	    return __vector_bitcast<_UShort>(
+	      vpadd_s8(vpadd_s8(vpadd_s8(__lo64(__asint), __hi64(__asint)),
+				__zero),
+		       __zero))[0];
+#endif
+	  }
+	else if constexpr (sizeof(_Tp) == 2)
+	  {
+	    constexpr auto __bitsel
+	      = __generate_from_n_evaluations<8, __vector_type_t<_I, 8>>(
+		[&](auto __i) {
+		  return static_cast<_I>(__i < _Np ? 1 << __i : 0);
+		});
+	    __asint &= __bitsel;
+#ifdef __aarch64__
+	    return vpaddq_s16(vpaddq_s16(vpaddq_s16(__asint, __zero), __zero),
+			      __zero)[0];
+#else
+	    return vpadd_s16(
+	      vpadd_s16(vpadd_s16(__lo64(__asint), __hi64(__asint)), __zero),
+	      __zero)[0];
+#endif
+	  }
+	else if constexpr (sizeof(_Tp) == 4)
+	  {
+	    constexpr auto __bitsel
+	      = __generate_from_n_evaluations<4, __vector_type_t<_I, 4>>(
+		[&](auto __i) {
+		  return static_cast<_I>(__i < _Np ? 1 << __i : 0);
+		});
+	    __asint &= __bitsel;
+#ifdef __aarch64__
+	    return vpaddq_s32(vpaddq_s32(__asint, __zero), __zero)[0];
+#else
+	    return vpadd_s32(vpadd_s32(__lo64(__asint), __hi64(__asint)),
+			     __zero)[0];
+#endif
+	  }
+	else if constexpr (sizeof(_Tp) == 8)
+	  return (__asint[0] & 1) | (__asint[1] & 2);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (sizeof(__x) == 8)
+      {
+	auto __asint = __vector_bitcast<_I>(__x);
+	[[maybe_unused]] constexpr auto __zero = decltype(__asint)();
+	if constexpr (sizeof(_Tp) == 1)
+	  {
+	    constexpr auto __bitsel
+	      = __generate_from_n_evaluations<8, __vector_type_t<_I, 8>>(
+		[&](auto __i) {
+		  return static_cast<_I>(__i < _Np ? 1 << __i : 0);
+		});
+	    __asint &= __bitsel;
+	    return vpadd_s8(vpadd_s8(vpadd_s8(__asint, __zero), __zero),
+			    __zero)[0];
+	  }
+	else if constexpr (sizeof(_Tp) == 2)
+	  {
+	    constexpr auto __bitsel
+	      = __generate_from_n_evaluations<4, __vector_type_t<_I, 4>>(
+		[&](auto __i) {
+		  return static_cast<_I>(__i < _Np ? 1 << __i : 0);
+		});
+	    __asint &= __bitsel;
+	    return vpadd_s16(vpadd_s16(__asint, __zero), __zero)[0];
+	  }
+	else if constexpr (sizeof(_Tp) == 4)
+	  {
+	    __asint &= __make_vector<_I>(0x1, 0x2);
+	    return vpadd_s32(__asint, __zero)[0];
+	  }
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      return _Base::__to_bits(__x);
+  }
+};
+
+// }}}
+// _MaskImplNeon {{{
+template <typename _Abi>
+struct _MaskImplNeon : _MaskImplNeonMixin, _MaskImplBuiltin<_Abi>
+{
+  using _MaskImplBuiltinMixin::__to_maskvector;
+  using _MaskImplNeonMixin::__to_bits;
+  using _Base = _MaskImplBuiltin<_Abi>;
+  using _Base::__convert;
+
+  // __all_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __all_of(simd_mask<_Tp, _Abi> __k)
+  {
+    const auto __kk
+      = __vector_bitcast<char>(__k._M_data)
+	| ~__vector_bitcast<char>(_Abi::template __implicit_mask<_Tp>());
+    if constexpr (sizeof(__k) == 16)
+      {
+	const auto __x = __vector_bitcast<long long>(__kk);
+	return __x[0] + __x[1] == -2;
+      }
+    else if constexpr (sizeof(__k) <= 8)
+      return __bit_cast<__int_for_sizeof_t<decltype(__kk)>>(__kk) == -1;
+    else
+      __assert_unreachable<_Tp>();
+  }
+
+  // }}}
+  // __any_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __any_of(simd_mask<_Tp, _Abi> __k)
+  {
+    const auto __kk
+      = __vector_bitcast<char>(__k._M_data)
+	& __vector_bitcast<char>(_Abi::template __implicit_mask<_Tp>());
+    if constexpr (sizeof(__k) == 16)
+      {
+	const auto __x = __vector_bitcast<long long>(__kk);
+	return (__x[0] | __x[1]) != 0;
+      }
+    else if constexpr (sizeof(__k) <= 8)
+      return __bit_cast<__int_for_sizeof_t<decltype(__kk)>>(__kk) != 0;
+    else
+      __assert_unreachable<_Tp>();
+  }
+
+  // }}}
+  // __none_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __none_of(simd_mask<_Tp, _Abi> __k)
+  {
+    const auto __kk
+      = __vector_bitcast<char>(__k._M_data)
+	& __vector_bitcast<char>(_Abi::template __implicit_mask<_Tp>());
+    if constexpr (sizeof(__k) == 16)
+      {
+	const auto __x = __vector_bitcast<long long>(__kk);
+	return (__x[0] | __x[1]) == 0;
+      }
+    else if constexpr (sizeof(__k) <= 8)
+      return __bit_cast<__int_for_sizeof_t<decltype(__kk)>>(__kk) == 0;
+    else
+      __assert_unreachable<_Tp>();
+  }
+
+  // }}}
+  // __some_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __some_of(simd_mask<_Tp, _Abi> __k)
+  {
+    if constexpr (sizeof(__k) <= 8)
+      {
+	const auto __kk
+	  = __vector_bitcast<char>(__k._M_data)
+	    | ~__vector_bitcast<char>(_Abi::template __implicit_mask<_Tp>());
+	using _Up = std::make_unsigned_t<__int_for_sizeof_t<decltype(__kk)>>;
+	return __bit_cast<_Up>(__kk) + 1 > 1;
+      }
+    else
+      return _Base::__some_of(__k);
+  }
+
+  // }}}
+  // __popcount {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static int __popcount(simd_mask<_Tp, _Abi> __k)
+  {
+    if constexpr (sizeof(_Tp) == 1)
+      {
+	const auto __s8 = __vector_bitcast<_SChar>(__k._M_data);
+	int8x8_t __tmp = __lo64(__s8) + __hi64z(__s8);
+	return -vpadd_s8(vpadd_s8(vpadd_s8(__tmp, int8x8_t()), int8x8_t()),
+			 int8x8_t())[0];
+      }
+    else if constexpr (sizeof(_Tp) == 2)
+      {
+	const auto __s16 = __vector_bitcast<short>(__k._M_data);
+	int16x4_t __tmp = __lo64(__s16) + __hi64z(__s16);
+	return -vpadd_s16(vpadd_s16(__tmp, int16x4_t()), int16x4_t())[0];
+      }
+    else if constexpr (sizeof(_Tp) == 4)
+      {
+	const auto __s32 = __vector_bitcast<int>(__k._M_data);
+	int32x2_t __tmp = __lo64(__s32) + __hi64z(__s32);
+	return -vpadd_s32(__tmp, int32x2_t())[0];
+      }
+    else if constexpr (sizeof(_Tp) == 8)
+      {
+	static_assert(sizeof(__k) == 16);
+	const auto __s64 = __vector_bitcast<long long>(__k._M_data);
+	return -(__s64[0] + __s64[1]);
+      }
+  }
+
+  // }}}
+  // __find_first_set {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static int __find_first_set(simd_mask<_Tp, _Abi> __k)
+  {
+    // TODO: the _Base implementation is not optimal for NEON
+    return _Base::__find_first_set(__k);
+  }
+
+  // }}}
+  // __find_last_set {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static int __find_last_set(simd_mask<_Tp, _Abi> __k)
+  {
+    // TODO: the _Base implementation is not optimal for NEON
+    return _Base::__find_last_set(__k);
+  }
+
+  // }}}
+}; // }}}
+
+_GLIBCXX_SIMD_END_NAMESPACE
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_NEON_H_
+// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
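Note on _MaskImplNeonMixin::__to_bits in simd_neon.h: the idea is to AND each
all-ones/all-zeros lane with a per-lane bit selector (1 << i) and then add
the lanes horizontally, which the patch does with chained vpadd intrinsics.
A portable model of that idea, using no NEON intrinsics, might look like the
following. The name mask_to_bits and the choice of 16-bit lanes are
assumptions made for illustration only.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical portable model (not in the patch): each lane is 0 or -1, as in
// a vector mask. Returns one bit per lane, lane 0 in the least significant bit.
template <std::size_t N>
unsigned mask_to_bits(const std::array<std::int16_t, N>& lanes)
{
  static_assert(N <= 8, "this sketch models at most 8 lanes of 16 bits");
  unsigned sum = 0;
  for (std::size_t i = 0; i < N; ++i)
    {
      // bit selector for lane i, corresponding to __bitsel in the patch
      const unsigned bitsel = 1u << i;
      // the AND leaves either 0 or exactly bit i; since every lane
      // contributes a distinct bit, the horizontal add acts as a bitwise OR
      sum += static_cast<std::uint16_t>(lanes[i]) & bitsel;
    }
  return sum;
}
```

Because each lane contributes a distinct bit, the additions never carry,
which is why a pairwise-add reduction is sufficient.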
diff --git a/libstdc++-v3/include/experimental/bits/simd_scalar.h b/libstdc++-v3/include/experimental/bits/simd_scalar.h
new file mode 100644
index 00000000000..fc4ffe12298
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_scalar.h
@@ -0,0 +1,877 @@
+// Simd scalar ABI specific implementations -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_SCALAR_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_SCALAR_H_
+#if __cplusplus >= 201703L
+
+#include <cmath>
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+
+// __promote_preserving_unsigned{{{
+// work around crazy semantics of unsigned integers of lower rank than int:
+// before an operator is applied, the operands are promoted to int, so
+// over- or underflow is UB even though the operand types were unsigned.
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr decltype(auto)
+__promote_preserving_unsigned(const _Tp& __x)
+{
+  if constexpr (std::is_signed_v<decltype(+__x)> && std::is_unsigned_v<_Tp>)
+    return static_cast<unsigned int>(__x);
+  else
+    return __x;
+}
+
+// }}}
+
+struct _CommonImplScalar;
+struct _CommonImplBuiltin;
+struct _SimdImplScalar;
+struct _MaskImplScalar;
+// simd_abi::_Scalar {{{
+struct simd_abi::_Scalar
+{
+  template <typename _Tp> static constexpr size_t size = 1;
+  template <typename _Tp> static constexpr size_t _S_full_size = 1;
+  static constexpr bool _S_is_partial = false;
+  struct _IsValidAbiTag : true_type
+  {
+  };
+  template <typename _Tp> struct _IsValidSizeFor : true_type
+  {
+  };
+  template <typename _Tp> struct _IsValid : __is_vectorizable<_Tp>
+  {
+  };
+  template <typename _Tp>
+  static constexpr bool _S_is_valid_v = _IsValid<_Tp>::value;
+
+  _GLIBCXX_SIMD_INTRINSIC static constexpr bool __masked(bool __x)
+  {
+    return __x;
+  }
+
+  using _CommonImpl = _CommonImplScalar;
+  using _SimdImpl = _SimdImplScalar;
+  using _MaskImpl = _MaskImplScalar;
+
+  template <typename _Tp, bool = _S_is_valid_v<_Tp>>
+  struct __traits : _InvalidTraits
+  {
+  };
+
+  template <typename _Tp> struct __traits<_Tp, true>
+  {
+    using _IsValid = true_type;
+    using _SimdImpl = _SimdImplScalar;
+    using _MaskImpl = _MaskImplScalar;
+    using _SimdMember = _Tp;
+    using _MaskMember = bool;
+    static constexpr size_t _S_simd_align = alignof(_SimdMember);
+    static constexpr size_t _S_mask_align = alignof(_MaskMember);
+
+    // nothing the user can spell converts to/from simd/simd_mask
+    struct _SimdCastType
+    {
+      _SimdCastType() = delete;
+    };
+    struct _MaskCastType
+    {
+      _MaskCastType() = delete;
+    };
+    struct _SimdBase
+    {
+    };
+    struct _MaskBase
+    {
+    };
+  };
+};
+// }}}
+// _CommonImplScalar {{{
+struct _CommonImplScalar
+{
+  // __store {{{
+  template <typename _Flags, typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static void __store(_Tp __x, void* __addr, _Flags)
+  {
+    __builtin_memcpy(__addr, &__x, sizeof(_Tp));
+  }
+
+  // }}}
+  // __store_bool_array(_BitMask) {{{
+  template <size_t _Np, typename _Flags, bool _Sanitized>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr void
+  __store_bool_array(_BitMask<_Np, _Sanitized> __x, bool* __mem, _Flags)
+  {
+    __make_dependent_t<_Flags, _CommonImplBuiltin>::__store_bool_array(__x, __mem,
+                                                                       _Flags());
+  }
+
+  // }}}
+};
+
+// }}}
+// _SimdImplScalar {{{
+struct _SimdImplScalar
+{
+  // member types {{{2
+  using abi_type = simd_abi::scalar;
+  template <typename _Tp> using _TypeTag = _Tp*;
+
+  // broadcast {{{2
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp __broadcast(_Tp __x) noexcept
+  {
+    return __x;
+  }
+
+  // __generator {{{2
+  template <typename _Fp, typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp __generator(_Fp&& __gen,
+							   _TypeTag<_Tp>)
+  {
+    return __gen(_SizeConstant<0>());
+  }
+
+  // __load {{{2
+  template <typename _Tp, typename _Up, typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __load(const _Up* __mem, _Fp,
+					    _TypeTag<_Tp>) noexcept
+  {
+    return static_cast<_Tp>(__mem[0]);
+  }
+
+  // __masked_load {{{2
+  template <typename _Tp, typename _Up, typename _Fp>
+  static inline _Tp __masked_load(_Tp __merge, bool __k, const _Up* __mem,
+				  _Fp) noexcept
+  {
+    if (__k)
+      __merge = static_cast<_Tp>(__mem[0]);
+    return __merge;
+  }
+
+  // __store {{{2
+  template <typename _Tp, typename _Up, typename _Fp>
+  static inline void __store(_Tp __v, _Up* __mem, _Fp, _TypeTag<_Tp>) noexcept
+  {
+    __mem[0] = static_cast<_Up>(__v);
+  }
+
+  // __masked_store {{{2
+  template <typename _Tp, typename _Up, typename _Fp>
+  static inline void __masked_store(const _Tp __v, _Up* __mem, _Fp,
+				    const bool __k) noexcept
+  {
+    if (__k)
+      __mem[0] = __v;
+  }
+
+  // __negate {{{2
+  template <typename _Tp>
+  static constexpr inline bool __negate(_Tp __x) noexcept
+  {
+    return !__x;
+  }
+
+  // __reduce {{{2
+  template <typename _Tp, typename _BinaryOperation>
+  static constexpr inline _Tp __reduce(const simd<_Tp, simd_abi::scalar>& __x,
+				       _BinaryOperation&)
+  {
+    return __x._M_data;
+  }
+
+  // __min, __max {{{2
+  template <typename _Tp>
+  static constexpr inline _Tp __min(const _Tp __a, const _Tp __b)
+  {
+    return std::min(__a, __b);
+  }
+
+  template <typename _Tp>
+  static constexpr inline _Tp __max(const _Tp __a, const _Tp __b)
+  {
+    return std::max(__a, __b);
+  }
+
+  // __complement {{{2
+  template <typename _Tp>
+  static constexpr inline _Tp __complement(_Tp __x) noexcept
+  {
+    return static_cast<_Tp>(~__x);
+  }
+
+  // __unary_minus {{{2
+  template <typename _Tp>
+  static constexpr inline _Tp __unary_minus(_Tp __x) noexcept
+  {
+    return static_cast<_Tp>(-__x);
+  }
+
+  // arithmetic operators {{{2
+  template <typename _Tp> static constexpr inline _Tp __plus(_Tp __x, _Tp __y)
+  {
+    return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+			    + __promote_preserving_unsigned(__y));
+  }
+
+  template <typename _Tp> static constexpr inline _Tp __minus(_Tp __x, _Tp __y)
+  {
+    return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+			    - __promote_preserving_unsigned(__y));
+  }
+
+  template <typename _Tp>
+  static constexpr inline _Tp __multiplies(_Tp __x, _Tp __y)
+  {
+    return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+			    * __promote_preserving_unsigned(__y));
+  }
+
+  template <typename _Tp>
+  static constexpr inline _Tp __divides(_Tp __x, _Tp __y)
+  {
+    return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+			    / __promote_preserving_unsigned(__y));
+  }
+
+  template <typename _Tp>
+  static constexpr inline _Tp __modulus(_Tp __x, _Tp __y)
+  {
+    return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+			    % __promote_preserving_unsigned(__y));
+  }
+
+  template <typename _Tp>
+  static constexpr inline _Tp __bit_and(_Tp __x, _Tp __y)
+  {
+    if constexpr (is_floating_point_v<_Tp>)
+      {
+	using _I = __int_for_sizeof_t<_Tp>;
+	const _I __r = reinterpret_cast<const __may_alias<_I>&>(__x)
+		       & reinterpret_cast<const __may_alias<_I>&>(__y);
+	return reinterpret_cast<const __may_alias<_Tp>&>(__r);
+      }
+    else
+      return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+			      & __promote_preserving_unsigned(__y));
+  }
+
+  template <typename _Tp> static constexpr inline _Tp __bit_or(_Tp __x, _Tp __y)
+  {
+    if constexpr (is_floating_point_v<_Tp>)
+      {
+	using _I = __int_for_sizeof_t<_Tp>;
+	const _I __r = reinterpret_cast<const __may_alias<_I>&>(__x)
+		       | reinterpret_cast<const __may_alias<_I>&>(__y);
+	return reinterpret_cast<const __may_alias<_Tp>&>(__r);
+      }
+    else
+      return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+			      | __promote_preserving_unsigned(__y));
+  }
+
+  template <typename _Tp>
+  static constexpr inline _Tp __bit_xor(_Tp __x, _Tp __y)
+  {
+    if constexpr (is_floating_point_v<_Tp>)
+      {
+	using _I = __int_for_sizeof_t<_Tp>;
+	const _I __r = reinterpret_cast<const __may_alias<_I>&>(__x)
+		       ^ reinterpret_cast<const __may_alias<_I>&>(__y);
+	return reinterpret_cast<const __may_alias<_Tp>&>(__r);
+      }
+    else
+      return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+			      ^ __promote_preserving_unsigned(__y));
+  }
+
+  template <typename _Tp>
+  static constexpr inline _Tp __bit_shift_left(_Tp __x, int __y)
+  {
+    return static_cast<_Tp>(__promote_preserving_unsigned(__x) << __y);
+  }
+
+  template <typename _Tp>
+  static constexpr inline _Tp __bit_shift_right(_Tp __x, int __y)
+  {
+    return static_cast<_Tp>(__promote_preserving_unsigned(__x) >> __y);
+  }
+
+  // math {{{2
+  // frexp, modf and copysign are implemented in simd_math.h
+  template <typename _Tp> using _ST = _SimdTuple<_Tp, simd_abi::scalar>;
+
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __acos(_Tp __x)
+  {
+    return std::acos(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __asin(_Tp __x)
+  {
+    return std::asin(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __atan(_Tp __x)
+  {
+    return std::atan(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __cos(_Tp __x)
+  {
+    return std::cos(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __sin(_Tp __x)
+  {
+    return std::sin(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __tan(_Tp __x)
+  {
+    return std::tan(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __acosh(_Tp __x)
+  {
+    return std::acosh(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __asinh(_Tp __x)
+  {
+    return std::asinh(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __atanh(_Tp __x)
+  {
+    return std::atanh(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __cosh(_Tp __x)
+  {
+    return std::cosh(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __sinh(_Tp __x)
+  {
+    return std::sinh(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __tanh(_Tp __x)
+  {
+    return std::tanh(__x);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __atan2(_Tp __x, _Tp __y)
+  {
+    return std::atan2(__x, __y);
+  }
+
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __exp(_Tp __x)
+  {
+    return std::exp(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __exp2(_Tp __x)
+  {
+    return std::exp2(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __expm1(_Tp __x)
+  {
+    return std::expm1(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __log(_Tp __x)
+  {
+    return std::log(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __log10(_Tp __x)
+  {
+    return std::log10(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __log1p(_Tp __x)
+  {
+    return std::log1p(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __log2(_Tp __x)
+  {
+    return std::log2(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __logb(_Tp __x)
+  {
+    return std::logb(__x);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _ST<int> __ilogb(_Tp __x)
+  {
+    return {std::ilogb(__x)};
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __pow(_Tp __x, _Tp __y)
+  {
+    return std::pow(__x, __y);
+  }
+
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __abs(_Tp __x)
+  {
+    return std::abs(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __fabs(_Tp __x)
+  {
+    return std::fabs(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __sqrt(_Tp __x)
+  {
+    return std::sqrt(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __cbrt(_Tp __x)
+  {
+    return std::cbrt(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __erf(_Tp __x)
+  {
+    return std::erf(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __erfc(_Tp __x)
+  {
+    return std::erfc(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __lgamma(_Tp __x)
+  {
+    return std::lgamma(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __tgamma(_Tp __x)
+  {
+    return std::tgamma(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __trunc(_Tp __x)
+  {
+    return std::trunc(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __floor(_Tp __x)
+  {
+    return std::floor(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __ceil(_Tp __x)
+  {
+    return std::ceil(__x);
+  }
+
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __nearbyint(_Tp __x)
+  {
+    return std::nearbyint(__x);
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __rint(_Tp __x)
+  {
+    return std::rint(__x);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _ST<long> __lrint(_Tp __x)
+  {
+    return {std::lrint(__x)};
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _ST<long long> __llrint(_Tp __x)
+  {
+    return {std::llrint(__x)};
+  }
+  template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __round(_Tp __x)
+  {
+    return std::round(__x);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _ST<long> __lround(_Tp __x)
+  {
+    return {std::lround(__x)};
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _ST<long long> __llround(_Tp __x)
+  {
+    return {std::llround(__x)};
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __ldexp(_Tp __x, _ST<int> __y)
+  {
+    return std::ldexp(__x, __y.first);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __scalbn(_Tp __x, _ST<int> __y)
+  {
+    return std::scalbn(__x, __y.first);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __scalbln(_Tp __x, _ST<long> __y)
+  {
+    return std::scalbln(__x, __y.first);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __fmod(_Tp __x, _Tp __y)
+  {
+    return std::fmod(__x, __y);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __remainder(_Tp __x, _Tp __y)
+  {
+    return std::remainder(__x, __y);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __nextafter(_Tp __x, _Tp __y)
+  {
+    return std::nextafter(__x, __y);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __fdim(_Tp __x, _Tp __y)
+  {
+    return std::fdim(__x, __y);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __fmax(_Tp __x, _Tp __y)
+  {
+    return std::fmax(__x, __y);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __fmin(_Tp __x, _Tp __y)
+  {
+    return std::fmin(__x, __y);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __fma(_Tp __x, _Tp __y, _Tp __z)
+  {
+    return std::fma(__x, __y, __z);
+  }
+
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __remquo(_Tp __x, _Tp __y, _ST<int>* __z)
+  {
+    return std::remquo(__x, __y, &__z->first);
+  }
+  template <typename _Tp>
+  [[deprecated]] _GLIBCXX_SIMD_INTRINSIC static _Tp __remquo(_Tp __x, _Tp __y,
+							     int* __z)
+  {
+    return std::remquo(__x, __y, __z);
+  }
+
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static _ST<int> __fpclassify(_Tp __x)
+  {
+    return {std::fpclassify(__x)};
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isfinite(_Tp __x)
+  {
+    return std::isfinite(__x);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isinf(_Tp __x)
+  {
+    return std::isinf(__x);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isnan(_Tp __x)
+  {
+    return std::isnan(__x);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isnormal(_Tp __x)
+  {
+    return std::isnormal(__x);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __signbit(_Tp __x)
+  {
+    return std::signbit(__x);
+  }
+
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isgreater(_Tp __x, _Tp __y)
+  {
+    return std::isgreater(__x, __y);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isgreaterequal(_Tp __x, _Tp __y)
+  {
+    return std::isgreaterequal(__x, __y);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isless(_Tp __x, _Tp __y)
+  {
+    return std::isless(__x, __y);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __islessequal(_Tp __x, _Tp __y)
+  {
+    return std::islessequal(__x, __y);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __islessgreater(_Tp __x, _Tp __y)
+  {
+    return std::islessgreater(__x, __y);
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isunordered(_Tp __x, _Tp __y)
+  {
+    return std::isunordered(__x, __y);
+  }
+
+  // __increment & __decrement{{{2
+  template <typename _Tp> constexpr static inline void __increment(_Tp& __x)
+  {
+    ++__x;
+  }
+  template <typename _Tp> constexpr static inline void __decrement(_Tp& __x)
+  {
+    --__x;
+  }
+
+  // compares {{{2
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __equal_to(_Tp __x, _Tp __y)
+  {
+    return __x == __y;
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __not_equal_to(_Tp __x, _Tp __y)
+  {
+    return __x != __y;
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __less(_Tp __x, _Tp __y)
+  {
+    return __x < __y;
+  }
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __less_equal(_Tp __x, _Tp __y)
+  {
+    return __x <= __y;
+  }
+
+  // smart_reference access {{{2
+  template <typename _Tp, typename _Up>
+  constexpr static void __set(_Tp& __v, [[maybe_unused]] int __i,
+			      _Up&& __x) noexcept
+  {
+    _GLIBCXX_DEBUG_ASSERT(__i == 0);
+    __v = static_cast<_Up&&>(__x);
+  }
+
+  // __masked_assign {{{2
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static void
+  __masked_assign(bool __k, _Tp& __lhs, _Tp __rhs)
+  {
+    if (__k)
+      __lhs = __rhs;
+  }
+
+  // __masked_cassign {{{2
+  template <typename _Op, typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static void
+  __masked_cassign(const bool __k, _Tp& __lhs, const _Tp __rhs, _Op __op)
+  {
+    if (__k)
+      __lhs = __op(_SimdImplScalar{}, __lhs, __rhs);
+  }
+
+  // __masked_unary {{{2
+  template <template <typename> class _Op, typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static _Tp __masked_unary(const bool __k,
+							      const _Tp __v)
+  {
+    return static_cast<_Tp>(__k ? _Op<_Tp>{}(__v) : __v);
+  }
+
+  // }}}2
+};
+
+// }}}
+// _MaskImplScalar {{{
+struct _MaskImplScalar
+{
+  // member types {{{
+  template <typename _Tp> using _TypeTag = _Tp*;
+
+  // }}}
+  // __broadcast {{{
+  template <typename>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr bool __broadcast(bool __x)
+  {
+    return __x;
+  }
+
+  // }}}
+  // __load {{{
+  template <typename, typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr bool __load(const bool* __mem)
+  {
+    return __mem[0];
+  }
+
+  // }}}
+  // __to_bits {{{
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SanitizedBitMask<1>
+  __to_bits(bool __x)
+  {
+    return __x;
+  }
+
+  // }}}
+  // __convert {{{
+  template <typename _Tp, bool _Sanitized>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr bool
+  __convert(_BitMask<1, _Sanitized> __x)
+  {
+    return __x[0];
+  }
+
+  template <typename _Tp, typename _Up, typename _UAbi>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr bool
+  __convert(simd_mask<_Up, _UAbi> __x)
+  {
+    return __x[0];
+  }
+
+  // }}}
+  // __from_bitmask {{{2
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+  __from_bitmask(_SanitizedBitMask<1> __bits, _TypeTag<_Tp>) noexcept
+  {
+    return __bits[0];
+  }
+
+  // __masked_load {{{2
+  template <typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+  __masked_load(bool __merge, bool __mask, const bool* __mem, _Fp) noexcept
+  {
+    if (__mask)
+      __merge = __mem[0];
+    return __merge;
+  }
+
+  // __store {{{2
+  template <typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static void __store(bool __v, bool* __mem,
+					      _Fp) noexcept
+  {
+    __mem[0] = __v;
+  }
+
+  // __masked_store {{{2
+  template <typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_store(const bool __v, bool* __mem, _Fp, const bool __k) noexcept
+  {
+    if (__k)
+      __mem[0] = __v;
+  }
+
+  // logical and bitwise operators {{{2
+  static constexpr bool __logical_and(bool __x, bool __y) { return __x && __y; }
+  static constexpr bool __logical_or(bool __x, bool __y) { return __x || __y; }
+  static constexpr bool __bit_not(bool __x) { return !__x; }
+  static constexpr bool __bit_and(bool __x, bool __y) { return __x && __y; }
+  static constexpr bool __bit_or(bool __x, bool __y) { return __x || __y; }
+  static constexpr bool __bit_xor(bool __x, bool __y) { return __x != __y; }
+
+  // smart_reference access {{{2
+  constexpr static void __set(bool& __k, [[maybe_unused]] int __i,
+			      bool __x) noexcept
+  {
+    _GLIBCXX_DEBUG_ASSERT(__i == 0);
+    __k = __x;
+  }
+
+  // __masked_assign {{{2
+  _GLIBCXX_SIMD_INTRINSIC static void __masked_assign(bool __k, bool& __lhs,
+						      bool __rhs)
+  {
+    if (__k)
+      __lhs = __rhs;
+  }
+
+  // }}}2
+  // __all_of {{{
+  template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+  __all_of(simd_mask<_Tp, _Abi> __k)
+  {
+    return __k._M_data;
+  }
+
+  // }}}
+  // __any_of {{{
+  template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+  __any_of(simd_mask<_Tp, _Abi> __k)
+  {
+    return __k._M_data;
+  }
+
+  // }}}
+  // __none_of {{{
+  template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+  __none_of(simd_mask<_Tp, _Abi> __k)
+  {
+    return !__k._M_data;
+  }
+
+  // }}}
+  // __some_of {{{
+  template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static bool __some_of(simd_mask<_Tp, _Abi>)
+  {
+    return false;
+  }
+
+  // }}}
+  // __popcount {{{
+  template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static int
+  __popcount(simd_mask<_Tp, _Abi> __k)
+  {
+    return __k._M_data;
+  }
+
+  // }}}
+  // __find_first_set {{{
+  template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static int
+  __find_first_set(simd_mask<_Tp, _Abi>)
+  {
+    return 0;
+  }
+
+  // }}}
+  // __find_last_set {{{
+  template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC constexpr static int
+  __find_last_set(simd_mask<_Tp, _Abi>)
+  {
+    return 0;
+  }
+
+  // }}}
+};
+
+// }}}
+
+_GLIBCXX_SIMD_END_NAMESPACE
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_SCALAR_H_
+
+// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
diff --git a/libstdc++-v3/include/experimental/bits/simd_x86.h b/libstdc++-v3/include/experimental/bits/simd_x86.h
new file mode 100644
index 00000000000..4e15aac8b62
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_x86.h
@@ -0,0 +1,5037 @@
+// Simd x86 specific implementations -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_X86_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_X86_H_
+
+#if __cplusplus >= 201703L
+
+#if !_GLIBCXX_SIMD_X86INTRIN
+#error                                                                         \
+  "simd_x86.h may only be included when MMX or SSE on x86(_64) are available"
+#endif
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+
+// __interleave128_lo {{{
+template <typename _Ap, typename _B, typename _Tp = std::common_type_t<_Ap, _B>,
+	  typename _Trait = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__interleave128_lo(const _Ap& __av, const _B& __bv)
+{
+  const _Tp __a(__av);
+  const _Tp __b(__bv);
+  if constexpr (sizeof(_Tp) == 16 && _Trait::_S_width == 2)
+    return _Tp{__a[0], __b[0]};
+  else if constexpr (sizeof(_Tp) == 16 && _Trait::_S_width == 4)
+    return _Tp{__a[0], __b[0], __a[1], __b[1]};
+  else if constexpr (sizeof(_Tp) == 16 && _Trait::_S_width == 8)
+    return _Tp{__a[0], __b[0], __a[1], __b[1], __a[2], __b[2], __a[3], __b[3]};
+  else if constexpr (sizeof(_Tp) == 16 && _Trait::_S_width == 16)
+    return _Tp{__a[0], __b[0], __a[1], __b[1], __a[2], __b[2], __a[3], __b[3],
+	       __a[4], __b[4], __a[5], __b[5], __a[6], __b[6], __a[7], __b[7]};
+  else if constexpr (sizeof(_Tp) == 32 && _Trait::_S_width == 4)
+    return _Tp{__a[0], __b[0], __a[2], __b[2]};
+  else if constexpr (sizeof(_Tp) == 32 && _Trait::_S_width == 8)
+    return _Tp{__a[0], __b[0], __a[1], __b[1], __a[4], __b[4], __a[5], __b[5]};
+  else if constexpr (sizeof(_Tp) == 32 && _Trait::_S_width == 16)
+    return _Tp{__a[0],  __b[0],  __a[1],  __b[1], __a[2], __b[2],
+	       __a[3],  __b[3],  __a[8],  __b[8], __a[9], __b[9],
+	       __a[10], __b[10], __a[11], __b[11]};
+  else if constexpr (sizeof(_Tp) == 32 && _Trait::_S_width == 32)
+    return _Tp{__a[0],  __b[0],  __a[1],  __b[1],  __a[2],  __b[2],  __a[3],
+	       __b[3],  __a[4],  __b[4],  __a[5],  __b[5],  __a[6],  __b[6],
+	       __a[7],  __b[7],  __a[16], __b[16], __a[17], __b[17], __a[18],
+	       __b[18], __a[19], __b[19], __a[20], __b[20], __a[21], __b[21],
+	       __a[22], __b[22], __a[23], __b[23]};
+  else if constexpr (sizeof(_Tp) == 64 && _Trait::_S_width == 8)
+    return _Tp{__a[0], __b[0], __a[2], __b[2], __a[4], __b[4], __a[6], __b[6]};
+  else if constexpr (sizeof(_Tp) == 64 && _Trait::_S_width == 16)
+    return _Tp{__a[0],  __b[0],  __a[1],  __b[1], __a[4], __b[4],
+	       __a[5],  __b[5],  __a[8],  __b[8], __a[9], __b[9],
+	       __a[12], __b[12], __a[13], __b[13]};
+  else if constexpr (sizeof(_Tp) == 64 && _Trait::_S_width == 32)
+    return _Tp{__a[0],  __b[0],  __a[1],  __b[1],  __a[2],  __b[2],  __a[3],
+	       __b[3],  __a[8],  __b[8],  __a[9],  __b[9],  __a[10], __b[10],
+	       __a[11], __b[11], __a[16], __b[16], __a[17], __b[17], __a[18],
+	       __b[18], __a[19], __b[19], __a[24], __b[24], __a[25], __b[25],
+	       __a[26], __b[26], __a[27], __b[27]};
+  else if constexpr (sizeof(_Tp) == 64 && _Trait::_S_width == 64)
+    return _Tp{__a[0],  __b[0],  __a[1],  __b[1],  __a[2],  __b[2],  __a[3],
+	       __b[3],  __a[4],  __b[4],  __a[5],  __b[5],  __a[6],  __b[6],
+	       __a[7],  __b[7],  __a[16], __b[16], __a[17], __b[17], __a[18],
+	       __b[18], __a[19], __b[19], __a[20], __b[20], __a[21], __b[21],
+	       __a[22], __b[22], __a[23], __b[23], __a[32], __b[32], __a[33],
+	       __b[33], __a[34], __b[34], __a[35], __b[35], __a[36], __b[36],
+	       __a[37], __b[37], __a[38], __b[38], __a[39], __b[39], __a[48],
+	       __b[48], __a[49], __b[49], __a[50], __b[50], __a[51], __b[51],
+	       __a[52], __b[52], __a[53], __b[53], __a[54], __b[54], __a[55],
+	       __b[55]};
+  else
+    __assert_unreachable<_Tp>();
+}
+
+// }}}
+// __is_zero{{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC constexpr bool
+__is_zero(_Tp __a)
+{
+  if (!__builtin_is_constant_evaluated())
+    {
+      if constexpr (__have_avx)
+	{
+	  if constexpr (_TVT::template __is<float, 8>)
+	    return _mm256_testz_ps(__a, __a);
+	  else if constexpr (_TVT::template __is<double, 4>)
+	    return _mm256_testz_pd(__a, __a);
+	  else if constexpr (sizeof(_Tp) == 32)
+	    return _mm256_testz_si256(__to_intrin(__a), __to_intrin(__a));
+	  else if constexpr (_TVT::template __is<float>)
+	    return _mm_testz_ps(__to_intrin(__a), __to_intrin(__a));
+	  else if constexpr (_TVT::template __is<double, 2>)
+	    return _mm_testz_pd(__a, __a);
+	  else
+	    return _mm_testz_si128(__to_intrin(__a), __to_intrin(__a));
+	}
+      else if constexpr (__have_sse4_1)
+	return _mm_testz_si128(__intrin_bitcast<__m128i>(__a),
+			       __intrin_bitcast<__m128i>(__a));
+    }
+  if constexpr (sizeof(_Tp) <= 8)
+    return reinterpret_cast<__int_for_sizeof_t<_Tp>>(__a) == 0;
+  else
+    {
+      const auto __b = __vector_bitcast<_LLong>(__a);
+      if constexpr (sizeof(__b) == 16)
+	return (__b[0] | __b[1]) == 0;
+      else if constexpr (sizeof(__b) == 32)
+	return __is_zero(__lo128(__b) | __hi128(__b));
+      else if constexpr (sizeof(__b) == 64)
+	return __is_zero(__lo256(__b) | __hi256(__b));
+      else
+	__assert_unreachable<_Tp>();
+    }
+}
+// }}}
+// __movemask{{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST int
+__movemask(_Tp __a)
+{
+  if constexpr (sizeof(_Tp) == 32)
+    {
+      if constexpr (_TVT::template __is<float>)
+	return _mm256_movemask_ps(__to_intrin(__a));
+      else if constexpr (_TVT::template __is<double>)
+	return _mm256_movemask_pd(__to_intrin(__a));
+      else
+	return _mm256_movemask_epi8(__to_intrin(__a));
+    }
+  else if constexpr (_TVT::template __is<float>)
+    return _mm_movemask_ps(__to_intrin(__a));
+  else if constexpr (_TVT::template __is<double>)
+    return _mm_movemask_pd(__to_intrin(__a));
+  else
+    return _mm_movemask_epi8(__to_intrin(__a));
+}
+
+// }}}
+// __testz{{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST constexpr int
+__testz(_Tp __a, typename _TVT::type __b)
+{
+  if (!__builtin_is_constant_evaluated())
+    {
+      if constexpr (sizeof(_Tp) == 32)
+	{
+	  if constexpr (_TVT::template __is<float>)
+	    return _mm256_testz_ps(__to_intrin(__a), __to_intrin(__b));
+	  else if constexpr (_TVT::template __is<double>)
+	    return _mm256_testz_pd(__to_intrin(__a), __to_intrin(__b));
+	  else
+	    return _mm256_testz_si256(__to_intrin(__a), __to_intrin(__b));
+	}
+      else if constexpr (_TVT::template __is<float> && __have_avx)
+	return _mm_testz_ps(__to_intrin(__a), __to_intrin(__b));
+      else if constexpr (_TVT::template __is<double> && __have_avx)
+	return _mm_testz_pd(__to_intrin(__a), __to_intrin(__b));
+      else if constexpr (__have_sse4_1)
+	return _mm_testz_si128(__intrin_bitcast<__m128i>(__to_intrin(__a)),
+			       __intrin_bitcast<__m128i>(__to_intrin(__b)));
+      else
+	return __movemask(0 == __and(__a, __b)) != 0;
+    }
+  else
+    return __is_zero(__and(__a, __b));
+}
+
+// }}}
+// __testc{{{
+// requires SSE4.1 or above
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST constexpr int
+__testc(_Tp __a, typename _TVT::type __b)
+{
+  if (__builtin_is_constant_evaluated())
+    return __is_zero(__andnot(__a, __b));
+
+  if constexpr (sizeof(_Tp) == 32)
+    {
+      if constexpr (_TVT::template __is<float>)
+	return _mm256_testc_ps(__a, __b);
+      else if constexpr (_TVT::template __is<double>)
+	return _mm256_testc_pd(__a, __b);
+      else
+	return _mm256_testc_si256(__to_intrin(__a), __to_intrin(__b));
+    }
+  else if constexpr (_TVT::template __is<float> && __have_avx)
+    return _mm_testc_ps(__to_intrin(__a), __to_intrin(__b));
+  else if constexpr (_TVT::template __is<double> && __have_avx)
+    return _mm_testc_pd(__to_intrin(__a), __to_intrin(__b));
+  else
+    {
+      static_assert(is_same_v<_Tp, _Tp> && __have_sse4_1);
+      return _mm_testc_si128(__intrin_bitcast<__m128i>(__to_intrin(__a)),
+			     __intrin_bitcast<__m128i>(__to_intrin(__b)));
+    }
+}
+
+// }}}
+// __testnzc{{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST constexpr int
+__testnzc(_Tp __a, typename _TVT::type __b)
+{
+  if (!__builtin_is_constant_evaluated())
+    {
+      if constexpr (sizeof(_Tp) == 32)
+	{
+	  if constexpr (_TVT::template __is<float>)
+	    return _mm256_testnzc_ps(__a, __b);
+	  else if constexpr (_TVT::template __is<double>)
+	    return _mm256_testnzc_pd(__a, __b);
+	  else
+	    return _mm256_testnzc_si256(__to_intrin(__a), __to_intrin(__b));
+	}
+      else if constexpr (_TVT::template __is<float> && __have_avx)
+	return _mm_testnzc_ps(__to_intrin(__a), __to_intrin(__b));
+      else if constexpr (_TVT::template __is<double> && __have_avx)
+	return _mm_testnzc_pd(__to_intrin(__a), __to_intrin(__b));
+      else if constexpr (__have_sse4_1)
+	return _mm_testnzc_si128(__intrin_bitcast<__m128i>(__to_intrin(__a)),
+				 __intrin_bitcast<__m128i>(__to_intrin(__b)));
+      else
+	return __movemask(0 == __and(__a, __b)) == 0
+	       && __movemask(0 == __andnot(__a, __b)) == 0;
+    }
+  else
+    return !(__is_zero(__and(__a, __b)) || __is_zero(__andnot(__a, __b)));
+}
+
+// }}}
+// __xzyw{{{
+// Shuffles the complete vector, swapping the inner two quarters. Often useful
+// on AVX for fixing up a shuffle result.
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _Tp
+__xzyw(_Tp __a)
+{
+  if constexpr (sizeof(_Tp) == 16)
+    {
+      const auto __x = __vector_bitcast<conditional_t<
+	is_floating_point_v<typename _TVT::value_type>, float, int>>(__a);
+      return reinterpret_cast<_Tp>(
+	decltype(__x){__x[0], __x[2], __x[1], __x[3]});
+    }
+  else if constexpr (sizeof(_Tp) == 32)
+    {
+      const auto __x = __vector_bitcast<conditional_t<
+	is_floating_point_v<typename _TVT::value_type>, double, _LLong>>(__a);
+      return reinterpret_cast<_Tp>(
+	decltype(__x){__x[0], __x[2], __x[1], __x[3]});
+    }
+  else if constexpr (sizeof(_Tp) == 64)
+    {
+      const auto __x = __vector_bitcast<conditional_t<
+	is_floating_point_v<typename _TVT::value_type>, double, _LLong>>(__a);
+      return reinterpret_cast<_Tp>(decltype(
+	__x){__x[0], __x[1], __x[4], __x[5], __x[2], __x[3], __x[6], __x[7]});
+    }
+  else
+    __assert_unreachable<_Tp>();
+}
+
+// }}}
+
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR85048
+#include "simd_x86_conversions.h"
+#endif
+
+// ISA & type detection {{{
+template <typename _Tp, size_t _Np>
+constexpr bool
+__is_sse_ps()
+{
+  return __have_sse
+	 && std::is_same_v<_Tp,
+			   float> && sizeof(__intrinsic_type_t<_Tp, _Np>) == 16;
+}
+template <typename _Tp, size_t _Np>
+constexpr bool
+__is_sse_pd()
+{
+  return __have_sse2
+	 && std::is_same_v<
+	   _Tp, double> && sizeof(__intrinsic_type_t<_Tp, _Np>) == 16;
+}
+template <typename _Tp, size_t _Np>
+constexpr bool
+__is_avx_ps()
+{
+  return __have_avx
+	 && std::is_same_v<_Tp,
+			   float> && sizeof(__intrinsic_type_t<_Tp, _Np>) == 32;
+}
+template <typename _Tp, size_t _Np>
+constexpr bool
+__is_avx_pd()
+{
+  return __have_avx
+	 && std::is_same_v<
+	   _Tp, double> && sizeof(__intrinsic_type_t<_Tp, _Np>) == 32;
+}
+template <typename _Tp, size_t _Np>
+constexpr bool
+__is_avx512_ps()
+{
+  return __have_avx512f
+	 && std::is_same_v<_Tp,
+			   float> && sizeof(__intrinsic_type_t<_Tp, _Np>) == 64;
+}
+template <typename _Tp, size_t _Np>
+constexpr bool
+__is_avx512_pd()
+{
+  return __have_avx512f
+	 && std::is_same_v<
+	   _Tp, double> && sizeof(__intrinsic_type_t<_Tp, _Np>) == 64;
+}
+
+// }}}
+struct _MaskImplX86Mixin;
+// _CommonImplX86 {{{
+struct _CommonImplX86 : _CommonImplBuiltin
+{
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR85048
+  // __converts_via_decomposition {{{
+  template <typename _From, typename _To, size_t _ToSize>
+  static constexpr bool __converts_via_decomposition()
+  {
+    if constexpr (is_integral_v<
+		    _From> && is_integral_v<_To> && sizeof(_From) == 8
+		  && _ToSize == 16)
+      return (sizeof(_To) == 2 && !__have_ssse3)
+	     || (sizeof(_To) == 1 && !__have_avx512f);
+    else if constexpr (is_floating_point_v<_From> && is_integral_v<_To>)
+      return ((sizeof(_From) == 4 || sizeof(_From) == 8) && sizeof(_To) == 8
+	      && !__have_avx512dq)
+	     || (sizeof(_From) == 8 && sizeof(_To) == 4 && !__have_sse4_1
+		 && _ToSize == 16);
+    else if constexpr (
+      is_integral_v<_From> && is_floating_point_v<_To> && sizeof(_From) == 8
+      && !__have_avx512dq)
+      return (sizeof(_To) == 4 && _ToSize == 16)
+	     || (sizeof(_To) == 8 && _ToSize < 64);
+    else
+      return false;
+  }
+
+  template <typename _From, typename _To, size_t _ToSize>
+  static inline constexpr bool __converts_via_decomposition_v
+    = __converts_via_decomposition<_From, _To, _ToSize>();
+
+  // }}}
+#endif
+  // __store {{{
+  using _CommonImplBuiltin::__store;
+
+  template <typename _Flags, typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static void __store(_SimdWrapper<_Tp, _Np> __x,
+					      void* __addr, _Flags)
+  {
+    constexpr size_t _Bytes = _Np * sizeof(_Tp);
+
+    if constexpr ((_Bytes & (_Bytes - 1)) != 0 && __have_avx512bw_vl)
+      {
+	const auto __v = __to_intrin(__x);
+	if constexpr (std::is_same_v<_Flags, vector_aligned_tag>)
+	  __addr
+	    = __builtin_assume_aligned(__addr, alignof(_SimdWrapper<_Tp, _Np>));
+	else if constexpr (!std::is_same_v<_Flags, element_aligned_tag>)
+	  __addr = __builtin_assume_aligned(__addr, _Flags::_S_alignment);
+
+	if constexpr (_Bytes & 1)
+	  {
+	    if constexpr (_Bytes < 16)
+	      _mm_mask_storeu_epi8(__addr, 0xffffu >> (16 - _Bytes),
+				   __intrin_bitcast<__m128i>(__v));
+	    else if constexpr (_Bytes < 32)
+	      _mm256_mask_storeu_epi8(__addr, 0xffffffffu >> (32 - _Bytes),
+				      __intrin_bitcast<__m256i>(__v));
+	    else
+	      _mm512_mask_storeu_epi8(__addr,
+				      0xffffffffffffffffull >> (64 - _Bytes),
+				      __intrin_bitcast<__m512i>(__v));
+	  }
+	else if constexpr (_Bytes & 2)
+	  {
+	    if constexpr (_Bytes < 16)
+	      _mm_mask_storeu_epi16(__addr, 0xffu >> (8 - _Bytes / 2),
+				    __intrin_bitcast<__m128i>(__v));
+	    else if constexpr (_Bytes < 32)
+	      _mm256_mask_storeu_epi16(__addr, 0xffffu >> (16 - _Bytes / 2),
+				       __intrin_bitcast<__m256i>(__v));
+	    else
+	      _mm512_mask_storeu_epi16(__addr,
+				       0xffffffffull >> (32 - _Bytes / 2),
+				       __intrin_bitcast<__m512i>(__v));
+	  }
+	else if constexpr (_Bytes & 4)
+	  {
+	    if constexpr (_Bytes < 16)
+	      _mm_mask_storeu_epi32(__addr, 0xfu >> (4 - _Bytes / 4),
+				    __intrin_bitcast<__m128i>(__v));
+	    else if constexpr (_Bytes < 32)
+	      _mm256_mask_storeu_epi32(__addr, 0xffu >> (8 - _Bytes / 4),
+				       __intrin_bitcast<__m256i>(__v));
+	    else
+	      _mm512_mask_storeu_epi32(__addr, 0xffffull >> (16 - _Bytes / 4),
+				       __intrin_bitcast<__m512i>(__v));
+	  }
+	else
+	  {
+	    static_assert(
+	      _Bytes > 16,
+	      "_Bytes < 16 && (_Bytes & 7) == 0 && (_Bytes & (_Bytes "
+	      "- 1)) != 0 is impossible");
+	    if constexpr (_Bytes < 32)
+	      _mm256_mask_storeu_epi64(__addr, 0xfu >> (4 - _Bytes / 8),
+				       __intrin_bitcast<__m256i>(__v));
+	    else
+	      _mm512_mask_storeu_epi64(__addr, 0xffull >> (8 - _Bytes / 8),
+				       __intrin_bitcast<__m512i>(__v));
+	  }
+      }
+    else
+      _CommonImplBuiltin::__store(__x, __addr, _Flags());
+  }
+
+  // }}}
+  // __store_bool_array(_BitMask) {{{
+  template <size_t _Np, typename _Flags, bool _Sanitized>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr void
+  __store_bool_array(const _BitMask<_Np, _Sanitized> __x, bool* __mem, _Flags)
+  {
+    if constexpr (__have_avx512bw_vl) // don't care for BW w/o VL
+      __store<_Np>(1 & __vector_bitcast<_UChar, _Np>([=]() constexpr {
+		     if constexpr (_Np <= 16)
+		       return _mm_movm_epi8(__x._M_to_bits());
+		     else if constexpr (_Np <= 32)
+		       return _mm256_movm_epi8(__x._M_to_bits());
+		     else if constexpr (_Np <= 64)
+		       return _mm512_movm_epi8(__x._M_to_bits());
+		     else
+		       __assert_unreachable<_SizeConstant<_Np>>();
+		   }()),
+		   __mem, _Flags());
+    else if constexpr (__have_bmi2)
+      {
+	if constexpr (_Np <= 4)
+	  __store<_Np>(_pdep_u32(__x._M_to_bits(), 0x01010101U), __mem,
+		       _Flags());
+	else
+	  __execute_n_times<__div_roundup(_Np, sizeof(size_t))>([&](auto __i) {
+	    constexpr size_t __offset = __i * sizeof(size_t);
+	    constexpr int __todo = std::min(sizeof(size_t), _Np - __offset);
+	    if constexpr (__todo == 1)
+	      __mem[__offset] = __x[__offset];
+	    else
+	      {
+		const auto __bools =
+#ifdef __x86_64__
+		  _pdep_u64(__x.template _M_extract<__offset>().to_ullong(),
+			    0x0101010101010101ULL);
+#else  // __x86_64__
+		  _pdep_u32(__x.template _M_extract<__offset>()._M_to_bits(),
+			    0x01010101U);
+#endif // __x86_64__
+		__store<__todo>(__bools, __mem + __offset, _Flags());
+	      }
+	  });
+      }
+    else if constexpr (__have_sse2 && _Np > 7)
+      __execute_n_times<__div_roundup(_Np, 16)>([&](auto __i) {
+	constexpr int __offset = __i * 16;
+	constexpr int __todo = std::min(16, int(_Np) - __offset);
+	const int __bits = __x.template _M_extract<__offset>()._M_to_bits();
+	__vector_type16_t<_UChar> __bools;
+	if constexpr (__have_avx512f)
+	  {
+	    auto __as32bits
+	      = _mm512_maskz_mov_epi32(__bits,
+				       __to_intrin(__vector_broadcast<16>(1)));
+	    auto __as16bits = __xzyw(
+	      _mm256_packs_epi32(__lo256(__as32bits),
+				 __todo > 8 ? __hi256(__as32bits) : __m256i()));
+	    __bools = __vector_bitcast<_UChar>(
+	      _mm_packs_epi16(__lo128(__as16bits), __hi128(__as16bits)));
+	  }
+	else
+	  {
+	    using _V = __vector_type_t<_UChar, 16>;
+	    auto __tmp = _mm_cvtsi32_si128(__bits);
+	    __tmp = _mm_unpacklo_epi8(__tmp, __tmp);
+	    __tmp = _mm_unpacklo_epi16(__tmp, __tmp);
+	    __tmp = _mm_unpacklo_epi32(__tmp, __tmp);
+	    _V __tmp2 = reinterpret_cast<_V>(__tmp);
+	    __tmp2 &= _V{1, 2, 4, 8, 16, 32, 64, 128,
+			 1, 2, 4, 8, 16, 32, 64, 128}; // mask bit index
+	    __bools = (__tmp2 == 0) + 1; // 0xff -> 0x00 | 0x00 -> 0x01
+	  }
+	__store<__todo>(__bools, __mem + __offset, _Flags());
+      });
+    else
+      _CommonImplBuiltin::__store_bool_array(__x, __mem, _Flags());
+  }
+
+  // }}}
+  // _S_blend_avx512 {{{
+  // Returns: __k ? __b : __a
+  // TODO: reverse __a and __b to match COND_EXPR
+  // Requires: _TV to be a __vector_type_t matching valuetype for the bitmask
+  //           __k
+  template <typename _Kp, typename _TV>
+  _GLIBCXX_SIMD_INTRINSIC static _TV
+  _S_blend_avx512(const _Kp __k, const _TV __a, const _TV __b) noexcept
+  {
+    static_assert(__is_vector_type_v<_TV>);
+    using _Tp = typename _VectorTraits<_TV>::value_type;
+    static_assert(sizeof(_TV) >= 16);
+    static_assert(sizeof(_Tp) <= 8);
+    using _IntT = conditional_t<(sizeof(_Tp) > 2),
+				conditional_t<sizeof(_Tp) == 4, int, long long>,
+				conditional_t<sizeof(_Tp) == 1, char, short>>;
+    [[maybe_unused]] const auto __aa = __vector_bitcast<_IntT>(__a);
+    [[maybe_unused]] const auto __bb = __vector_bitcast<_IntT>(__b);
+    if constexpr (sizeof(_TV) == 64)
+      {
+	if constexpr (sizeof(_Tp) == 1)
+	  return reinterpret_cast<_TV>(
+	    __builtin_ia32_blendmb_512_mask(__aa, __bb, __k));
+	else if constexpr (sizeof(_Tp) == 2)
+	  return reinterpret_cast<_TV>(
+	    __builtin_ia32_blendmw_512_mask(__aa, __bb, __k));
+	else if constexpr (sizeof(_Tp) == 4 && is_floating_point_v<_Tp>)
+	  return __builtin_ia32_blendmps_512_mask(__a, __b, __k);
+	else if constexpr (sizeof(_Tp) == 4)
+	  return reinterpret_cast<_TV>(
+	    __builtin_ia32_blendmd_512_mask(__aa, __bb, __k));
+	else if constexpr (sizeof(_Tp) == 8 && is_floating_point_v<_Tp>)
+	  return __builtin_ia32_blendmpd_512_mask(__a, __b, __k);
+	else if constexpr (sizeof(_Tp) == 8)
+	  return reinterpret_cast<_TV>(
+	    __builtin_ia32_blendmq_512_mask(__aa, __bb, __k));
+      }
+    else if constexpr (sizeof(_TV) == 32)
+      {
+	if constexpr (sizeof(_Tp) == 1)
+	  return reinterpret_cast<_TV>(
+	    __builtin_ia32_blendmb_256_mask(__aa, __bb, __k));
+	else if constexpr (sizeof(_Tp) == 2)
+	  return reinterpret_cast<_TV>(
+	    __builtin_ia32_blendmw_256_mask(__aa, __bb, __k));
+	else if constexpr (sizeof(_Tp) == 4 && is_floating_point_v<_Tp>)
+	  return __builtin_ia32_blendmps_256_mask(__a, __b, __k);
+	else if constexpr (sizeof(_Tp) == 4)
+	  return reinterpret_cast<_TV>(
+	    __builtin_ia32_blendmd_256_mask(__aa, __bb, __k));
+	else if constexpr (sizeof(_Tp) == 8 && is_floating_point_v<_Tp>)
+	  return __builtin_ia32_blendmpd_256_mask(__a, __b, __k);
+	else if constexpr (sizeof(_Tp) == 8)
+	  return reinterpret_cast<_TV>(
+	    __builtin_ia32_blendmq_256_mask(__aa, __bb, __k));
+      }
+    else if constexpr (sizeof(_TV) == 16)
+      {
+	if constexpr (sizeof(_Tp) == 1)
+	  return reinterpret_cast<_TV>(
+	    __builtin_ia32_blendmb_128_mask(__aa, __bb, __k));
+	else if constexpr (sizeof(_Tp) == 2)
+	  return reinterpret_cast<_TV>(
+	    __builtin_ia32_blendmw_128_mask(__aa, __bb, __k));
+	else if constexpr (sizeof(_Tp) == 4 && is_floating_point_v<_Tp>)
+	  return __builtin_ia32_blendmps_128_mask(__a, __b, __k);
+	else if constexpr (sizeof(_Tp) == 4)
+	  return reinterpret_cast<_TV>(
+	    __builtin_ia32_blendmd_128_mask(__aa, __bb, __k));
+	else if constexpr (sizeof(_Tp) == 8 && is_floating_point_v<_Tp>)
+	  return __builtin_ia32_blendmpd_128_mask(__a, __b, __k);
+	else if constexpr (sizeof(_Tp) == 8)
+	  return reinterpret_cast<_TV>(
+	    __builtin_ia32_blendmq_128_mask(__aa, __bb, __k));
+      }
+  }
+
+  // }}}
+  // _S_blend_intrin {{{
+  // Returns: __k ? __b : __a
+  // TODO: reverse __a and __b to match COND_EXPR
+  // Requires: _Tp to be an intrinsic type (integers blend per byte) and 16/32
+  //           Bytes wide
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp _S_blend_intrin(_Tp __k, _Tp __a,
+						     _Tp __b) noexcept
+  {
+    static_assert(is_same_v<decltype(__to_intrin(__a)), _Tp>);
+    constexpr struct
+    {
+      _GLIBCXX_SIMD_INTRINSIC __m128 operator()(__m128 __a, __m128 __b,
+						__m128 __k) const noexcept
+      {
+	return __builtin_ia32_blendvps(__a, __b, __k);
+      }
+      _GLIBCXX_SIMD_INTRINSIC __m128d operator()(__m128d __a, __m128d __b,
+						 __m128d __k) const noexcept
+      {
+	return __builtin_ia32_blendvpd(__a, __b, __k);
+      }
+      _GLIBCXX_SIMD_INTRINSIC __m128i operator()(__m128i __a, __m128i __b,
+						 __m128i __k) const noexcept
+      {
+	return reinterpret_cast<__m128i>(
+	  __builtin_ia32_pblendvb128(reinterpret_cast<__v16qi>(__a),
+				     reinterpret_cast<__v16qi>(__b),
+				     reinterpret_cast<__v16qi>(__k)));
+      }
+      _GLIBCXX_SIMD_INTRINSIC __m256 operator()(__m256 __a, __m256 __b,
+						__m256 __k) const noexcept
+      {
+	return __builtin_ia32_blendvps256(__a, __b, __k);
+      }
+      _GLIBCXX_SIMD_INTRINSIC __m256d operator()(__m256d __a, __m256d __b,
+						 __m256d __k) const noexcept
+      {
+	return __builtin_ia32_blendvpd256(__a, __b, __k);
+      }
+      _GLIBCXX_SIMD_INTRINSIC __m256i operator()(__m256i __a, __m256i __b,
+						 __m256i __k) const noexcept
+      {
+	return reinterpret_cast<__m256i>(
+	  __builtin_ia32_pblendvb256(reinterpret_cast<__v32qi>(__a),
+				     reinterpret_cast<__v32qi>(__b),
+				     reinterpret_cast<__v32qi>(__k)));
+      }
+    } __eval;
+    return __eval(__a, __b, __k);
+  }
+
+  // }}}
+  // _S_blend {{{
+  // Returns: __k ? __at1 : __at0
+  // TODO: reverse __at0 and __at1 to match COND_EXPR
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  _S_blend(_SimdWrapper<bool, _Np> __k, _SimdWrapper<_Tp, _Np> __at0,
+	   _SimdWrapper<_Tp, _Np> __at1)
+  {
+    static_assert(is_same_v<_Tp, _Tp> && __have_avx512f);
+    if (__k._M_is_constprop() && __at0._M_is_constprop()
+	&& __at1._M_is_constprop())
+      return __generate_from_n_evaluations<_Np, __vector_type_t<_Tp, _Np>>([&](
+	auto __i) constexpr { return __k[__i] ? __at1[__i] : __at0[__i]; });
+    else if constexpr (sizeof(__at0) == 64
+		  || (__have_avx512vl && sizeof(__at0) >= 16))
+      return _S_blend_avx512(__k._M_data, __at0._M_data, __at1._M_data);
+    else
+      {
+	static_assert((__have_avx512vl && sizeof(__at0) < 16)
+		      || !__have_avx512vl);
+	constexpr size_t __size = (__have_avx512vl ? 16 : 64) / sizeof(_Tp);
+	return __vector_bitcast<_Tp, _Np>(
+	  _S_blend_avx512(__k._M_data, __vector_bitcast<_Tp, __size>(__at0),
+			  __vector_bitcast<_Tp, __size>(__at1)));
+      }
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  _S_blend(_SimdWrapper<_Tp, _Np> __k, _SimdWrapper<_Tp, _Np> __at0,
+	   _SimdWrapper<_Tp, _Np> __at1)
+  {
+    if (__builtin_is_constant_evaluated()
+	|| (__k._M_is_constprop() && __at0._M_is_constprop()
+	    && __at1._M_is_constprop()))
+      {
+	auto __r = __or(__andnot(__k, __at0), __and(__k, __at1));
+	if (__r._M_is_constprop())
+	  return __r;
+      }
+    if constexpr (((__have_avx512f && sizeof(__at0) == 64)
+			|| __have_avx512vl)
+		       && (sizeof(_Tp) >= 4 || __have_avx512bw))
+      // convert to bitmask and call overload above
+      return _S_blend(_SimdWrapper<bool, _Np>(
+			__make_dependent_t<_Tp, _MaskImplX86Mixin>::__to_bits(
+			  __k)
+			  ._M_to_bits()),
+		      __at0, __at1);
+    else
+      {
+	// Since GCC does not assume __k to be a mask, using the builtin
+	// conditional operator introduces an extra compare against 0 before
+	// blending. So we rather call the intrinsic here.
+	if constexpr (__have_sse4_1)
+	  return _S_blend_intrin(__to_intrin(__k), __to_intrin(__at0),
+				 __to_intrin(__at1));
+	else
+	  return __or(__andnot(__k, __at0), __and(__k, __at1));
+      }
+  }
+
+  // }}}
+};
+
+// }}}
+// _SimdImplX86 {{{
+template <typename _Abi> struct _SimdImplX86 : _SimdImplBuiltin<_Abi>
+{
+  using _Base = _SimdImplBuiltin<_Abi>;
+  template <typename _Tp>
+  using _MaskMember = typename _Base::template _MaskMember<_Tp>;
+  template <typename _Tp>
+  static constexpr size_t _S_full_size = _Abi::template _S_full_size<_Tp>;
+  template <typename _Tp>
+  static constexpr size_t size = _Abi::template size<_Tp>;
+  template <typename _Tp>
+  static constexpr size_t _S_max_store_size
+    = (sizeof(_Tp) >= 4 && __have_avx512f) || __have_avx512bw
+	? 64
+	: (std::is_floating_point_v<_Tp>&& __have_avx) || __have_avx2 ? 32 : 16;
+  using _MaskImpl = typename _Abi::_MaskImpl;
+
+  // __masked_load {{{
+  template <typename _Tp, size_t _Np, typename _Up, typename _Fp>
+  static inline _SimdWrapper<_Tp, _Np>
+  __masked_load(_SimdWrapper<_Tp, _Np> __merge, _MaskMember<_Tp> __k,
+		const _Up* __mem, _Fp) noexcept
+  {
+    static_assert(_Np == size<_Tp>);
+    if constexpr (std::is_same_v<_Tp, _Up> || // no conversion
+		  (sizeof(_Tp) == sizeof(_Up)
+		   && std::is_integral_v<
+			_Tp> == std::is_integral_v<_Up>) // conversion via bit
+							 // reinterpretation
+    )
+      {
+	[[maybe_unused]] const auto __intrin = __to_intrin(__merge);
+	if constexpr ((__is_avx512_abi<_Abi>() || __have_avx512bw_vl)
+		      && sizeof(_Tp) == 1)
+	  {
+	    const auto __kk = _MaskImpl::__to_bits(__k)._M_to_bits();
+	    if constexpr (sizeof(__intrin) == 16)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm_mask_loadu_epi8(__intrin, __kk, __mem));
+	    else if constexpr (sizeof(__merge) == 32)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm256_mask_loadu_epi8(__intrin, __kk, __mem));
+	    else if constexpr (sizeof(__merge) == 64)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm512_mask_loadu_epi8(__intrin, __kk, __mem));
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr ((__is_avx512_abi<_Abi>() || __have_avx512bw_vl)
+			   && sizeof(_Tp) == 2)
+	  {
+	    const auto __kk = _MaskImpl::__to_bits(__k)._M_to_bits();
+	    if constexpr (sizeof(__intrin) == 16)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm_mask_loadu_epi16(__intrin, __kk, __mem));
+	    else if constexpr (sizeof(__intrin) == 32)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm256_mask_loadu_epi16(__intrin, __kk, __mem));
+	    else if constexpr (sizeof(__intrin) == 64)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm512_mask_loadu_epi16(__intrin, __kk, __mem));
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr ((__is_avx512_abi<_Abi>() || __have_avx512vl)
+			   && sizeof(_Tp) == 4 && std::is_integral_v<_Up>)
+	  {
+	    const auto __kk = _MaskImpl::__to_bits(__k)._M_to_bits();
+	    if constexpr (sizeof(__intrin) == 16)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm_mask_loadu_epi32(__intrin, __kk, __mem));
+	    else if constexpr (sizeof(__intrin) == 32)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm256_mask_loadu_epi32(__intrin, __kk, __mem));
+	    else if constexpr (sizeof(__intrin) == 64)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm512_mask_loadu_epi32(__intrin, __kk, __mem));
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr ((__is_avx512_abi<_Abi>() || __have_avx512vl)
+			   && sizeof(_Tp) == 4 && std::is_floating_point_v<_Up>)
+	  {
+	    const auto __kk = _MaskImpl::__to_bits(__k)._M_to_bits();
+	    if constexpr (sizeof(__intrin) == 16)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm_mask_loadu_ps(__intrin, __kk, __mem));
+	    else if constexpr (sizeof(__intrin) == 32)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm256_mask_loadu_ps(__intrin, __kk, __mem));
+	    else if constexpr (sizeof(__intrin) == 64)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm512_mask_loadu_ps(__intrin, __kk, __mem));
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr (__have_avx2 && sizeof(_Tp) == 4
+			   && std::is_integral_v<_Up>)
+	  {
+	    if constexpr (sizeof(__intrin) == 16)
+	      __merge
+		= __or(__andnot(__k._M_data, __merge._M_data),
+		       __vector_bitcast<_Tp, _Np>(
+			 _mm_maskload_epi32(reinterpret_cast<const int*>(__mem),
+					    __to_intrin(__k))));
+	    else if constexpr (sizeof(__intrin) == 32)
+	      __merge
+		= (~__k._M_data & __merge._M_data)
+		  | __vector_bitcast<_Tp, _Np>(
+		    _mm256_maskload_epi32(reinterpret_cast<const int*>(__mem),
+					  __to_intrin(__k)));
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr (__have_avx && sizeof(_Tp) == 4)
+	  {
+	    if constexpr (sizeof(__intrin) == 16)
+	      __merge = __or(__andnot(__k._M_data, __merge._M_data),
+			     __vector_bitcast<_Tp, _Np>(_mm_maskload_ps(
+			       reinterpret_cast<const float*>(__mem),
+			       __intrin_bitcast<__m128i>(__as_vector(__k)))));
+	    else if constexpr (sizeof(__intrin) == 32)
+	      __merge
+		= __or(__andnot(__k._M_data, __merge._M_data),
+		       _mm256_maskload_ps(reinterpret_cast<const float*>(__mem),
+					  __vector_bitcast<_LLong>(__k)));
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr ((__is_avx512_abi<_Abi>() || __have_avx512vl)
+			   && sizeof(_Tp) == 8 && std::is_integral_v<_Up>)
+	  {
+	    const auto __kk = _MaskImpl::__to_bits(__k)._M_to_bits();
+	    if constexpr (sizeof(__intrin) == 16)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm_mask_loadu_epi64(__intrin, __kk, __mem));
+	    else if constexpr (sizeof(__intrin) == 32)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm256_mask_loadu_epi64(__intrin, __kk, __mem));
+	    else if constexpr (sizeof(__intrin) == 64)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm512_mask_loadu_epi64(__intrin, __kk, __mem));
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr ((__is_avx512_abi<_Abi>() || __have_avx512vl)
+			   && sizeof(_Tp) == 8 && std::is_floating_point_v<_Up>)
+	  {
+	    const auto __kk = _MaskImpl::__to_bits(__k)._M_to_bits();
+	    if constexpr (sizeof(__intrin) == 16)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm_mask_loadu_pd(__intrin, __kk, __mem));
+	    else if constexpr (sizeof(__intrin) == 32)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm256_mask_loadu_pd(__intrin, __kk, __mem));
+	    else if constexpr (sizeof(__intrin) == 64)
+	      __merge = __vector_bitcast<_Tp, _Np>(
+		_mm512_mask_loadu_pd(__intrin, __kk, __mem));
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr (__have_avx2 && sizeof(_Tp) == 8
+			   && std::is_integral_v<_Up>)
+	  {
+	    if constexpr (sizeof(__intrin) == 16)
+	      __merge = __or(__andnot(__k._M_data, __merge._M_data),
+			     __vector_bitcast<_Tp, _Np>(_mm_maskload_epi64(
+			       reinterpret_cast<const _LLong*>(__mem),
+			       __to_intrin(__k))));
+	    else if constexpr (sizeof(__intrin) == 32)
+	      __merge
+		= (~__k._M_data & __merge._M_data)
+		  | __vector_bitcast<_Tp>(_mm256_maskload_epi64(
+		    reinterpret_cast<const _LLong*>(__mem), __to_intrin(__k)));
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr (__have_avx && sizeof(_Tp) == 8)
+	  {
+	    if constexpr (sizeof(__intrin) == 16)
+	      __merge
+		= __or(__andnot(__k._M_data, __merge._M_data),
+		       __vector_bitcast<_Tp, _Np>(
+			 _mm_maskload_pd(reinterpret_cast<const double*>(__mem),
+					 __vector_bitcast<_LLong>(__k))));
+	    else if constexpr (sizeof(__intrin) == 32)
+	      __merge = __or(__andnot(__k._M_data, __merge._M_data),
+			     _mm256_maskload_pd(reinterpret_cast<const double*>(
+						  __mem),
+						__vector_bitcast<_LLong>(__k)));
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else
+	  _BitOps::__bit_iteration(_MaskImpl::__to_bits(__k), [&](auto __i) {
+	    __merge.__set(__i, static_cast<_Tp>(__mem[__i]));
+	  });
+      }
+    /* Unclear whether the following improves anything. Needs benchmarking
+     * before it is activated.
+    else if constexpr (sizeof(_Up) <= 8 && // no long double
+		       !__converts_via_decomposition_v<
+			 _Up, _Tp,
+			 sizeof(__merge)> // conversion via decomposition
+					  // is better handled via the
+					  // bit_iteration fallback below
+    )
+      {
+	// TODO: copy pattern from __masked_store, which doesn't resort to
+	// fixed_size
+	using _Ap       = simd_abi::deduce_t<_Up, _Np>;
+	using _ATraits = _SimdTraits<_Up, _Ap>;
+	using _AImpl   = typename _ATraits::_SimdImpl;
+	typename _ATraits::_SimdMember __uncvted{};
+	typename _ATraits::_MaskMember __kk = _Ap::_MaskImpl::template
+    __convert<_Up>(__k);
+	__uncvted = _AImpl::__masked_load(__uncvted, __kk, __mem, _Fp());
+	_SimdConverter<_Up, _Ap, _Tp, _Abi> __converter;
+	_Base::__masked_assign(__k, __merge, __converter(__uncvted));
+      }
+      */
+    else
+      __merge = _Base::__masked_load(__merge, __k, __mem, _Fp());
+    return __merge;
+  }
+
+  // }}}
+  // __masked_store_nocvt {{{
+  template <typename _Tp, std::size_t _Np, typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_store_nocvt(_SimdWrapper<_Tp, _Np> __v, _Tp* __mem, _Fp,
+		       _SimdWrapper<bool, _Np> __k)
+  {
+    [[maybe_unused]] const auto __vi = __to_intrin(__v);
+    if constexpr (sizeof(__vi) == 64)
+      {
+	static_assert(sizeof(__v) == 64 && __have_avx512f);
+	if constexpr (__have_avx512bw && sizeof(_Tp) == 1)
+	  _mm512_mask_storeu_epi8(__mem, __k, __vi);
+	else if constexpr (__have_avx512bw && sizeof(_Tp) == 2)
+	  _mm512_mask_storeu_epi16(__mem, __k, __vi);
+	else if constexpr (__have_avx512f && sizeof(_Tp) == 4)
+	  {
+	    if constexpr (__is_aligned_v<_Fp, 64> && std::is_integral_v<_Tp>)
+	      _mm512_mask_store_epi32(__mem, __k, __vi);
+	    else if constexpr (__is_aligned_v<
+				 _Fp, 64> && std::is_floating_point_v<_Tp>)
+	      _mm512_mask_store_ps(__mem, __k, __vi);
+	    else if constexpr (std::is_integral_v<_Tp>)
+	      _mm512_mask_storeu_epi32(__mem, __k, __vi);
+	    else
+	      _mm512_mask_storeu_ps(__mem, __k, __vi);
+	  }
+	else if constexpr (__have_avx512f && sizeof(_Tp) == 8)
+	  {
+	    if constexpr (__is_aligned_v<_Fp, 64> && std::is_integral_v<_Tp>)
+	      _mm512_mask_store_epi64(__mem, __k, __vi);
+	    else if constexpr (__is_aligned_v<
+				 _Fp, 64> && std::is_floating_point_v<_Tp>)
+	      _mm512_mask_store_pd(__mem, __k, __vi);
+	    else if constexpr (std::is_integral_v<_Tp>)
+	      _mm512_mask_storeu_epi64(__mem, __k, __vi);
+	    else
+	      _mm512_mask_storeu_pd(__mem, __k, __vi);
+	  }
+#if 0 // with KNL either sizeof(_Tp) >= 4 or sizeof(__vi) <= 32
+      // with Skylake-AVX512, __have_avx512bw is true
+	else if constexpr (__have_sse2)
+	  {
+	    using _M   = __vector_type_t<_Tp, _Np>;
+	    using _MVT = _VectorTraits<_M>;
+	    _mm_maskmoveu_si128(__auto_bitcast(__extract<0, 4>(__v._M_data)),
+				__auto_bitcast(_MaskImpl::template __convert<_Tp, _Np>(__k._M_data)),
+				reinterpret_cast<char*>(__mem));
+	    _mm_maskmoveu_si128(__auto_bitcast(__extract<1, 4>(__v._M_data)),
+				__auto_bitcast(_MaskImpl::template __convert<_Tp, _Np>(
+				  __k._M_data >> 1 * _MVT::_S_width)),
+				reinterpret_cast<char*>(__mem) + 1 * 16);
+	    _mm_maskmoveu_si128(__auto_bitcast(__extract<2, 4>(__v._M_data)),
+				__auto_bitcast(_MaskImpl::template __convert<_Tp, _Np>(
+				  __k._M_data >> 2 * _MVT::_S_width)),
+				reinterpret_cast<char*>(__mem) + 2 * 16);
+	    if constexpr (_Np > 48 / sizeof(_Tp))
+	      _mm_maskmoveu_si128(
+		__auto_bitcast(__extract<3, 4>(__v._M_data)),
+		__auto_bitcast(_MaskImpl::template __convert<_Tp, _Np>(
+		  __k._M_data >> 3 * _MVT::_S_width)),
+		reinterpret_cast<char*>(__mem) + 3 * 16);
+	  }
+#endif
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (sizeof(__vi) == 32)
+      {
+	if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 1)
+	  _mm256_mask_storeu_epi8(__mem, __k, __vi);
+	else if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 2)
+	  _mm256_mask_storeu_epi16(__mem, __k, __vi);
+	else if constexpr (__have_avx512vl && sizeof(_Tp) == 4)
+	  {
+	    if constexpr (__is_aligned_v<_Fp, 32> && std::is_integral_v<_Tp>)
+	      _mm256_mask_store_epi32(__mem, __k, __vi);
+	    else if constexpr (__is_aligned_v<
+				 _Fp, 32> && std::is_floating_point_v<_Tp>)
+	      _mm256_mask_store_ps(__mem, __k, __vi);
+	    else if constexpr (std::is_integral_v<_Tp>)
+	      _mm256_mask_storeu_epi32(__mem, __k, __vi);
+	    else
+	      _mm256_mask_storeu_ps(__mem, __k, __vi);
+	  }
+	else if constexpr (__have_avx512vl && sizeof(_Tp) == 8)
+	  {
+	    if constexpr (__is_aligned_v<_Fp, 32> && std::is_integral_v<_Tp>)
+	      _mm256_mask_store_epi64(__mem, __k, __vi);
+	    else if constexpr (__is_aligned_v<
+				 _Fp, 32> && std::is_floating_point_v<_Tp>)
+	      _mm256_mask_store_pd(__mem, __k, __vi);
+	    else if constexpr (std::is_integral_v<_Tp>)
+	      _mm256_mask_storeu_epi64(__mem, __k, __vi);
+	    else
+	      _mm256_mask_storeu_pd(__mem, __k, __vi);
+	  }
+	else if constexpr (__have_avx512f
+			   && (sizeof(_Tp) >= 4 || __have_avx512bw))
+	  {
+	    // use a 512-bit maskstore, using zero-extension of the bitmask
+	    __masked_store_nocvt(
+	      _SimdWrapper64<_Tp>(
+		__intrin_bitcast<__vector_type64_t<_Tp>>(__v._M_data)),
+	      __mem,
+	      // careful, vector_aligned has a stricter meaning in the
+	      // 512-bit maskstore:
+	      std::conditional_t<std::is_same_v<_Fp, vector_aligned_tag>,
+				 overaligned_tag<32>, _Fp>(),
+	      _SimdWrapper<bool, 64 / sizeof(_Tp)>(__k._M_data));
+	  }
+	else
+	  __masked_store_nocvt(
+	    __v, __mem, _Fp(),
+	    _MaskImpl::template __to_maskvector<_Tp, 32 / sizeof(_Tp)>(__k));
+      }
+    else if constexpr (sizeof(__vi) == 16)
+      {
+	// the store is aligned if _Fp is overaligned_tag<16> (or higher) or _Fp
+	// is vector_aligned_tag while __v is actually a 16-Byte vector (could
+	// be 2/4/8 as well)
+	[[maybe_unused]] constexpr bool __aligned
+	  = __is_aligned_v<
+	      _Fp,
+	      16> && (sizeof(__v) == 16 || !std::is_same_v<_Fp, vector_aligned_tag>);
+	if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 1)
+	  _mm_mask_storeu_epi8(__mem, __k, __vi);
+	else if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 2)
+	  _mm_mask_storeu_epi16(__mem, __k, __vi);
+	else if constexpr (__have_avx512vl && sizeof(_Tp) == 4)
+	  {
+	    if constexpr (__aligned && std::is_integral_v<_Tp>)
+	      _mm_mask_store_epi32(__mem, __k, __vi);
+	    else if constexpr (__aligned && std::is_floating_point_v<_Tp>)
+	      _mm_mask_store_ps(__mem, __k, __vi);
+	    else if constexpr (std::is_integral_v<_Tp>)
+	      _mm_mask_storeu_epi32(__mem, __k, __vi);
+	    else
+	      _mm_mask_storeu_ps(__mem, __k, __vi);
+	  }
+	else if constexpr (__have_avx512vl && sizeof(_Tp) == 8)
+	  {
+	    if constexpr (__aligned && std::is_integral_v<_Tp>)
+	      _mm_mask_store_epi64(__mem, __k, __vi);
+	    else if constexpr (__aligned && std::is_floating_point_v<_Tp>)
+	      _mm_mask_store_pd(__mem, __k, __vi);
+	    else if constexpr (std::is_integral_v<_Tp>)
+	      _mm_mask_storeu_epi64(__mem, __k, __vi);
+	    else
+	      _mm_mask_storeu_pd(__mem, __k, __vi);
+	  }
+	else if constexpr (__have_avx512f
+			   && (sizeof(_Tp) >= 4 || __have_avx512bw))
+	  {
+	    // use a 512-bit maskstore, using zero-extension of the bitmask
+	    __masked_store_nocvt(
+	      _SimdWrapper64<_Tp>(
+		__intrin_bitcast<__intrinsic_type64_t<_Tp>>(__v._M_data)),
+	      __mem,
+	      // careful, vector_aligned has a stricter meaning in the 512-bit
+	      // maskstore:
+	      std::conditional_t<std::is_same_v<_Fp, vector_aligned_tag>,
+				 overaligned_tag<sizeof(__v)>, _Fp>(),
+	      _SimdWrapper<bool, 64 / sizeof(_Tp)>(__k._M_data));
+	  }
+	else
+	  __masked_store_nocvt(
+	    __v, __mem, _Fp(),
+	    _MaskImpl::template __to_maskvector<_Tp, 16 / sizeof(_Tp)>(__k));
+      }
+    else
+      __assert_unreachable<_Tp>();
+  }
+
+  template <typename _TW,
+	    typename _Tp = typename _VectorTraits<_TW>::value_type,
+	    typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static void __masked_store_nocvt(_TW __v, _Tp* __mem,
+							   _Fp, _TW __k)
+  {
+    if constexpr (sizeof(_TW) <= 16)
+      {
+	[[maybe_unused]] const auto __vi
+	  = __intrin_bitcast<__m128i>(__as_vector(__v));
+	[[maybe_unused]] const auto __ki
+	  = __intrin_bitcast<__m128i>(__as_vector(__k));
+	if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 1)
+	  _mm_mask_storeu_epi8(__mem, _mm_movepi8_mask(__ki), __vi);
+	else if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 2)
+	  _mm_mask_storeu_epi16(__mem, _mm_movepi16_mask(__ki), __vi);
+	else if constexpr (__have_avx2 && sizeof(_Tp) == 4
+			   && std::is_integral_v<_Tp>)
+	  _mm_maskstore_epi32(reinterpret_cast<int*>(__mem), __ki, __vi);
+	else if constexpr (__have_avx && sizeof(_Tp) == 4)
+	  _mm_maskstore_ps(reinterpret_cast<float*>(__mem), __ki,
+			   __vector_bitcast<float>(__vi));
+	else if constexpr (__have_avx2 && sizeof(_Tp) == 8
+			   && std::is_integral_v<_Tp>)
+	  _mm_maskstore_epi64(reinterpret_cast<_LLong*>(__mem), __ki, __vi);
+	else if constexpr (__have_avx && sizeof(_Tp) == 8)
+	  _mm_maskstore_pd(reinterpret_cast<double*>(__mem), __ki,
+			   __vector_bitcast<double>(__vi));
+	else if constexpr (__have_sse2)
+	  _mm_maskmoveu_si128(__vi, __ki, reinterpret_cast<char*>(__mem));
+      }
+    else if constexpr (sizeof(_TW) == 32)
+      {
+	[[maybe_unused]] const auto __vi
+	  = __intrin_bitcast<__m256i>(__as_vector(__v));
+	[[maybe_unused]] const auto __ki
+	  = __intrin_bitcast<__m256i>(__as_vector(__k));
+	if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 1)
+	  _mm256_mask_storeu_epi8(__mem, _mm256_movepi8_mask(__ki), __vi);
+	else if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 2)
+	  _mm256_mask_storeu_epi16(__mem, _mm256_movepi16_mask(__ki), __vi);
+	else if constexpr (__have_avx2 && sizeof(_Tp) == 4
+			   && std::is_integral_v<_Tp>)
+	  _mm256_maskstore_epi32(reinterpret_cast<int*>(__mem), __ki, __vi);
+	else if constexpr (sizeof(_Tp) == 4)
+	  _mm256_maskstore_ps(reinterpret_cast<float*>(__mem), __ki,
+			      __vector_bitcast<float>(__v));
+	else if constexpr (__have_avx2 && sizeof(_Tp) == 8
+			   && std::is_integral_v<_Tp>)
+	  _mm256_maskstore_epi64(reinterpret_cast<_LLong*>(__mem), __ki, __vi);
+	else if constexpr (__have_avx && sizeof(_Tp) == 8)
+	  _mm256_maskstore_pd(reinterpret_cast<double*>(__mem), __ki,
+			      __vector_bitcast<double>(__v));
+	else if constexpr (__have_sse2)
+	  {
+	    _mm_maskmoveu_si128(__lo128(__vi), __lo128(__ki),
+				reinterpret_cast<char*>(__mem));
+	    _mm_maskmoveu_si128(__hi128(__vi), __hi128(__ki),
+				reinterpret_cast<char*>(__mem) + 16);
+	  }
+      }
+    else
+      __assert_unreachable<_Tp>();
+  }
+
+  // }}}
+  // __masked_store {{{
+  template <typename _Tp, size_t _Np, typename _Up, typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_store(const _SimdWrapper<_Tp, _Np> __v, _Up* __mem, _Fp,
+		 const _MaskMember<_Tp> __k) noexcept
+  {
+    if constexpr (std::is_integral_v<_Tp> && std::is_integral_v<_Up>
+		  && sizeof(_Tp) > sizeof(_Up)
+		  && __have_avx512f && (sizeof(_Tp) >= 4 || __have_avx512bw)
+		  && (sizeof(__v) == 64 || __have_avx512vl))
+      { // truncating store
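+	// Illustrative example (values assumed for exposition only): with
+	// _Tp = int, _Up = short, four elements and mask bits 0b0101,
+	// elements 0 and 2 are truncated to 16 bits and written to __mem[0]
+	// and __mem[2]; __mem[1] and __mem[3] are left untouched.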
+	const auto __vi = __to_intrin(__v);
+	const auto __kk = _MaskImpl::__to_bits(__k)._M_to_bits();
+	if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 4
+		      && sizeof(__vi) == 64)
+	  _mm512_mask_cvtepi64_storeu_epi32(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 4
+			   && sizeof(__vi) == 32)
+	  _mm256_mask_cvtepi64_storeu_epi32(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 4
+			   && sizeof(__vi) == 16)
+	  _mm_mask_cvtepi64_storeu_epi32(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 2
+			   && sizeof(__vi) == 64)
+	  _mm512_mask_cvtepi64_storeu_epi16(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 2
+			   && sizeof(__vi) == 32)
+	  _mm256_mask_cvtepi64_storeu_epi16(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 2
+			   && sizeof(__vi) == 16)
+	  _mm_mask_cvtepi64_storeu_epi16(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 1
+			   && sizeof(__vi) == 64)
+	  _mm512_mask_cvtepi64_storeu_epi8(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 1
+			   && sizeof(__vi) == 32)
+	  _mm256_mask_cvtepi64_storeu_epi8(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 1
+			   && sizeof(__vi) == 16)
+	  _mm_mask_cvtepi64_storeu_epi8(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 2
+			   && sizeof(__vi) == 64)
+	  _mm512_mask_cvtepi32_storeu_epi16(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 2
+			   && sizeof(__vi) == 32)
+	  _mm256_mask_cvtepi32_storeu_epi16(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 2
+			   && sizeof(__vi) == 16)
+	  _mm_mask_cvtepi32_storeu_epi16(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 1
+			   && sizeof(__vi) == 64)
+	  _mm512_mask_cvtepi32_storeu_epi8(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 1
+			   && sizeof(__vi) == 32)
+	  _mm256_mask_cvtepi32_storeu_epi8(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 1
+			   && sizeof(__vi) == 16)
+	  _mm_mask_cvtepi32_storeu_epi8(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 2 && sizeof(_Up) == 1
+			   && sizeof(__vi) == 64)
+	  _mm512_mask_cvtepi16_storeu_epi8(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 2 && sizeof(_Up) == 1
+			   && sizeof(__vi) == 32)
+	  _mm256_mask_cvtepi16_storeu_epi8(__mem, __kk, __vi);
+	else if constexpr (sizeof(_Tp) == 2 && sizeof(_Up) == 1
+			   && sizeof(__vi) == 16)
+	  _mm_mask_cvtepi16_storeu_epi8(__mem, __kk, __vi);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      _Base::__masked_store(__v, __mem, _Fp(), __k);
+  }
+
+  // }}}
+  // __multiplies {{{
+  template <typename _V, typename _VVT = _VectorTraits<_V>>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _V __multiplies(_V __x, _V __y)
+  {
+    using _Tp = typename _VVT::value_type;
+    if (__builtin_is_constant_evaluated() || __x._M_is_constprop()
+	|| __y._M_is_constprop())
+      return __as_vector(__x) * __as_vector(__y);
+    else if constexpr (sizeof(_Tp) == 1)
+      {
+	if constexpr (sizeof(_V) == 2)
+	  {
+	    const auto __xs = reinterpret_cast<short>(__x._M_data);
+	    const auto __ys = reinterpret_cast<short>(__y._M_data);
+	    return reinterpret_cast<__vector_type_t<_Tp, 2>>(
+	      short(((__xs * __ys) & 0xff) | ((__xs >> 8) * (__ys & 0xff00))));
+	  }
+	else if constexpr (sizeof(_V) == 4 && _VVT::_S_partial_width == 3)
+	  {
+	    const auto __xi = reinterpret_cast<int>(__x._M_data);
+	    const auto __yi = reinterpret_cast<int>(__y._M_data);
+	    return reinterpret_cast<__vector_type_t<_Tp, 3>>(
+	      ((__xi * __yi) & 0xff)
+	      | (((__xi >> 8) * (__yi & 0xff00)) & 0xff00)
+	      | ((__xi >> 16) * (__yi & 0xff0000)));
+	  }
+	else if constexpr (sizeof(_V) == 4)
+	  {
+	    const auto __xi = reinterpret_cast<int>(__x._M_data);
+	    const auto __yi = reinterpret_cast<int>(__y._M_data);
+	    return reinterpret_cast<__vector_type_t<_Tp, 4>>(
+	      ((__xi * __yi) & 0xff)
+	      | (((__xi >> 8) * (__yi & 0xff00)) & 0xff00)
+	      | (((__xi >> 16) * (__yi & 0xff0000)) & 0xff0000)
+	      | ((__xi >> 24) * (__yi & 0xff000000u)));
+	  }
+	else if constexpr (sizeof(_V) == 8 && __have_avx2
+			   && std::is_signed_v<_Tp>)
+	  return __convert<typename _VVT::type>(
+	    __vector_bitcast<short>(_mm_cvtepi8_epi16(__to_intrin(__x)))
+	    * __vector_bitcast<short>(_mm_cvtepi8_epi16(__to_intrin(__y))));
+	else if constexpr (sizeof(_V) == 8 && __have_avx2
+			   && std::is_unsigned_v<_Tp>)
+	  return __convert<typename _VVT::type>(
+	    __vector_bitcast<short>(_mm_cvtepu8_epi16(__to_intrin(__x)))
+	    * __vector_bitcast<short>(_mm_cvtepu8_epi16(__to_intrin(__y))));
+	else
+	  {
+	    // codegen of `x*y` is suboptimal (as of GCC 9.0.1)
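+	    // Sketch of the approach, per 16-bit lane holding the bytes
+	    // (lo, hi): __even = x * y has the correct low byte lo_x * lo_y,
+	    // and __odd = (x >> 8) * (y & 0xff00) = hi_x * hi_y * 256 has the
+	    // correct high byte. Example with values assumed for exposition,
+	    // x = (3, 2) and y = (5, 7): __even = 0x0203 * 0x0705 has low
+	    // byte 0x0f (3 * 5), __odd = 0x02 * 0x0700 = 0x0e00 has high
+	    // byte 0x0e (2 * 7), and the blend yields 0x0e0f.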
+	    constexpr size_t __full_size = _VVT::_S_width;
+	    constexpr int _Np = sizeof(_V) >= 16 ? __full_size / 2 : 8;
+	    using _ShortW = _SimdWrapper<short, _Np>;
+	    const _ShortW __even = __vector_bitcast<short, _Np>(__x)
+				   * __vector_bitcast<short, _Np>(__y);
+	    _ShortW __high_byte = _ShortW()._M_data - 256;
+	    //[&]() { asm("" : "+x"(__high_byte._M_data)); }();
+	    const _ShortW __odd
+	      = (__vector_bitcast<short, _Np>(__x) >> 8)
+		* (__vector_bitcast<short, _Np>(__y) & __high_byte._M_data);
+	    if constexpr (__have_avx512bw && sizeof(_V) > 2)
+	      return _CommonImplX86::_S_blend_avx512(
+		0xaaaa'aaaa'aaaa'aaaaLL, __vector_bitcast<_Tp>(__even),
+		__vector_bitcast<_Tp>(__odd));
+	    else if constexpr (__have_sse4_1 && sizeof(_V) > 2)
+	      return _CommonImplX86::_S_blend_intrin(__to_intrin(__high_byte),
+						     __to_intrin(__even),
+						     __to_intrin(__odd));
+	    else
+	      return __to_intrin(__or(__andnot(__high_byte, __even), __odd));
+	  }
+      }
+    else
+      return _Base::__multiplies(__x, __y);
+  }
+
+  // }}}
+  // __divides {{{
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR90993
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __divides(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    if (!__builtin_is_constant_evaluated()
+	&& !__builtin_constant_p(__y._M_data))
+      if constexpr (is_integral_v<_Tp> && sizeof(_Tp) <= 4)
+	{ // use divps - codegen of `x/y` is suboptimal (as of GCC 9.0.1)
+	  // Note that floating-point division is likely to raise the
+	  // *Inexact* exception flag, so this may look like an invalid "as-if"
+	  // transformation. However, C++ doesn't specify how the fpenv can be
+	  // observed and defers to C. C says that function calls are assumed to
+	  // potentially raise fp exceptions unless documented otherwise.
+	  // Consequently, operator/, which is a function call, may raise fp
+	  // exceptions.
+	  /*const struct _CsrGuard
+	  {
+	    const unsigned _M_data = _mm_getcsr();
+	    _CsrGuard()
+	    {
+	      _mm_setcsr(0x9f80); // turn off FP exceptions and flush-to-zero
+	    }
+	    ~_CsrGuard() { _mm_setcsr(_M_data); }
+	  } __csr;*/
+	  using _Float = conditional_t<sizeof(_Tp) == 4, double, float>;
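+	  // Presumably (an assumption about the intent, not stated in the
+	  // patch) the intermediate type must represent every _Tp value
+	  // exactly: float has a 24-bit significand, which covers 8- and
+	  // 16-bit integers but not 32-bit ones. E.g. 16777217 (2^24 + 1)
+	  // rounds to 16777216 in float, so 16777217 / 1 would come out wrong;
+	  // double, with its 53-bit significand, holds it exactly.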
+	  constexpr size_t __n_intermediate
+	    = std::min(_Np, (__have_avx512f ? 64 : __have_avx ? 32 : 16)
+			      / sizeof(_Float));
+	  using _FloatV = __vector_type_t<_Float, __n_intermediate>;
+	  constexpr size_t __n_floatv = __div_roundup(_Np, __n_intermediate);
+	  using _R = __vector_type_t<_Tp, _Np>;
+	  const auto __xf = __convert_all<_FloatV, __n_floatv>(__x);
+	  const auto __yf = __convert_all<_FloatV, __n_floatv>(
+	    _Abi::__make_padding_nonzero(__as_vector(__y)));
+	  return __call_with_n_evaluations<__n_floatv>(
+	    [](auto... __quotients) {
+	      return __vector_convert<_R>(__quotients...);
+	    },
+	    [&__xf, &__yf](auto __i) { return __xf[__i] / __yf[__i]; });
+	}
+    /* 64-bit int division is potentially optimizable via double division if
+     * the value in __x is small enough and the conversion between
+     * int<->double is efficient enough:
+    else if constexpr (is_integral_v<_Tp> && is_unsigned_v<_Tp> &&
+		       sizeof(_Tp) == 8)
+      {
+	if constexpr (__have_sse4_1 && sizeof(__x) == 16)
+	  {
+	    if (_mm_test_all_zeros(__x, __m128i{0xffe0'0000'0000'0000ull,
+						0xffe0'0000'0000'0000ull}))
+	      {
+		__x._M_data | 0x __vector_convert<__m128d>(__x._M_data)
+	      }
+	  }
+      }
+      */
+    return _Base::__divides(__x, __y);
+  }
+#endif // _GLIBCXX_SIMD_WORKAROUND_PR90993
+
+  // }}}
+  // __modulus {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __modulus(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
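+    // Worked example of the identity used below, x % y == x - y * (x / y),
+    // which holds because integer division truncates toward zero:
+    //    7 % 3 ==  7 - 3 * ( 7 / 3) ==  7 - 3 *  2 ==  1
+    //   -7 % 3 == -7 - 3 * (-7 / 3) == -7 - 3 * -2 == -1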
+    if (__builtin_is_constant_evaluated() || __builtin_constant_p(__y._M_data)
+	|| sizeof(_Tp) >= 8)
+      return _Base::__modulus(__x, __y);
+    else
+      return _Base::__minus(__x, __multiplies(__y, __divides(__x, __y)));
+  }
+
+  // }}}
+  // __bit_shift_left {{{
+  // Notes on UB. C++2a [expr.shift] says:
+  // -1- [...] The operands shall be of integral or unscoped enumeration type
+  //     and integral promotions are performed. The type of the result is that
+  //     of the promoted left operand. The behavior is undefined if the right
+  //     operand is negative, or greater than or equal to the width of the
+  //     promoted left operand.
+  // -2- The value of E1 << E2 is the unique value congruent to E1×2^E2 modulo
+  //     2^N, where N is the width of the type of the result.
+  //
+  // C++17 [expr.shift] says:
+  // -2- The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated
+  //     bits are zero-filled. If E1 has an unsigned type, the value of the
+  //     result is E1 × 2^E2 , reduced modulo one more than the maximum value
+  //     representable in the result type. Otherwise, if E1 has a signed type
+  //     and non-negative value, and E1 × 2^E2 is representable in the
+  //     corresponding unsigned type of the result type, then that value,
+  //     converted to the result type, is the resulting value; otherwise, the
+  //     behavior is undefined.
+  //
+  // Consequences:
+  // With C++2a signed and unsigned types have the same UB
+  // characteristics:
+  // - left shift is not UB for 0 <= RHS < max(32, #bits(T))
+  //
+  // With C++17 there's little room for optimizations because the standard
+  // requires all shifts to happen on promoted integrals (i.e. int). Thus,
+  // short and char shifts must assume shifts affect bits of neighboring
+  // values.
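+  //
+  // Illustrative example of the neighboring-bits problem (little-endian,
+  // values chosen for exposition): shifting the two chars {0x81, 0x01} as
+  // one 16-bit value gives 0x0181 << 1 == 0x0302, i.e. {0x02, 0x03}, but
+  // the element-wise result is {0x02, 0x02}. Masking every byte with
+  // (0xff << 1) & 0xff == 0xfe removes the bit that leaked in from the
+  // neighbor.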
+#ifndef _GLIBCXX_SIMD_NO_SHIFT_OPT
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  inline _GLIBCXX_CONST static typename _TVT::type __bit_shift_left(_Tp __xx,
+								    int __y)
+  {
+    using _V = typename _TVT::type;
+    using _Up = typename _TVT::value_type;
+    _V __x = __xx;
+    [[maybe_unused]] const auto __ix = __to_intrin(__x);
+    if (__builtin_is_constant_evaluated())
+      return __x << __y;
+#if __cplusplus > 201703
+    // after C++17, left-shifting a signed value is no longer UB on overflow;
+    // it behaves just like the corresponding unsigned shift
+    else if constexpr (sizeof(_Up) == 1 && is_signed_v<_Up>)
+      return __vector_bitcast<_Up>(
+	__bit_shift_left(__vector_bitcast<make_unsigned_t<_Up>>(__x), __y));
+#endif
+    else if constexpr (sizeof(_Up) == 1)
+      {
+	// (cf. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83894)
+	if (__builtin_constant_p(__y))
+	  {
+	    if (__y == 0)
+	      return __x;
+	    else if (__y == 1)
+	      return __x + __x;
+	    else if (__y == 2)
+	      {
+		__x = __x + __x;
+		return __x + __x;
+	      }
+	    else if (__y > 2 && __y < 8)
+	      {
+		if constexpr (sizeof(__x) > sizeof(unsigned))
+		  {
+		    const _UChar __mask = 0xff << __y; // precomputed vector
+		    return __vector_bitcast<_Up>(
+		      __vector_bitcast<_UChar>(__vector_bitcast<unsigned>(__x)
+					       << __y)
+		      & __mask);
+		  }
+		else
+		  {
+		    const unsigned __mask
+		      = (0xff & (0xff << __y)) * 0x01010101u;
+		    return reinterpret_cast<_V>(
+		      static_cast<__int_for_sizeof_t<_V>>(
+			unsigned(reinterpret_cast<__int_for_sizeof_t<_V>>(__x)
+				 << __y)
+			& __mask));
+		  }
+	      }
+	    else if (__y >= 8 && __y < 32)
+	      return _V();
+	    else
+	      __builtin_unreachable();
+	  }
+	// general strategy in the following: use an sllv instead of sll
+	// instruction, because it's 2 to 4 times faster:
+	else if constexpr (__have_avx512bw_vl && sizeof(__x) == 16)
+	  return __vector_bitcast<_Up>(
+	    _mm256_cvtepi16_epi8(_mm256_sllv_epi16(_mm256_cvtepi8_epi16(__ix),
+						   _mm256_set1_epi16(__y))));
+	else if constexpr (__have_avx512bw && sizeof(__x) == 32)
+	  return __vector_bitcast<_Up>(
+	    _mm512_cvtepi16_epi8(_mm512_sllv_epi16(_mm512_cvtepi8_epi16(__ix),
+						   _mm512_set1_epi16(__y))));
+	else if constexpr (__have_avx512bw && sizeof(__x) == 64)
+	  {
+	    const auto __shift = _mm512_set1_epi16(__y);
+	    return __vector_bitcast<_Up>(
+	      __concat(_mm512_cvtepi16_epi8(_mm512_sllv_epi16(
+			 _mm512_cvtepi8_epi16(__lo256(__ix)), __shift)),
+		       _mm512_cvtepi16_epi8(_mm512_sllv_epi16(
+			 _mm512_cvtepi8_epi16(__hi256(__ix)), __shift))));
+	  }
+	else if constexpr (__have_avx2 && sizeof(__x) == 32)
+	  {
+#if 1
+	    const auto __shift = _mm_cvtsi32_si128(__y);
+	    auto __k
+	      = _mm256_sll_epi16(_mm256_slli_epi16(~__m256i(), 8), __shift);
+	    __k |= _mm256_srli_epi16(__k, 8);
+	    return __vector_bitcast<_Up>(_mm256_sll_epi32(__ix, __shift) & __k);
+#else
+	    const _Up __k = 0xff << __y;
+	    return __vector_bitcast<_Up>(__vector_bitcast<int>(__x) << __y)
+		   & __k;
+#endif
+	  }
+	else
+	  {
+	    const auto __shift = _mm_cvtsi32_si128(__y);
+	    auto __k = _mm_sll_epi16(_mm_slli_epi16(~__m128i(), 8), __shift);
+	    __k |= _mm_srli_epi16(__k, 8);
+	    return __intrin_bitcast<_V>(_mm_sll_epi16(__ix, __shift) & __k);
+	  }
+      }
+    return __x << __y;
+  }
+
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  inline _GLIBCXX_CONST static typename _TVT::type
+  __bit_shift_left(_Tp __xx, typename _TVT::type __y)
+  {
+    using _V = typename _TVT::type;
+    using _Up = typename _TVT::value_type;
+    _V __x = __xx;
+    [[maybe_unused]] const auto __ix = __to_intrin(__x);
+    [[maybe_unused]] const auto __iy = __to_intrin(__y);
+    if (__builtin_is_constant_evaluated())
+      return __x << __y;
+#if __cplusplus > 201703
+    // after C++17, left-shifting a signed value is no longer UB on overflow;
+    // it behaves just like the corresponding unsigned shift
+    else if constexpr (is_signed_v<_Up>)
+      return __vector_bitcast<_Up>(
+	__bit_shift_left(__vector_bitcast<make_unsigned_t<_Up>>(__x),
+			 __vector_bitcast<make_unsigned_t<_Up>>(__y)));
+#endif
+    else if constexpr (sizeof(_Up) == 1)
+      {
+	if constexpr (sizeof __ix == 64 && __have_avx512bw)
+	  return __vector_bitcast<_Up>(
+	    __concat(_mm512_cvtepi16_epi8(
+		       _mm512_sllv_epi16(_mm512_cvtepu8_epi16(__lo256(__ix)),
+					 _mm512_cvtepu8_epi16(__lo256(__iy)))),
+		     _mm512_cvtepi16_epi8(_mm512_sllv_epi16(
+		       _mm512_cvtepu8_epi16(__hi256(__ix)),
+		       _mm512_cvtepu8_epi16(__hi256(__iy))))));
+	else if constexpr (sizeof __ix == 32 && __have_avx512bw)
+	  return __vector_bitcast<_Up>(_mm512_cvtepi16_epi8(
+	    _mm512_sllv_epi16(_mm512_cvtepu8_epi16(__ix),
+			      _mm512_cvtepu8_epi16(__iy))));
+	else if constexpr (sizeof __x <= 8 && __have_avx512bw_vl)
+	  return __intrin_bitcast<_V>(_mm_cvtepi16_epi8(
+	    _mm_sllv_epi16(_mm_cvtepu8_epi16(__ix), _mm_cvtepu8_epi16(__iy))));
+	else if constexpr (sizeof __ix == 16 && __have_avx512bw_vl)
+	  return __intrin_bitcast<_V>(_mm256_cvtepi16_epi8(
+	    _mm256_sllv_epi16(_mm256_cvtepu8_epi16(__ix),
+			      _mm256_cvtepu8_epi16(__iy))));
+	else if constexpr (sizeof __ix == 16 && __have_avx512bw)
+	  return __intrin_bitcast<_V>(
+	    __lo128(_mm512_cvtepi16_epi8(_mm512_sllv_epi16(
+	      _mm512_cvtepu8_epi16(_mm256_castsi128_si256(__ix)),
+	      _mm512_cvtepu8_epi16(_mm256_castsi128_si256(__iy))))));
+	else if constexpr (__have_sse4_1 && sizeof(__x) == 16)
+	  {
+	    auto __mask
+	      = __vector_bitcast<_Up>(__vector_bitcast<short>(__y) << 5);
+	    auto __x4
+	      = __vector_bitcast<_Up>(__vector_bitcast<short>(__x) << 4);
+	    __x4 &= char(0xf0);
+	    __x = reinterpret_cast<_V>(_CommonImplX86::_S_blend_intrin(
+	      __to_intrin(__mask), __to_intrin(__x), __to_intrin(__x4)));
+	    __mask += __mask;
+	    auto __x2
+	      = __vector_bitcast<_Up>(__vector_bitcast<short>(__x) << 2);
+	    __x2 &= char(0xfc);
+	    __x = reinterpret_cast<_V>(_CommonImplX86::_S_blend_intrin(
+	      __to_intrin(__mask), __to_intrin(__x), __to_intrin(__x2)));
+	    __mask += __mask;
+	    auto __x1 = __x + __x;
+	    __x = reinterpret_cast<_V>(_CommonImplX86::_S_blend_intrin(
+	      __to_intrin(__mask), __to_intrin(__x), __to_intrin(__x1)));
+	    return __x & ((__y & char(0xf8)) == 0); // y > 7 nulls the result
+	  }
+	else if constexpr (sizeof(__x) == 16)
+	  {
+	    auto __mask
+	      = __vector_bitcast<_UChar>(__vector_bitcast<short>(__y) << 5);
+	    auto __x4
+	      = __vector_bitcast<_Up>(__vector_bitcast<short>(__x) << 4);
+	    __x4 &= char(0xf0);
+	    __x = __vector_bitcast<_SChar>(__mask) < 0 ? __x4 : __x;
+	    __mask += __mask;
+	    auto __x2
+	      = __vector_bitcast<_Up>(__vector_bitcast<short>(__x) << 2);
+	    __x2 &= char(0xfc);
+	    __x = __vector_bitcast<_SChar>(__mask) < 0 ? __x2 : __x;
+	    __mask += __mask;
+	    auto __x1 = __x + __x;
+	    __x = __vector_bitcast<_SChar>(__mask) < 0 ? __x1 : __x;
+	    return __x & ((__y & char(0xf8)) == 0); // y > 7 nulls the result
+	  }
+	else
+	  return __x << __y;
+      }
+    else if constexpr (sizeof(_Up) == 2)
+      {
+	if constexpr (sizeof __ix == 64 && __have_avx512bw)
+	  return __vector_bitcast<_Up>(_mm512_sllv_epi16(__ix, __iy));
+	else if constexpr (sizeof __ix == 32 && __have_avx512bw_vl)
+	  return __vector_bitcast<_Up>(_mm256_sllv_epi16(__ix, __iy));
+	else if constexpr (sizeof __ix == 32 && __have_avx512bw)
+	  return __vector_bitcast<_Up>(
+	    __lo256(_mm512_sllv_epi16(_mm512_castsi256_si512(__ix),
+				      _mm512_castsi256_si512(__iy))));
+	else if constexpr (sizeof __ix == 32 && __have_avx2)
+	  {
+	    const auto __ux = __vector_bitcast<unsigned>(__x);
+	    const auto __uy = __vector_bitcast<unsigned>(__y);
+	    return __vector_bitcast<_Up>(_mm256_blend_epi16(
+	      __auto_bitcast(__ux << (__uy & 0x0000ffffu)),
+	      __auto_bitcast((__ux & 0xffff0000u) << (__uy >> 16)), 0xaa));
+	  }
+	else if constexpr (sizeof __ix == 16 && __have_avx512bw_vl)
+	  return __intrin_bitcast<_V>(_mm_sllv_epi16(__ix, __iy));
+	else if constexpr (sizeof __ix == 16 && __have_avx512bw)
+	  return __intrin_bitcast<_V>(
+	    __lo128(_mm512_sllv_epi16(_mm512_castsi128_si512(__ix),
+				      _mm512_castsi128_si512(__iy))));
+	else if constexpr (sizeof __ix == 16 && __have_avx2)
+	  {
+	    const auto __ux = __vector_bitcast<unsigned>(__ix);
+	    const auto __uy = __vector_bitcast<unsigned>(__iy);
+	    return __intrin_bitcast<_V>(_mm_blend_epi16(
+	      __auto_bitcast(__ux << (__uy & 0x0000ffffu)),
+	      __auto_bitcast((__ux & 0xffff0000u) << (__uy >> 16)), 0xaa));
+	  }
+	else if constexpr (sizeof __ix == 16)
+	  {
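+	    // Sketch of the idea: x << y == x * 2^y. Adding the binary32
+	    // exponent bias (0x3f8 >> 3 == 127) to y and moving the sum into
+	    // the exponent field (bit 23 and up) produces the float 2^y. E.g.
+	    // for y == 3: (3 + 127) << 23 == 0x4100'0000, the bit pattern of
+	    // 8.0f; converting back to int gives 8, and x * 8 == x << 3. The
+	    // low and high 16-bit halves of each 32-bit lane are handled
+	    // separately and then combined.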
+	    __y += 0x3f8 >> 3;
+	    return __x
+		   * __intrin_bitcast<_V>(
+		     __vector_convert<__vector_type16_t<int>>(
+		       __vector_bitcast<float>(
+			 __vector_bitcast<unsigned>(__to_intrin(__y)) << 23))
+		     | (__vector_convert<__vector_type16_t<int>>(
+			  __vector_bitcast<float>(
+			    (__vector_bitcast<unsigned>(__to_intrin(__y)) >> 16)
+			    << 23))
+			<< 16));
+	  }
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (sizeof(_Up) == 4 && sizeof __ix == 16 && !__have_avx2)
+      // latency is suboptimal, but throughput is at full speedup
+      return __intrin_bitcast<_V>(
+	__vector_bitcast<unsigned>(__ix)
+	* __vector_convert<__vector_type16_t<int>>(__vector_bitcast<float>(
+	  (__vector_bitcast<unsigned, 4>(__y) << 23) + 0x3f80'0000)));
+    else if constexpr (sizeof(_Up) == 8 && sizeof __ix == 16 && !__have_avx2)
+      {
+	const auto __lo = _mm_sll_epi64(__ix, __iy);
+	const auto __hi = _mm_sll_epi64(__ix, _mm_unpackhi_epi64(__iy, __iy));
+	if constexpr (__have_sse4_1)
+	  return __vector_bitcast<_Up>(_mm_blend_epi16(__lo, __hi, 0xf0));
+	else
+	  return __vector_bitcast<_Up>(
+	    _mm_move_sd(__vector_bitcast<double>(__hi),
+			__vector_bitcast<double>(__lo)));
+      }
+    else
+      return __x << __y;
+  }
+#endif // _GLIBCXX_SIMD_NO_SHIFT_OPT
+
+  // }}}
+  // __bit_shift_right {{{
+#ifndef _GLIBCXX_SIMD_NO_SHIFT_OPT
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  inline _GLIBCXX_CONST static typename _TVT::type __bit_shift_right(_Tp __xx,
+								     int __y)
+  {
+    using _V = typename _TVT::type;
+    using _Up = typename _TVT::value_type;
+    _V __x = __xx;
+    [[maybe_unused]] const auto __ix = __to_intrin(__x);
+    if (__builtin_is_constant_evaluated())
+      return __x >> __y;
+    else if (__builtin_constant_p(__y)
+	     && std::is_unsigned_v<_Up> && __y >= int(sizeof(_Up) * CHAR_BIT))
+      return _V();
+    else if constexpr (sizeof(_Up) == 1 && is_unsigned_v<_Up>) //{{{
+      return __intrin_bitcast<_V>(__vector_bitcast<_UShort>(__ix) >> __y)
+	     & _Up(0xff >> __y);
+    //}}}
+    else if constexpr (sizeof(_Up) == 1 && is_signed_v<_Up>) //{{{
+      return __intrin_bitcast<_V>(
+	(__vector_bitcast<_UShort>(__vector_bitcast<short>(__ix) >> (__y + 8))
+	 << 8)
+	| (__vector_bitcast<_UShort>(
+	     __vector_bitcast<short>(__vector_bitcast<_UShort>(__ix) << 8)
+	     >> __y)
+	   >> 8));
+    //}}}
+    // GCC optimizes sizeof == 2, 4, and unsigned 8 as expected
+    else if constexpr (sizeof(_Up) == 8 && is_signed_v<_Up>) //{{{
+      {
+	if (__y > 32)
+	  return (__intrin_bitcast<_V>(__vector_bitcast<int>(__ix) >> 32)
+		  & _Up(0xffff'ffff'0000'0000ull))
+		 | __vector_bitcast<_Up>(
+		   __vector_bitcast<int>(__vector_bitcast<_ULLong>(__ix) >> 32)
+		   >> (__y - 32));
+	else
+	  return __intrin_bitcast<_V>(__vector_bitcast<_ULLong>(__ix) >> __y)
+		 | __vector_bitcast<_Up>(
+		   __vector_bitcast<int>(__ix & -0x8000'0000'0000'0000ll)
+		   >> __y);
+      }
+    //}}}
+    else
+      return __x >> __y;
+  }
+
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  inline _GLIBCXX_CONST static typename _TVT::type
+  __bit_shift_right(_Tp __xx, typename _TVT::type __y)
+  {
+    using _V = typename _TVT::type;
+    using _Up = typename _TVT::value_type;
+    _V __x = __xx;
+    [[maybe_unused]] const auto __ix = __to_intrin(__x);
+    [[maybe_unused]] const auto __iy = __to_intrin(__y);
+    if (__builtin_is_constant_evaluated()
+	|| (__builtin_constant_p(__x) && __builtin_constant_p(__y)))
+      return __x >> __y;
+    else if constexpr (sizeof(_Up) == 1) //{{{
+      {
+	if constexpr (sizeof(__x) <= 8 && __have_avx512bw_vl)
+	  return __intrin_bitcast<_V>(_mm_cvtepi16_epi8(
+	    is_signed_v<_Up>
+	      ? _mm_srav_epi16(_mm_cvtepi8_epi16(__ix), _mm_cvtepi8_epi16(__iy))
+	      : _mm_srlv_epi16(_mm_cvtepu8_epi16(__ix),
+			       _mm_cvtepu8_epi16(__iy))));
+	else if constexpr (sizeof(__x) == 16 && __have_avx512bw_vl)
+	  return __intrin_bitcast<_V>(_mm256_cvtepi16_epi8(
+	    is_signed_v<_Up> ? _mm256_srav_epi16(_mm256_cvtepi8_epi16(__ix),
+						 _mm256_cvtepi8_epi16(__iy))
+			     : _mm256_srlv_epi16(_mm256_cvtepu8_epi16(__ix),
+						 _mm256_cvtepu8_epi16(__iy))));
+	else if constexpr (sizeof(__x) == 32 && __have_avx512bw)
+	  return __vector_bitcast<_Up>(_mm512_cvtepi16_epi8(
+	    is_signed_v<_Up> ? _mm512_srav_epi16(_mm512_cvtepi8_epi16(__ix),
+						 _mm512_cvtepi8_epi16(__iy))
+			     : _mm512_srlv_epi16(_mm512_cvtepu8_epi16(__ix),
+						 _mm512_cvtepu8_epi16(__iy))));
+	else if constexpr (sizeof(__x) == 64 && is_signed_v<_Up>)
+	  return __vector_bitcast<_Up>(_mm512_mask_mov_epi8(
+	    _mm512_srav_epi16(__ix, _mm512_srli_epi16(__iy, 8)),
+	    0x5555'5555'5555'5555ull,
+	    _mm512_srav_epi16(_mm512_slli_epi16(__ix, 8),
+			      _mm512_maskz_add_epi8(0x5555'5555'5555'5555ull,
+						    __iy,
+						    _mm512_set1_epi16(8)))));
+	else if constexpr (sizeof(__x) == 64 && is_unsigned_v<_Up>)
+	  return __vector_bitcast<_Up>(_mm512_mask_mov_epi8(
+	    _mm512_srlv_epi16(__ix, _mm512_srli_epi16(__iy, 8)),
+	    0x5555'5555'5555'5555ull,
+	    _mm512_srlv_epi16(
+	      _mm512_maskz_mov_epi8(0x5555'5555'5555'5555ull, __ix),
+	      _mm512_maskz_mov_epi8(0x5555'5555'5555'5555ull, __iy))));
+	/* This has better throughput but higher latency than the impl below
+	else if constexpr (__have_avx2 && sizeof(__x) == 16 &&
+			   is_unsigned_v<_Up>)
+	  {
+	    const auto __shorts = __to_intrin(__bit_shift_right(
+	      __vector_bitcast<_UShort>(_mm256_cvtepu8_epi16(__ix)),
+	      __vector_bitcast<_UShort>(_mm256_cvtepu8_epi16(__iy))));
+	    return __vector_bitcast<_Up>(
+	      _mm_packus_epi16(__lo128(__shorts), __hi128(__shorts)));
+	  }
+	  */
+	else if constexpr (__have_avx2 && sizeof(__x) > 8)
+	  // the following uses vpsr[al]vd, which requires AVX2
+	  if constexpr (is_signed_v<_Up>)
+	    {
+	      const auto __r3 = __vector_bitcast<_UInt>(
+				(__vector_bitcast<int>(__x)
+				 >> (__vector_bitcast<_UInt>(__y) >> 24)))
+			      & 0xff000000u;
+	      const auto __r2 = __vector_bitcast<_UInt>((
+				(__vector_bitcast<int>(__x) << 8)
+				>> ((__vector_bitcast<_UInt>(__y) << 8) >> 24)))
+			      & 0xff000000u;
+	      const auto __r1
+		= __vector_bitcast<_UInt>(
+		    ((__vector_bitcast<int>(__x) << 16)
+		     >> ((__vector_bitcast<_UInt>(__y) << 16) >> 24)))
+		  & 0xff000000u;
+	      const auto __r0 = __vector_bitcast<_UInt>(
+		(__vector_bitcast<int>(__x) << 24)
+		>> ((__vector_bitcast<_UInt>(__y) << 24) >> 24));
+	      return __vector_bitcast<_Up>(__r3 | (__r2 >> 8) | (__r1 >> 16)
+					   | (__r0 >> 24));
+	    }
+	  else
+	    {
+	      const auto __r3 = (__vector_bitcast<_UInt>(__x)
+			       >> (__vector_bitcast<_UInt>(__y) >> 24))
+			      & 0xff000000u;
+	      const auto __r2 = ((__vector_bitcast<_UInt>(__x) << 8)
+			       >> ((__vector_bitcast<_UInt>(__y) << 8) >> 24))
+			      & 0xff000000u;
+	      const auto __r1 = ((__vector_bitcast<_UInt>(__x) << 16)
+			       >> ((__vector_bitcast<_UInt>(__y) << 16) >> 24))
+			      & 0xff000000u;
+	      const auto __r0 = (__vector_bitcast<_UInt>(__x) << 24)
+			      >> ((__vector_bitcast<_UInt>(__y) << 24) >> 24);
+	      return __vector_bitcast<_Up>(__r3 | (__r2 >> 8) | (__r1 >> 16)
+					   | (__r0 >> 24));
+	    }
+	else if constexpr (__have_sse4_1
+			   && is_unsigned_v<_Up> && sizeof(__x) > 2)
+	  {
+	    auto __x128 = __vector_bitcast<_Up>(__ix);
+	    auto __mask
+	      = __vector_bitcast<_Up>(__vector_bitcast<_UShort>(__iy) << 5);
+	    auto __x4 = __vector_bitcast<_Up>(
+	      (__vector_bitcast<_UShort>(__x128) >> 4) & _UShort(0xff0f));
+	    __x128 = __vector_bitcast<_Up>(_CommonImplX86::_S_blend_intrin(
+	      __to_intrin(__mask), __to_intrin(__x128), __to_intrin(__x4)));
+	    __mask += __mask;
+	    auto __x2 = __vector_bitcast<_Up>(
+	      (__vector_bitcast<_UShort>(__x128) >> 2) & _UShort(0xff3f));
+	    __x128 = __vector_bitcast<_Up>(_CommonImplX86::_S_blend_intrin(
+	      __to_intrin(__mask), __to_intrin(__x128), __to_intrin(__x2)));
+	    __mask += __mask;
+	    auto __x1 = __vector_bitcast<_Up>(
+	      (__vector_bitcast<_UShort>(__x128) >> 1) & _UShort(0xff7f));
+	    __x128 = __vector_bitcast<_Up>(_CommonImplX86::_S_blend_intrin(
+	      __to_intrin(__mask), __to_intrin(__x128), __to_intrin(__x1)));
+	    return __intrin_bitcast<_V>(
+	      __x128
+	      & ((__vector_bitcast<_Up>(__iy) & char(0xf8))
+		 == 0)); // y > 7 nulls the result
+	  }
+	else if constexpr (__have_sse4_1 && is_signed_v<_Up> && sizeof(__x) > 2)
+	  {
+	    auto __mask
+	      = __vector_bitcast<_UChar>(__vector_bitcast<_UShort>(__iy) << 5);
+	    auto __maskl = [&]() {
+	      return __to_intrin(__vector_bitcast<_UShort>(__mask) << 8);
+	    };
+	    auto __xh = __vector_bitcast<short>(__ix);
+	    auto __xl = __vector_bitcast<short>(__ix) << 8;
+	    auto __xh4 = __xh >> 4;
+	    auto __xl4 = __xl >> 4;
+	    __xh = __vector_bitcast<short>(_CommonImplX86::_S_blend_intrin(
+	      __to_intrin(__mask), __to_intrin(__xh), __to_intrin(__xh4)));
+	    __xl = __vector_bitcast<short>(
+	      _CommonImplX86::_S_blend_intrin(__maskl(), __to_intrin(__xl),
+					      __to_intrin(__xl4)));
+	    __mask += __mask;
+	    auto __xh2 = __xh >> 2;
+	    auto __xl2 = __xl >> 2;
+	    __xh = __vector_bitcast<short>(_CommonImplX86::_S_blend_intrin(
+	      __to_intrin(__mask), __to_intrin(__xh), __to_intrin(__xh2)));
+	    __xl = __vector_bitcast<short>(
+	      _CommonImplX86::_S_blend_intrin(__maskl(), __to_intrin(__xl),
+					      __to_intrin(__xl2)));
+	    __mask += __mask;
+	    auto __xh1 = __xh >> 1;
+	    auto __xl1 = __xl >> 1;
+	    __xh = __vector_bitcast<short>(_CommonImplX86::_S_blend_intrin(
+	      __to_intrin(__mask), __to_intrin(__xh), __to_intrin(__xh1)));
+	    __xl = __vector_bitcast<short>(
+	      _CommonImplX86::_S_blend_intrin(__maskl(), __to_intrin(__xl),
+					      __to_intrin(__xl1)));
+	    return __intrin_bitcast<_V>(
+	      (__vector_bitcast<_Up>((__xh & short(0xff00)))
+	       | __vector_bitcast<_Up>(__vector_bitcast<_UShort>(__xl) >> 8))
+	      & ((__vector_bitcast<_Up>(__iy) & char(0xf8))
+		 == 0)); // y > 7 nulls the result
+	  }
+	else if constexpr (is_unsigned_v<_Up> && sizeof(__x) > 2) // SSE2
+	  {
+	    auto __mask
+	      = __vector_bitcast<_Up>(__vector_bitcast<_UShort>(__y) << 5);
+	    auto __x4 = __vector_bitcast<_Up>(
+	      (__vector_bitcast<_UShort>(__x) >> 4) & _UShort(0xff0f));
+	    __x = __mask > 0x7f ? __x4 : __x;
+	    __mask += __mask;
+	    auto __x2 = __vector_bitcast<_Up>(
+	      (__vector_bitcast<_UShort>(__x) >> 2) & _UShort(0xff3f));
+	    __x = __mask > 0x7f ? __x2 : __x;
+	    __mask += __mask;
+	    auto __x1 = __vector_bitcast<_Up>(
+	      (__vector_bitcast<_UShort>(__x) >> 1) & _UShort(0xff7f));
+	    __x = __mask > 0x7f ? __x1 : __x;
+	    return __x & ((__y & char(0xf8)) == 0); // y > 7 nulls the result
+	  }
+	else if constexpr (sizeof(__x) > 2) // signed SSE2
+	  {
+	    static_assert(is_signed_v<_Up>);
+	    auto __maskh = __vector_bitcast<_UShort>(__y) << 5;
+	    auto __maskl = __vector_bitcast<_UShort>(__y) << (5 + 8);
+	    auto __xh = __vector_bitcast<short>(__x);
+	    auto __xl = __vector_bitcast<short>(__x) << 8;
+	    auto __xh4 = __xh >> 4;
+	    auto __xl4 = __xl >> 4;
+	    __xh = __maskh > 0x7fff ? __xh4 : __xh;
+	    __xl = __maskl > 0x7fff ? __xl4 : __xl;
+	    __maskh += __maskh;
+	    __maskl += __maskl;
+	    auto __xh2 = __xh >> 2;
+	    auto __xl2 = __xl >> 2;
+	    __xh = __maskh > 0x7fff ? __xh2 : __xh;
+	    __xl = __maskl > 0x7fff ? __xl2 : __xl;
+	    __maskh += __maskh;
+	    __maskl += __maskl;
+	    auto __xh1 = __xh >> 1;
+	    auto __xl1 = __xl >> 1;
+	    __xh = __maskh > 0x7fff ? __xh1 : __xh;
+	    __xl = __maskl > 0x7fff ? __xl1 : __xl;
+	    __x = __vector_bitcast<_Up>((__xh & short(0xff00)))
+		  | __vector_bitcast<_Up>(__vector_bitcast<_UShort>(__xl) >> 8);
+	    return __x & ((__y & char(0xf8)) == 0); // y > 7 nulls the result
+	  }
+	else
+	  return __x >> __y;
+      }                                                      //}}}
+    else if constexpr (sizeof(_Up) == 2 && sizeof(__x) >= 4) //{{{
+      {
+	[[maybe_unused]] auto __blend_0xaa = [](auto __a, auto __b) {
+	  if constexpr (sizeof(__a) == 16)
+	    return _mm_blend_epi16(__to_intrin(__a), __to_intrin(__b), 0xaa);
+	  else if constexpr (sizeof(__a) == 32)
+	    return _mm256_blend_epi16(__to_intrin(__a), __to_intrin(__b), 0xaa);
+	  else if constexpr (sizeof(__a) == 64)
+	    return _mm512_mask_blend_epi16(0xaaaa'aaaaU, __to_intrin(__a),
+					   __to_intrin(__b));
+	  else
+	    __assert_unreachable<decltype(__a)>();
+	};
+	if constexpr (__have_avx512bw_vl && sizeof(_Tp) <= 16)
+	  return __intrin_bitcast<_V>(is_signed_v<_Up>
+					? _mm_srav_epi16(__ix, __iy)
+					: _mm_srlv_epi16(__ix, __iy));
+	else if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 32)
+	  return __vector_bitcast<_Up>(is_signed_v<_Up>
+					 ? _mm256_srav_epi16(__ix, __iy)
+					 : _mm256_srlv_epi16(__ix, __iy));
+	else if constexpr (__have_avx512bw && sizeof(_Tp) == 64)
+	  return __vector_bitcast<_Up>(is_signed_v<_Up>
+					 ? _mm512_srav_epi16(__ix, __iy)
+					 : _mm512_srlv_epi16(__ix, __iy));
+	else if constexpr (__have_avx2 && is_signed_v<_Up>)
+	  return __intrin_bitcast<_V>(
+	    __blend_0xaa(((__vector_bitcast<int>(__ix) << 16)
+			  >> (__vector_bitcast<int>(__iy) & 0xffffu))
+			   >> 16,
+			 __vector_bitcast<int>(__ix)
+			   >> (__vector_bitcast<int>(__iy) >> 16)));
+	else if constexpr (__have_avx2 && is_unsigned_v<_Up>)
+	  return __intrin_bitcast<_V>(
+	    __blend_0xaa((__vector_bitcast<_UInt>(__ix) & 0xffffu)
+			   >> (__vector_bitcast<_UInt>(__iy) & 0xffffu),
+			 __vector_bitcast<_UInt>(__ix)
+			   >> (__vector_bitcast<_UInt>(__iy) >> 16)));
+	else if constexpr (__have_sse4_1)
+	  {
+	    auto __mask = __vector_bitcast<_UShort>(__iy);
+	    auto __x128 = __vector_bitcast<_Up>(__ix);
+	    //__mask *= 0x0808;
+	    __mask = (__mask << 3) | (__mask << 11);
+	    // do __x128 = 0 where __y[4] is set
+	    __x128 = __vector_bitcast<_Up>(
+	      _mm_blendv_epi8(__to_intrin(__x128), __m128i(),
+			      __to_intrin(__mask)));
+	    // do __x128 >>= 8 where __y[3] is set
+	    __x128 = __vector_bitcast<_Up>(
+	      _mm_blendv_epi8(__to_intrin(__x128), __to_intrin(__x128 >> 8),
+			      __to_intrin(__mask += __mask)));
+	    // do __x128 >>= 4 where __y[2] is set
+	    __x128 = __vector_bitcast<_Up>(
+	      _mm_blendv_epi8(__to_intrin(__x128), __to_intrin(__x128 >> 4),
+			      __to_intrin(__mask += __mask)));
+	    // do __x128 >>= 2 where __y[1] is set
+	    __x128 = __vector_bitcast<_Up>(
+	      _mm_blendv_epi8(__to_intrin(__x128), __to_intrin(__x128 >> 2),
+			      __to_intrin(__mask += __mask)));
+	    // do __x128 >>= 1 where __y[0] is set
+	    return __intrin_bitcast<_V>(
+	      _mm_blendv_epi8(__to_intrin(__x128), __to_intrin(__x128 >> 1),
+			      __to_intrin(__mask + __mask)));
+	  }
+	else
+	  {
+	    auto __k = __vector_bitcast<_UShort>(__iy) << 11;
+	    auto __x128 = __vector_bitcast<_Up>(__ix);
+	    auto __mask = [](__vector_type16_t<_UShort> __kk) {
+	      return __vector_bitcast<short>(__kk) < 0;
+	    };
+	    // do __x128 = 0 where __y[4] is set
+	    __x128 = __mask(__k) ? decltype(__x128)() : __x128;
+	    // do __x128 >>= 8 where __y[3] is set
+	    __x128 = __mask(__k += __k) ? __x128 >> 8 : __x128;
+	    // do __x128 >>= 4 where __y[2] is set
+	    __x128 = __mask(__k += __k) ? __x128 >> 4 : __x128;
+	    // do __x128 >>= 2 where __y[1] is set
+	    __x128 = __mask(__k += __k) ? __x128 >> 2 : __x128;
+	    // do __x128 >>= 1 where __y[0] is set
+	    return __intrin_bitcast<_V>(__mask(__k + __k) ? __x128 >> 1
+							  : __x128);
+	  }
+      }                                                  //}}}
+    else if constexpr (sizeof(_Up) == 4 && !__have_avx2) //{{{
+      {
+	if constexpr (is_unsigned_v<_Up>)
+	  {
+	    // x >> y == x * 2^-y == (x * 2^(31-y)) >> 31
+	    const __m128 __factor_f = reinterpret_cast<__m128>(
+	      0x4f00'0000u - (__vector_bitcast<unsigned, 4>(__y) << 23));
+	    const __m128i __factor
+	      = __builtin_constant_p(__factor_f) ? __to_intrin(
+		  __make_vector<unsigned>(__factor_f[0], __factor_f[1],
+					  __factor_f[2], __factor_f[3]))
+						 : _mm_cvttps_epi32(__factor_f);
+	    const auto __r02
+	      = _mm_srli_epi64(_mm_mul_epu32(__ix, __factor), 31);
+	    const auto __r13 = _mm_mul_epu32(_mm_srli_si128(__ix, 4),
+					     _mm_srli_si128(__factor, 4));
+	    if constexpr (__have_sse4_1)
+	      return __intrin_bitcast<_V>(
+		_mm_blend_epi16(_mm_slli_epi64(__r13, 1), __r02, 0x33));
+	    else
+	      return __intrin_bitcast<_V>(
+		__r02 | _mm_slli_si128(_mm_srli_epi64(__r13, 31), 4));
+	  }
+	else
+	  {
+	    auto __shift = [](auto __a, auto __b) {
+	      if constexpr (is_signed_v<_Up>)
+		return _mm_sra_epi32(__a, __b);
+	      else
+		return _mm_srl_epi32(__a, __b);
+	    };
+	    const auto __r0
+	      = __shift(__ix, _mm_unpacklo_epi32(__iy, __m128i()));
+	    const auto __r1 = __shift(__ix, _mm_srli_epi64(__iy, 32));
+	    const auto __r2
+	      = __shift(__ix, _mm_unpackhi_epi32(__iy, __m128i()));
+	    const auto __r3 = __shift(__ix, _mm_srli_si128(__iy, 12));
+	    if constexpr (__have_sse4_1)
+	      return __intrin_bitcast<_V>(
+		_mm_blend_epi16(_mm_blend_epi16(__r1, __r0, 0x3),
+				_mm_blend_epi16(__r3, __r2, 0x30), 0xf0));
+	    else
+	      return __intrin_bitcast<_V>(_mm_unpacklo_epi64(
+		_mm_unpacklo_epi32(__r0, _mm_srli_si128(__r1, 4)),
+		_mm_unpackhi_epi32(__r2, _mm_srli_si128(__r3, 4))));
+	  }
+      } //}}}
+    else
+      return __x >> __y;
+  }
+#endif // _GLIBCXX_SIMD_NO_SHIFT_OPT
+
+  // }}}
+  // compares {{{
+  // __equal_to {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+  __equal_to(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    if constexpr (__is_avx512_abi<_Abi>()) // {{{
+      {
+	if (__builtin_is_constant_evaluated()
+	    || (__x._M_is_constprop() && __y._M_is_constprop()))
+	  return _MaskImpl::__to_bits(_SimdWrapper<_Tp, _Np>(
+	    __vector_bitcast<_Tp>(__x._M_data == __y._M_data)));
+
+	constexpr auto __k1 = _Abi::template __implicit_mask<_Tp>();
+	[[maybe_unused]] const auto __xi = __to_intrin(__x);
+	[[maybe_unused]] const auto __yi = __to_intrin(__y);
+	if constexpr (std::is_floating_point_v<_Tp>)
+	  {
+	    if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+	      return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_EQ_OQ);
+	    else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+	      return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_EQ_OQ);
+	    else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	      return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_EQ_OQ);
+	    else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	      return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_EQ_OQ);
+	    else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	      return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_EQ_OQ);
+	    else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	      return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_EQ_OQ);
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+	  return _mm512_mask_cmpeq_epi64_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+	  return _mm512_mask_cmpeq_epi32_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 2)
+	  return _mm512_mask_cmpeq_epi16_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 1)
+	  return _mm512_mask_cmpeq_epi8_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_mask_cmpeq_epi64_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_mask_cmpeq_epi32_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 2)
+	  return _mm256_mask_cmpeq_epi16_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 1)
+	  return _mm256_mask_cmpeq_epi8_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_mask_cmpeq_epi64_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return _mm_mask_cmpeq_epi32_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 2)
+	  return _mm_mask_cmpeq_epi16_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 1)
+	  return _mm_mask_cmpeq_epi8_mask(__k1, __xi, __yi);
+	else
+	  __assert_unreachable<_Tp>();
+      } // }}}
+    else if constexpr (!__builtin_is_constant_evaluated() && sizeof(__x) == 8) // {{{
+      {
+	const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
+			    == __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
+	_MaskMember<_Tp> __r64;
+	__builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
+	return __r64;
+      } // }}}
+    else
+      return _Base::__equal_to(__x, __y);
+  }
+
+  // }}}
+  // __not_equal_to {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+  __not_equal_to(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    if constexpr (__is_avx512_abi<_Abi>()) // {{{
+      {
+	if (__builtin_is_constant_evaluated()
+	    || (__x._M_is_constprop() && __y._M_is_constprop()))
+	  return _MaskImpl::__to_bits(_SimdWrapper<_Tp, _Np>(
+	    __vector_bitcast<_Tp>(__x._M_data != __y._M_data)));
+
+	constexpr auto __k1 = _Abi::template __implicit_mask<_Tp>();
+	[[maybe_unused]] const auto __xi = __to_intrin(__x);
+	[[maybe_unused]] const auto __yi = __to_intrin(__y);
+	if constexpr (std::is_floating_point_v<_Tp>)
+	  {
+	    if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+	      return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_NEQ_UQ);
+	    else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+	      return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_NEQ_UQ);
+	    else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	      return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_NEQ_UQ);
+	    else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	      return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_NEQ_UQ);
+	    else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	      return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_NEQ_UQ);
+	    else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	      return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_NEQ_UQ);
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+	  return _mm512_mask_cmpneq_epi64_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+	  return _mm512_mask_cmpneq_epi32_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 2)
+	  return _mm512_mask_cmpneq_epi16_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 1)
+	  return _mm512_mask_cmpneq_epi8_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_mask_cmpneq_epi64_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_mask_cmpneq_epi32_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 2)
+	  return _mm256_mask_cmpneq_epi16_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 1)
+	  return _mm256_mask_cmpneq_epi8_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_mask_cmpneq_epi64_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return _mm_mask_cmpneq_epi32_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 2)
+	  return _mm_mask_cmpneq_epi16_mask(__k1, __xi, __yi);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 1)
+	  return _mm_mask_cmpneq_epi8_mask(__k1, __xi, __yi);
+	else
+	  __assert_unreachable<_Tp>();
+      } // }}}
+    else if constexpr (!__builtin_is_constant_evaluated() && sizeof(__x) == 8) // {{{
+      {
+	const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
+			    != __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
+	_MaskMember<_Tp> __r64;
+	__builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
+	return __r64;
+      } // }}}
+    else
+      return _Base::__not_equal_to(__x, __y);
+  }
+
+  // }}}
+  // __less {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+  __less(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    if constexpr (__is_avx512_abi<_Abi>()) // {{{
+      {
+	if (__builtin_is_constant_evaluated()
+	    || (__x._M_is_constprop() && __y._M_is_constprop()))
+	  return _MaskImpl::__to_bits(_SimdWrapper<_Tp, _Np>(
+	    __vector_bitcast<_Tp>(__x._M_data < __y._M_data)));
+
+	constexpr auto __k1 = _Abi::template __implicit_mask<_Tp>();
+	[[maybe_unused]] const auto __xi = __to_intrin(__x);
+	[[maybe_unused]] const auto __yi = __to_intrin(__y);
+	if constexpr (sizeof(__xi) == 64)
+	  {
+	    if constexpr (std::is_same_v<_Tp, float>)
+	      return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LT_OS);
+	    else if constexpr (std::is_same_v<_Tp, double>)
+	      return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LT_OS);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 1)
+	      return _mm512_mask_cmplt_epi8_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 2)
+	      return _mm512_mask_cmplt_epi16_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 4)
+	      return _mm512_mask_cmplt_epi32_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 8)
+	      return _mm512_mask_cmplt_epi64_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 1)
+	      return _mm512_mask_cmplt_epu8_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 2)
+	      return _mm512_mask_cmplt_epu16_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 4)
+	      return _mm512_mask_cmplt_epu32_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 8)
+	      return _mm512_mask_cmplt_epu64_mask(__k1, __xi, __yi);
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr (sizeof(__xi) == 32)
+	  {
+	    if constexpr (std::is_same_v<_Tp, float>)
+	      return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LT_OS);
+	    else if constexpr (std::is_same_v<_Tp, double>)
+	      return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LT_OS);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 1)
+	      return _mm256_mask_cmplt_epi8_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 2)
+	      return _mm256_mask_cmplt_epi16_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 4)
+	      return _mm256_mask_cmplt_epi32_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 8)
+	      return _mm256_mask_cmplt_epi64_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 1)
+	      return _mm256_mask_cmplt_epu8_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 2)
+	      return _mm256_mask_cmplt_epu16_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 4)
+	      return _mm256_mask_cmplt_epu32_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 8)
+	      return _mm256_mask_cmplt_epu64_mask(__k1, __xi, __yi);
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr (sizeof(__xi) == 16)
+	  {
+	    if constexpr (std::is_same_v<_Tp, float>)
+	      return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LT_OS);
+	    else if constexpr (std::is_same_v<_Tp, double>)
+	      return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LT_OS);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 1)
+	      return _mm_mask_cmplt_epi8_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 2)
+	      return _mm_mask_cmplt_epi16_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 4)
+	      return _mm_mask_cmplt_epi32_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 8)
+	      return _mm_mask_cmplt_epi64_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 1)
+	      return _mm_mask_cmplt_epu8_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 2)
+	      return _mm_mask_cmplt_epu16_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 4)
+	      return _mm_mask_cmplt_epu32_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 8)
+	      return _mm_mask_cmplt_epu64_mask(__k1, __xi, __yi);
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else
+	  __assert_unreachable<_Tp>();
+      } // }}}
+    else if constexpr (!__builtin_is_constant_evaluated() && sizeof(__x) == 8) // {{{
+      {
+	const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
+			    < __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
+	_MaskMember<_Tp> __r64;
+	__builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
+	return __r64;
+      } // }}}
+    else
+      return _Base::__less(__x, __y);
+  }
+
+  // }}}
+  // __less_equal {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+  __less_equal(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+    if constexpr (__is_avx512_abi<_Abi>()) // {{{
+      {
+	if (__builtin_is_constant_evaluated()
+	    || (__x._M_is_constprop() && __y._M_is_constprop()))
+	  return _MaskImpl::__to_bits(_SimdWrapper<_Tp, _Np>(
+	    __vector_bitcast<_Tp>(__x._M_data <= __y._M_data)));
+
+	constexpr auto __k1 = _Abi::template __implicit_mask<_Tp>();
+	[[maybe_unused]] const auto __xi = __to_intrin(__x);
+	[[maybe_unused]] const auto __yi = __to_intrin(__y);
+	if constexpr (sizeof(__xi) == 64)
+	  {
+	    if constexpr (std::is_same_v<_Tp, float>)
+	      return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LE_OS);
+	    else if constexpr (std::is_same_v<_Tp, double>)
+	      return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LE_OS);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 1)
+	      return _mm512_mask_cmple_epi8_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 2)
+	      return _mm512_mask_cmple_epi16_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 4)
+	      return _mm512_mask_cmple_epi32_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 8)
+	      return _mm512_mask_cmple_epi64_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 1)
+	      return _mm512_mask_cmple_epu8_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 2)
+	      return _mm512_mask_cmple_epu16_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 4)
+	      return _mm512_mask_cmple_epu32_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 8)
+	      return _mm512_mask_cmple_epu64_mask(__k1, __xi, __yi);
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr (sizeof(__xi) == 32)
+	  {
+	    if constexpr (std::is_same_v<_Tp, float>)
+	      return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LE_OS);
+	    else if constexpr (std::is_same_v<_Tp, double>)
+	      return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LE_OS);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 1)
+	      return _mm256_mask_cmple_epi8_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 2)
+	      return _mm256_mask_cmple_epi16_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 4)
+	      return _mm256_mask_cmple_epi32_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 8)
+	      return _mm256_mask_cmple_epi64_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 1)
+	      return _mm256_mask_cmple_epu8_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 2)
+	      return _mm256_mask_cmple_epu16_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 4)
+	      return _mm256_mask_cmple_epu32_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 8)
+	      return _mm256_mask_cmple_epu64_mask(__k1, __xi, __yi);
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else if constexpr (sizeof(__xi) == 16)
+	  {
+	    if constexpr (std::is_same_v<_Tp, float>)
+	      return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LE_OS);
+	    else if constexpr (std::is_same_v<_Tp, double>)
+	      return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LE_OS);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 1)
+	      return _mm_mask_cmple_epi8_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 2)
+	      return _mm_mask_cmple_epi16_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 4)
+	      return _mm_mask_cmple_epi32_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 8)
+	      return _mm_mask_cmple_epi64_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 1)
+	      return _mm_mask_cmple_epu8_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 2)
+	      return _mm_mask_cmple_epu16_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 4)
+	      return _mm_mask_cmple_epu32_mask(__k1, __xi, __yi);
+	    else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 8)
+	      return _mm_mask_cmple_epu64_mask(__k1, __xi, __yi);
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else
+	  __assert_unreachable<_Tp>();
+      } // }}}
+    else if constexpr (!__builtin_is_constant_evaluated() && sizeof(__x) == 8) // {{{
+      {
+	const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
+			    <= __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
+	_MaskMember<_Tp> __r64;
+	__builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
+	return __r64;
+      } // }}}
+    else
+      return _Base::__less_equal(__x, __y);
+  }
+
+  // }}}
+  // }}}
+  // negation {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+  __negate(_SimdWrapper<_Tp, _Np> __x) noexcept
+  {
+    if constexpr (__is_avx512_abi<_Abi>())
+      return __equal_to(__x, _SimdWrapper<_Tp, _Np>());
+    else
+      return _Base::__negate(__x);
+  }
+
+  // }}}
+  // math {{{
+  using _Base::__abs;
+  // __sqrt {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+  __sqrt(_SimdWrapper<_Tp, _Np> __x)
+  {
+    if constexpr (__is_sse_ps<_Tp, _Np>())
+      return __auto_bitcast(_mm_sqrt_ps(__to_intrin(__x)));
+    else if constexpr (__is_sse_pd<_Tp, _Np>())
+      return _mm_sqrt_pd(__x);
+    else if constexpr (__is_avx_ps<_Tp, _Np>())
+      return _mm256_sqrt_ps(__x);
+    else if constexpr (__is_avx_pd<_Tp, _Np>())
+      return _mm256_sqrt_pd(__x);
+    else if constexpr (__is_avx512_ps<_Tp, _Np>())
+      return _mm512_sqrt_ps(__x);
+    else if constexpr (__is_avx512_pd<_Tp, _Np>())
+      return _mm512_sqrt_pd(__x);
+    else
+      __assert_unreachable<_Tp>();
+  }
+
+  // }}}
+  // __ldexp {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+  __ldexp(_SimdWrapper<_Tp, _Np> __x, __fixed_size_storage_t<int, _Np> __exp)
+  {
+    if constexpr (__is_avx512_abi<_Abi>())
+      {
+	const auto __xi = __to_intrin(__x);
+	constexpr _SimdConverter<int, simd_abi::fixed_size<_Np>, _Tp, _Abi>
+	  __cvt;
+	const auto __expi = __to_intrin(__cvt(__exp));
+	constexpr auto __k1 = _Abi::template __implicit_mask<_Tp>();
+	if constexpr (sizeof(__xi) == 16)
+	  {
+	    if constexpr (sizeof(_Tp) == 8)
+	      return _mm_maskz_scalef_pd(__k1, __xi, __expi);
+	    else
+	      return _mm_maskz_scalef_ps(__k1, __xi, __expi);
+	  }
+	else if constexpr (sizeof(__xi) == 32)
+	  {
+	    if constexpr (sizeof(_Tp) == 8)
+	      return _mm256_maskz_scalef_pd(__k1, __xi, __expi);
+	    else
+	      return _mm256_maskz_scalef_ps(__k1, __xi, __expi);
+	  }
+	else
+	  {
+	    static_assert(sizeof(__xi) == 64);
+	    if constexpr (sizeof(_Tp) == 8)
+	      return _mm512_maskz_scalef_pd(__k1, __xi, __expi);
+	    else
+	      return _mm512_maskz_scalef_ps(__k1, __xi, __expi);
+	  }
+      }
+    else
+      return _Base::__ldexp(__x, __exp);
+  }
+
+  // }}}
+  // __trunc {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+  __trunc(_SimdWrapper<_Tp, _Np> __x)
+  {
+    if constexpr (__is_avx512_ps<_Tp, _Np>())
+      return _mm512_roundscale_ps(__x, 0x0b);
+    else if constexpr (__is_avx512_pd<_Tp, _Np>())
+      return _mm512_roundscale_pd(__x, 0x0b);
+    else if constexpr (__is_avx_ps<_Tp, _Np>())
+      return _mm256_round_ps(__x, 0x3);
+    else if constexpr (__is_avx_pd<_Tp, _Np>())
+      return _mm256_round_pd(__x, 0x3);
+    else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>())
+      return __auto_bitcast(_mm_round_ps(__to_intrin(__x), 0x3));
+    else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>())
+      return _mm_round_pd(__x, 0x3);
+    else if constexpr (__is_sse_ps<_Tp, _Np>())
+      {
+	auto __truncated = _mm_cvtepi32_ps(_mm_cvttps_epi32(__to_intrin(__x)));
+	const auto __may_have_fraction
+	  = __vector_bitcast<int>(__vector_bitcast<_UInt>(__to_intrin(__x))
+				  & 0x7f800000u)
+	    < 0x4b000000; // true if |x| < 2^23; from 0x4b000000 on, no
+			  // mantissa bits signify fractional values
+			  // (0x3f8 + 23*8 = 0x4b0)
+	return __may_have_fraction ? __truncated : __to_intrin(__x);
+      }
+    else
+      return _Base::__trunc(__x);
+  }
+
+  // }}}
+  // __round {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+  __round(_SimdWrapper<_Tp, _Np> __x)
+  {
+    using _V = __vector_type_t<_Tp, _Np>;
+    _V __truncated;
+    if constexpr (__is_avx512_ps<_Tp, _Np>())
+      __truncated = _mm512_roundscale_ps(__x._M_data, 0x0b);
+    else if constexpr (__is_avx512_pd<_Tp, _Np>())
+      __truncated = _mm512_roundscale_pd(__x._M_data, 0x0b);
+    else if constexpr (__is_avx_ps<_Tp, _Np>())
+      __truncated
+	= _mm256_round_ps(__x._M_data, _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC);
+    else if constexpr (__is_avx_pd<_Tp, _Np>())
+      __truncated
+	= _mm256_round_pd(__x._M_data, _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC);
+    else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>())
+      __truncated = __auto_bitcast(
+	_mm_round_ps(__to_intrin(__x), _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC));
+    else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>())
+      __truncated
+	= _mm_round_pd(__x._M_data, _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC);
+    else if constexpr (__is_sse_ps<_Tp, _Np>())
+      __truncated
+	= __auto_bitcast(_mm_cvtepi32_ps(_mm_cvttps_epi32(__to_intrin(__x))));
+    else
+      return _Base::__round(__x);
+
+    // x < 0 => truncated <= 0 && truncated >= x => x - truncated <= 0
+    // x > 0 => truncated >= 0 && truncated <= x => x - truncated >= 0
+
+    const _V __rounded
+      = __truncated
+	+ (__and(_S_absmask<_V>, __x._M_data - __truncated) >= _Tp(.5)
+	     ? __or(__and(_S_signmask<_V>, __x._M_data), _V() + 1)
+	     : _V());
+    if constexpr (__have_sse4_1)
+      return __rounded;
+    else
+      return __and(_S_absmask<_V>, __x._M_data) < 0x1p23f ? __rounded
+							  : __x._M_data;
+  }
+
+  // }}}
+  // __nearbyint {{{
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __nearbyint(_Tp __x) noexcept
+  {
+    if constexpr (_TVT::template __is<float, 16>)
+      return _mm512_roundscale_ps(__x, 0x0c);
+    else if constexpr (_TVT::template __is<double, 8>)
+      return _mm512_roundscale_pd(__x, 0x0c);
+    else if constexpr (_TVT::template __is<float, 8>)
+      return _mm256_round_ps(__x, _MM_FROUND_CUR_DIRECTION | _MM_FROUND_NO_EXC);
+    else if constexpr (_TVT::template __is<double, 4>)
+      return _mm256_round_pd(__x, _MM_FROUND_CUR_DIRECTION | _MM_FROUND_NO_EXC);
+    else if constexpr (__have_sse4_1 && _TVT::template __is<float, 4>)
+      return _mm_round_ps(__x, _MM_FROUND_CUR_DIRECTION | _MM_FROUND_NO_EXC);
+    else if constexpr (__have_sse4_1 && _TVT::template __is<double, 2>)
+      return _mm_round_pd(__x, _MM_FROUND_CUR_DIRECTION | _MM_FROUND_NO_EXC);
+    else
+      return _Base::__nearbyint(__x);
+  }
+
+  // }}}
+  // __rint {{{
+  template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+  _GLIBCXX_SIMD_INTRINSIC static _Tp __rint(_Tp __x) noexcept
+  {
+    if constexpr (_TVT::template __is<float, 16>)
+      return _mm512_roundscale_ps(__x, 0x04);
+    else if constexpr (_TVT::template __is<double, 8>)
+      return _mm512_roundscale_pd(__x, 0x04);
+    else if constexpr (_TVT::template __is<float, 8>)
+      return _mm256_round_ps(__x, _MM_FROUND_CUR_DIRECTION);
+    else if constexpr (_TVT::template __is<double, 4>)
+      return _mm256_round_pd(__x, _MM_FROUND_CUR_DIRECTION);
+    else if constexpr (__have_sse4_1 && _TVT::template __is<float, 4>)
+      return _mm_round_ps(__x, _MM_FROUND_CUR_DIRECTION);
+    else if constexpr (__have_sse4_1 && _TVT::template __is<double, 2>)
+      return _mm_round_pd(__x, _MM_FROUND_CUR_DIRECTION);
+    else
+      return _Base::__rint(__x);
+  }
+
+  // }}}
+  // __floor {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+  __floor(_SimdWrapper<_Tp, _Np> __x)
+  {
+    if constexpr (__is_avx512_ps<_Tp, _Np>())
+      return _mm512_roundscale_ps(__x, 0x09);
+    else if constexpr (__is_avx512_pd<_Tp, _Np>())
+      return _mm512_roundscale_pd(__x, 0x09);
+    else if constexpr (__is_avx_ps<_Tp, _Np>())
+      return _mm256_round_ps(__x, _MM_FROUND_TO_NEG_INF);
+    else if constexpr (__is_avx_pd<_Tp, _Np>())
+      return _mm256_round_pd(__x, _MM_FROUND_TO_NEG_INF);
+    else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>())
+      return __auto_bitcast(_mm_floor_ps(__to_intrin(__x)));
+    else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>())
+      return _mm_floor_pd(__x);
+    else
+      return _Base::__floor(__x);
+  }
+
+  // }}}
+  // __ceil {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+  __ceil(_SimdWrapper<_Tp, _Np> __x)
+  {
+    if constexpr (__is_avx512_ps<_Tp, _Np>())
+      return _mm512_roundscale_ps(__x, 0x0a);
+    else if constexpr (__is_avx512_pd<_Tp, _Np>())
+      return _mm512_roundscale_pd(__x, 0x0a);
+    else if constexpr (__is_avx_ps<_Tp, _Np>())
+      return _mm256_round_ps(__x, _MM_FROUND_TO_POS_INF);
+    else if constexpr (__is_avx_pd<_Tp, _Np>())
+      return _mm256_round_pd(__x, _MM_FROUND_TO_POS_INF);
+    else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>())
+      return __auto_bitcast(_mm_ceil_ps(__to_intrin(__x)));
+    else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>())
+      return _mm_ceil_pd(__x);
+    else
+      return _Base::__ceil(__x);
+  }
+
+  // }}}
+  // __signbit {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+  __signbit(_SimdWrapper<_Tp, _Np> __x)
+  {
+    if constexpr (__is_avx512_abi<_Abi>() && __have_avx512dq)
+      {
+	if constexpr (sizeof(__x) == 64 && sizeof(_Tp) == 4)
+	  return _mm512_movepi32_mask(__intrin_bitcast<__m512i>(__x._M_data));
+	else if constexpr (sizeof(__x) == 64 && sizeof(_Tp) == 8)
+	  return _mm512_movepi64_mask(__intrin_bitcast<__m512i>(__x._M_data));
+	else if constexpr (sizeof(__x) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_movepi32_mask(__intrin_bitcast<__m256i>(__x._M_data));
+	else if constexpr (sizeof(__x) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_movepi64_mask(__intrin_bitcast<__m256i>(__x._M_data));
+	else if constexpr (sizeof(__x) <= 16 && sizeof(_Tp) == 4)
+	  return _mm_movepi32_mask(__intrin_bitcast<__m128i>(__x._M_data));
+	else if constexpr (sizeof(__x) <= 16 && sizeof(_Tp) == 8)
+	  return _mm_movepi64_mask(__intrin_bitcast<__m128i>(__x._M_data));
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__is_avx512_abi<_Abi>())
+      {
+	const auto __xi = __to_intrin(__x);
+	[[maybe_unused]] constexpr auto __k1
+	  = _Abi::template __implicit_mask<_Tp>();
+	if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return _mm_movemask_ps(__xi);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_movemask_pd(__xi);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_movemask_ps(__xi);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_movemask_pd(__xi);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+	  return _mm512_mask_cmplt_epi32_mask(__k1,
+					      __intrin_bitcast<__m512i>(__xi),
+					      __m512i());
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+	  return _mm512_mask_cmplt_epi64_mask(__k1,
+					      __intrin_bitcast<__m512i>(__xi),
+					      __m512i());
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      return _Base::__signbit(__x);
+    /*{
+      using _I = __int_for_sizeof_t<_Tp>;
+      if constexpr (sizeof(__x) == 64)
+	return __less(__vector_bitcast<_I>(__x), _I());
+      else
+	{
+	  const auto __xx = __vector_bitcast<_I>(__x._M_data);
+	  [[maybe_unused]] constexpr _I __signmask =
+	    std::numeric_limits<_I>::min();
+	  if constexpr ((sizeof(_Tp) == 4 &&
+			 (__have_avx2 || sizeof(__x) == 16)) ||
+			__have_avx512vl)
+	    {
+	      return __vector_bitcast<_Tp>(__xx >>
+					   std::numeric_limits<_I>::digits);
+	    }
+	  else if constexpr ((__have_avx2 ||
+			      (__have_ssse3 && sizeof(__x) == 16)))
+	    {
+	      return __vector_bitcast<_Tp>((__xx & __signmask) ==
+					   __signmask);
+	    }
+	  else
+	    { // SSE2/3 or AVX (w/o AVX2)
+	      constexpr auto __one = __vector_broadcast<_Np, _Tp>(1);
+	      return __vector_bitcast<_Tp>(
+		__vector_bitcast<_Tp>(
+		  (__xx & __signmask) |
+		  __vector_bitcast<_I>(__one)) // -1 or 1
+		!= __one);
+	    }
+	}
+    }*/
+  }
+
+  // }}}
+  // __isnonzerovalue_mask
+  // (isnormal | issubnormal == !isinf & !isnan & !iszero) {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static auto __isnonzerovalue_mask(_Tp __x)
+  {
+    using _Traits = _VectorTraits<_Tp>;
+    if constexpr (__have_avx512dq_vl)
+      {
+	if constexpr (_Traits::template __is<
+			float, 2> || _Traits::template __is<float, 4>)
+	  return _knot_mask8(_mm_fpclass_ps_mask(__to_intrin(__x), 0x9f));
+	else if constexpr (_Traits::template __is<float, 8>)
+	  return _knot_mask8(_mm256_fpclass_ps_mask(__x, 0x9f));
+	else if constexpr (_Traits::template __is<float, 16>)
+	  return _knot_mask16(_mm512_fpclass_ps_mask(__x, 0x9f));
+	else if constexpr (_Traits::template __is<double, 2>)
+	  return _knot_mask8(_mm_fpclass_pd_mask(__x, 0x9f));
+	else if constexpr (_Traits::template __is<double, 4>)
+	  return _knot_mask8(_mm256_fpclass_pd_mask(__x, 0x9f));
+	else if constexpr (_Traits::template __is<double, 8>)
+	  return _knot_mask8(_mm512_fpclass_pd_mask(__x, 0x9f));
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      {
+	using _Up = typename _Traits::value_type;
+	constexpr size_t _Np = _Traits::_S_width;
+	const auto __a
+	  = __x * std::numeric_limits<_Up>::infinity(); // NaN if __x == 0
+	const auto __b = __x * _Up();                   // NaN if __x == inf
+	if constexpr (__have_avx512vl && __is_sse_ps<_Up, _Np>())
+	  return _mm_cmp_ps_mask(__to_intrin(__a), __to_intrin(__b),
+				 _CMP_ORD_Q);
+	else if constexpr (__have_avx512f && __is_sse_ps<_Up, _Np>())
+	  return __mmask8(0xf
+			  & _mm512_cmp_ps_mask(__auto_bitcast(__a),
+					       __auto_bitcast(__b),
+					       _CMP_ORD_Q));
+	else if constexpr (__have_avx512vl && __is_sse_pd<_Up, _Np>())
+	  return _mm_cmp_pd_mask(__a, __b, _CMP_ORD_Q);
+	else if constexpr (__have_avx512f && __is_sse_pd<_Up, _Np>())
+	  return __mmask8(0x3
+			  & _mm512_cmp_pd_mask(__auto_bitcast(__a),
+					       __auto_bitcast(__b),
+					       _CMP_ORD_Q));
+	else if constexpr (__have_avx512vl && __is_avx_ps<_Up, _Np>())
+	  return _mm256_cmp_ps_mask(__a, __b, _CMP_ORD_Q);
+	else if constexpr (__have_avx512f && __is_avx_ps<_Up, _Np>())
+	  return __mmask8(_mm512_cmp_ps_mask(__auto_bitcast(__a),
+					     __auto_bitcast(__b), _CMP_ORD_Q));
+	else if constexpr (__have_avx512vl && __is_avx_pd<_Up, _Np>())
+	  return _mm256_cmp_pd_mask(__a, __b, _CMP_ORD_Q);
+	else if constexpr (__have_avx512f && __is_avx_pd<_Up, _Np>())
+	  return __mmask8(0xf
+			  & _mm512_cmp_pd_mask(__auto_bitcast(__a),
+					       __auto_bitcast(__b),
+					       _CMP_ORD_Q));
+	else if constexpr (__is_avx512_ps<_Up, _Np>())
+	  return _mm512_cmp_ps_mask(__a, __b, _CMP_ORD_Q);
+	else if constexpr (__is_avx512_pd<_Up, _Np>())
+	  return _mm512_cmp_pd_mask(__a, __b, _CMP_ORD_Q);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+  }
+
+  // }}}
+  // __isfinite {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+  __isfinite(_SimdWrapper<_Tp, _Np> __x)
+  {
+    static_assert(is_floating_point_v<_Tp>);
+#if __FINITE_MATH_ONLY__
+    [](auto&&){}(__x);
+    return __equal_to(_SimdWrapper<_Tp, _Np>(), _SimdWrapper<_Tp, _Np>());
+#else
+    if constexpr (__is_avx512_abi<_Abi>() && __have_avx512dq)
+      {
+	const auto __xi = __to_intrin(__x);
+	constexpr auto __k1 = _Abi::template __implicit_mask<_Tp>();
+	if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+	  return __k1 ^ _mm512_mask_fpclass_ps_mask(__k1, __xi, 0x99);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+	  return __k1 ^ _mm512_mask_fpclass_pd_mask(__k1, __xi, 0x99);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return __k1 ^ _mm256_mask_fpclass_ps_mask(__k1, __xi, 0x99);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return __k1 ^ _mm256_mask_fpclass_pd_mask(__k1, __xi, 0x99);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return __k1 ^ _mm_mask_fpclass_ps_mask(__k1, __xi, 0x99);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return __k1 ^ _mm_mask_fpclass_pd_mask(__k1, __xi, 0x99);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__is_avx512_abi<_Abi>())
+      {
+	// if all exponent bits are set, __x is either inf or NaN
+	using _I = __int_for_sizeof_t<_Tp>;
+	const auto __inf = __vector_bitcast<_I>(
+	  __vector_broadcast<_Np>(std::numeric_limits<_Tp>::infinity()));
+	return __less<_I, _Np>(__vector_bitcast<_I>(__x) & __inf, __inf);
+      }
+    else
+      return _Base::__isfinite(__x);
+#endif
+  }
+
+  // }}}
+  // __isinf {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+  __isinf(_SimdWrapper<_Tp, _Np> __x)
+  {
+#if __FINITE_MATH_ONLY__
+    [](auto&&){}(__x);
+    return {}; // false
+#else
+    if constexpr (__is_avx512_abi<_Abi>() && __have_avx512dq)
+      {
+	const auto __xi = __to_intrin(__x);
+	if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+	  return _mm512_fpclass_ps_mask(__xi, 0x18);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+	  return _mm512_fpclass_pd_mask(__xi, 0x18);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_fpclass_ps_mask(__xi, 0x18);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_fpclass_pd_mask(__xi, 0x18);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return _mm_fpclass_ps_mask(__xi, 0x18);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_fpclass_pd_mask(__xi, 0x18);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__have_avx512dq_vl)
+      {
+	if constexpr (__is_sse_pd<_Tp, _Np>())
+	  return __vector_bitcast<double>(
+	    _mm_movm_epi64(_mm_fpclass_pd_mask(__x, 0x18)));
+	else if constexpr (__is_avx_pd<_Tp, _Np>())
+	  return __vector_bitcast<double>(
+	    _mm256_movm_epi64(_mm256_fpclass_pd_mask(__x, 0x18)));
+	else if constexpr (__is_sse_ps<_Tp, _Np>())
+	  return __auto_bitcast(
+	    _mm_movm_epi32(_mm_fpclass_ps_mask(__to_intrin(__x), 0x18)));
+	else if constexpr (__is_avx_ps<_Tp, _Np>())
+	  return __vector_bitcast<float>(
+	    _mm256_movm_epi32(_mm256_fpclass_ps_mask(__x, 0x18)));
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      return _Base::__isinf(__x);
+#endif
+  }
+
+  // }}}
+  // __isnormal {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+  __isnormal(_SimdWrapper<_Tp, _Np> __x)
+  {
+#if __FINITE_MATH_ONLY__
+    [[maybe_unused]] constexpr int __mode = 0x26;
+#else
+    [[maybe_unused]] constexpr int __mode = 0xbf;
+#endif
+    if constexpr (__is_avx512_abi<_Abi>() && __have_avx512dq)
+      {
+	const auto __xi = __to_intrin(__x);
+	const auto __k1 = _Abi::template __implicit_mask<_Tp>();
+	if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+	  return __k1 ^ _mm512_mask_fpclass_ps_mask(__k1, __xi, __mode);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+	  return __k1 ^ _mm512_mask_fpclass_pd_mask(__k1, __xi, __mode);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return __k1 ^ _mm256_mask_fpclass_ps_mask(__k1, __xi, __mode);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return __k1 ^ _mm256_mask_fpclass_pd_mask(__k1, __xi, __mode);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return __k1 ^ _mm_mask_fpclass_ps_mask(__k1, __xi, __mode);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return __k1 ^ _mm_mask_fpclass_pd_mask(__k1, __xi, __mode);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__have_avx512dq)
+      {
+	if constexpr (__have_avx512vl && __is_sse_ps<_Tp, _Np>())
+	  return __auto_bitcast(_mm_movm_epi32(
+	    _knot_mask8(_mm_fpclass_ps_mask(__to_intrin(__x), __mode))));
+	else if constexpr (__have_avx512vl && __is_avx_ps<_Tp, _Np>())
+	  return __vector_bitcast<float>(_mm256_movm_epi32(
+	    _knot_mask8(_mm256_fpclass_ps_mask(__x, __mode))));
+	else if constexpr (__is_avx512_ps<_Tp, _Np>())
+	  return _knot_mask16(_mm512_fpclass_ps_mask(__x, __mode));
+	else if constexpr (__have_avx512vl && __is_sse_pd<_Tp, _Np>())
+	  return __vector_bitcast<double>(
+	    _mm_movm_epi64(_knot_mask8(_mm_fpclass_pd_mask(__x, __mode))));
+	else if constexpr (__have_avx512vl && __is_avx_pd<_Tp, _Np>())
+	  return __vector_bitcast<double>(_mm256_movm_epi64(
+	    _knot_mask8(_mm256_fpclass_pd_mask(__x, __mode))));
+	else if constexpr (__is_avx512_pd<_Tp, _Np>())
+	  return _knot_mask8(_mm512_fpclass_pd_mask(__x, __mode));
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__is_avx512_abi<_Abi>())
+      {
+	using _I = __int_for_sizeof_t<_Tp>;
+	const auto __absn = __vector_bitcast<_I>(__abs(__x));
+	const auto __minn = __vector_bitcast<_I>(
+	  __vector_broadcast<_Np>(std::numeric_limits<_Tp>::min()));
+#if __FINITE_MATH_ONLY__
+	return __less_equal<_I, _Np>(__minn, __absn);
+#else
+	const auto __infn = __vector_bitcast<_I>(
+	  __vector_broadcast<_Np>(std::numeric_limits<_Tp>::infinity()));
+	return __and(__less_equal<_I, _Np>(__minn, __absn),
+		     __less<_I, _Np>(__absn, __infn));
+#endif
+      }
+    else
+      return _Base::__isnormal(__x);
+  }
+
+  // }}}
+  // __isnan {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+  __isnan(_SimdWrapper<_Tp, _Np> __x)
+  {
+    return __isunordered(__x, __x);
+  }
+
+  // }}}
+  // __isunordered {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+  __isunordered(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+  {
+#if __FINITE_MATH_ONLY__
+    [](auto&&...){}(__x, __y);
+    return {}; // false
+#else
+    const auto __xi = __to_intrin(__x);
+    const auto __yi = __to_intrin(__y);
+    if constexpr (__is_avx512_abi<_Abi>())
+      {
+	constexpr auto __k1 = _Abi::template __implicit_mask<_Tp>();
+	if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+	  return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_UNORD_Q);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+	  return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_UNORD_Q);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_UNORD_Q);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_UNORD_Q);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_UNORD_Q);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_UNORD_Q);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+      return _mm256_cmp_ps(__xi, __yi, _CMP_UNORD_Q);
+    else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+      return _mm256_cmp_pd(__xi, __yi, _CMP_UNORD_Q);
+    else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+      return __auto_bitcast(_mm_cmpunord_ps(__xi, __yi));
+    else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+      return __auto_bitcast(_mm_cmpunord_pd(__xi, __yi));
+    else
+      __assert_unreachable<_Tp>();
+#endif
+  }
+
+  // }}}
+  // __isgreater {{{
+  template <typename _Tp, size_t _Np>
+  static constexpr _MaskMember<_Tp> __isgreater(_SimdWrapper<_Tp, _Np> __x,
+						_SimdWrapper<_Tp, _Np> __y)
+  {
+    const auto __xi = __to_intrin(__x);
+    const auto __yi = __to_intrin(__y);
+    if constexpr (__is_avx512_abi<_Abi>())
+      {
+	const auto __k1 = _Abi::template __implicit_mask<_Tp>();
+	if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+	  return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_GT_OQ);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+	  return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_GT_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_GT_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_GT_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_GT_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_GT_OQ);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__have_avx)
+      {
+	if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_cmp_ps(__xi, __yi, _CMP_GT_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_cmp_pd(__xi, __yi, _CMP_GT_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return __auto_bitcast(_mm_cmp_ps(__xi, __yi, _CMP_GT_OQ));
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_cmp_pd(__xi, __yi, _CMP_GT_OQ);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+      {
+	const auto __xn = __vector_bitcast<int>(__xi);
+	const auto __yn = __vector_bitcast<int>(__yi);
+	const auto __xp = __xn < 0 ? -(__xn & 0x7fff'ffff) : __xn;
+	const auto __yp = __yn < 0 ? -(__yn & 0x7fff'ffff) : __yn;
+	return __auto_bitcast(__and(_mm_cmpord_ps(__xi, __yi),
+				    reinterpret_cast<__m128>(__xp > __yp)));
+      }
+    else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+      return __auto_bitcast(__vector_type_t<__int_with_sizeof_t<8>, 2>{
+	-_mm_ucomigt_sd(__xi, __yi),
+	-_mm_ucomigt_sd(_mm_unpackhi_pd(__xi, __xi),
+			_mm_unpackhi_pd(__yi, __yi))});
+    else
+      return _Base::__isgreater(__x, __y);
+  }
+
+  // }}}
+  // __isgreaterequal {{{
+  template <typename _Tp, size_t _Np>
+  static constexpr _MaskMember<_Tp> __isgreaterequal(_SimdWrapper<_Tp, _Np> __x,
+						     _SimdWrapper<_Tp, _Np> __y)
+  {
+    const auto __xi = __to_intrin(__x);
+    const auto __yi = __to_intrin(__y);
+    if constexpr (__is_avx512_abi<_Abi>())
+      {
+	const auto __k1 = _Abi::template __implicit_mask<_Tp>();
+	if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+	  return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_GE_OQ);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+	  return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_GE_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_GE_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_GE_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_GE_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_GE_OQ);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__have_avx)
+      {
+	if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_cmp_ps(__xi, __yi, _CMP_GE_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_cmp_pd(__xi, __yi, _CMP_GE_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return __auto_bitcast(_mm_cmp_ps(__xi, __yi, _CMP_GE_OQ));
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_cmp_pd(__xi, __yi, _CMP_GE_OQ);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+      {
+	const auto __xn = __vector_bitcast<int>(__xi);
+	const auto __yn = __vector_bitcast<int>(__yi);
+	const auto __xp = __xn < 0 ? -(__xn & 0x7fff'ffff) : __xn;
+	const auto __yp = __yn < 0 ? -(__yn & 0x7fff'ffff) : __yn;
+	return __auto_bitcast(__and(_mm_cmpord_ps(__xi, __yi),
+				    reinterpret_cast<__m128>(__xp >= __yp)));
+      }
+    else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+      return __auto_bitcast(__vector_type_t<__int_with_sizeof_t<8>, 2>{
+	-_mm_ucomige_sd(__xi, __yi),
+	-_mm_ucomige_sd(_mm_unpackhi_pd(__xi, __xi),
+			_mm_unpackhi_pd(__yi, __yi))});
+    else
+      return _Base::__isgreaterequal(__x, __y);
+  }
+
+  // }}}
+  // __isless {{{
+  template <typename _Tp, size_t _Np>
+  static constexpr _MaskMember<_Tp> __isless(_SimdWrapper<_Tp, _Np> __x,
+					     _SimdWrapper<_Tp, _Np> __y)
+  {
+    const auto __xi = __to_intrin(__x);
+    const auto __yi = __to_intrin(__y);
+    if constexpr (__is_avx512_abi<_Abi>())
+      {
+	const auto __k1 = _Abi::template __implicit_mask<_Tp>();
+	if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+	  return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LT_OQ);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+	  return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LT_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LT_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LT_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LT_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LT_OQ);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__have_avx)
+      {
+	if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_cmp_ps(__xi, __yi, _CMP_LT_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_cmp_pd(__xi, __yi, _CMP_LT_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return __auto_bitcast(_mm_cmp_ps(__xi, __yi, _CMP_LT_OQ));
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_cmp_pd(__xi, __yi, _CMP_LT_OQ);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+      {
+	const auto __xn = __vector_bitcast<int>(__xi);
+	const auto __yn = __vector_bitcast<int>(__yi);
+	const auto __xp = __xn < 0 ? -(__xn & 0x7fff'ffff) : __xn;
+	const auto __yp = __yn < 0 ? -(__yn & 0x7fff'ffff) : __yn;
+	return __auto_bitcast(__and(_mm_cmpord_ps(__xi, __yi),
+				    reinterpret_cast<__m128>(__xp < __yp)));
+      }
+    else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+      return __auto_bitcast(__vector_type_t<__int_with_sizeof_t<8>, 2>{
+	-_mm_ucomigt_sd(__yi, __xi),
+	-_mm_ucomigt_sd(_mm_unpackhi_pd(__yi, __yi),
+			_mm_unpackhi_pd(__xi, __xi))});
+    else
+      return _Base::__isless(__x, __y);
+  }
+
+  // }}}
+  // __islessequal {{{
+  template <typename _Tp, size_t _Np>
+  static constexpr _MaskMember<_Tp> __islessequal(_SimdWrapper<_Tp, _Np> __x,
+						  _SimdWrapper<_Tp, _Np> __y)
+  {
+    const auto __xi = __to_intrin(__x);
+    const auto __yi = __to_intrin(__y);
+    if constexpr (__is_avx512_abi<_Abi>())
+      {
+	const auto __k1 = _Abi::template __implicit_mask<_Tp>();
+	if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+	  return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LE_OQ);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+	  return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LE_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LE_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LE_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LE_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LE_OQ);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__have_avx)
+      {
+	if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_cmp_ps(__xi, __yi, _CMP_LE_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_cmp_pd(__xi, __yi, _CMP_LE_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return __auto_bitcast(_mm_cmp_ps(__xi, __yi, _CMP_LE_OQ));
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_cmp_pd(__xi, __yi, _CMP_LE_OQ);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+      {
+	const auto __xn = __vector_bitcast<int>(__xi);
+	const auto __yn = __vector_bitcast<int>(__yi);
+	const auto __xp = __xn < 0 ? -(__xn & 0x7fff'ffff) : __xn;
+	const auto __yp = __yn < 0 ? -(__yn & 0x7fff'ffff) : __yn;
+	return __auto_bitcast(__and(_mm_cmpord_ps(__xi, __yi),
+				    reinterpret_cast<__m128>(__xp <= __yp)));
+      }
+    else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+      return __auto_bitcast(__vector_type_t<__int_with_sizeof_t<8>, 2>{
+	-_mm_ucomige_sd(__yi, __xi),
+	-_mm_ucomige_sd(_mm_unpackhi_pd(__yi, __yi),
+			_mm_unpackhi_pd(__xi, __xi))});
+    else
+      return _Base::__islessequal(__x, __y);
+  }
+
+  // }}}
+  // __islessgreater {{{
+  template <typename _Tp, size_t _Np>
+  static constexpr _MaskMember<_Tp> __islessgreater(_SimdWrapper<_Tp, _Np> __x,
+						    _SimdWrapper<_Tp, _Np> __y)
+  {
+    const auto __xi = __to_intrin(__x);
+    const auto __yi = __to_intrin(__y);
+    if constexpr (__is_avx512_abi<_Abi>())
+      {
+	const auto __k1 = _Abi::template __implicit_mask<_Tp>();
+	if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+	  return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_NEQ_OQ);
+	else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+	  return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_NEQ_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_NEQ_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_NEQ_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_NEQ_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_NEQ_OQ);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__have_avx)
+      {
+	if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+	  return _mm256_cmp_ps(__xi, __yi, _CMP_NEQ_OQ);
+	else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+	  return _mm256_cmp_pd(__xi, __yi, _CMP_NEQ_OQ);
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+	  return __auto_bitcast(_mm_cmp_ps(__xi, __yi, _CMP_NEQ_OQ));
+	else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+	  return _mm_cmp_pd(__xi, __yi, _CMP_NEQ_OQ);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+      return __auto_bitcast(
+	__and(_mm_cmpord_ps(__xi, __yi), _mm_cmpneq_ps(__xi, __yi)));
+    else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+      return __and(_mm_cmpord_pd(__xi, __yi), _mm_cmpneq_pd(__xi, __yi));
+    else
+      __assert_unreachable<_Tp>();
+  }
+
+  // }}}
+  // }}}
+};
+
+// }}}
+// _MaskImplX86Mixin {{{
+struct _MaskImplX86Mixin
+{
+  template <typename _Tp> using _TypeTag = _Tp*;
+  using _Base = _MaskImplBuiltinMixin;
+
+  // __to_maskvector(bool) {{{
+  template <typename _Up, size_t _ToN = 1, typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr enable_if_t<is_same_v<_Tp, bool>,
+						       _SimdWrapper<_Up, _ToN>>
+  __to_maskvector(_Tp __x)
+  {
+    using _I = __int_for_sizeof_t<_Up>;
+    return __vector_bitcast<_Up>(__x ? __vector_type_t<_I, _ToN>{~_I()}
+				     : __vector_type_t<_I, _ToN>{});
+  }
+
+  // }}}
+  // __to_maskvector(_SanitizedBitMask) {{{
+  template <typename _Up, size_t _UpN = 0, size_t _Np,
+	    size_t _ToN = _UpN == 0 ? _Np : _UpN>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Up, _ToN>
+  __to_maskvector(_SanitizedBitMask<_Np> __x)
+  {
+    using _UV = __vector_type_t<_Up, _ToN>;
+    using _UI = __intrinsic_type_t<_Up, _ToN>;
+    [[maybe_unused]] const auto __k = __x._M_to_bits();
+    if constexpr (_Np == 1)
+      return __to_maskvector<_Up, _ToN>(__k);
+    else if (__x._M_is_constprop() || __builtin_is_constant_evaluated())
+      {
+	using _Ip = __int_for_sizeof_t<_Up>;
+	return __vector_bitcast<_Up>(
+	  __generate_from_n_evaluations<std::min(_ToN, _Np),
+					__vector_type_t<_Ip, _ToN>>(
+	    [&](auto __i) -> _Ip { return -__x[__i.value]; }));
+      }
+    else if constexpr (sizeof(_Up) == 1)
+      {
+	if constexpr (sizeof(_UI) == 16)
+	  {
+	    if constexpr (__have_avx512bw_vl)
+	      return __intrin_bitcast<_UV>(_mm_movm_epi8(__k));
+	    else if constexpr (__have_avx512bw)
+	      return __intrin_bitcast<_UV>(__lo128(_mm512_movm_epi8(__k)));
+	    else if constexpr (__have_avx512f)
+	      {
+		auto __as32bits = _mm512_maskz_mov_epi32(__k, ~__m512i());
+		auto __as16bits = __xzyw(
+		  _mm256_packs_epi32(__lo256(__as32bits), __hi256(__as32bits)));
+		return __intrin_bitcast<_UV>(
+		  _mm_packs_epi16(__lo128(__as16bits), __hi128(__as16bits)));
+	      }
+	    else if constexpr (__have_ssse3)
+	      {
+		const auto __bitmask = __to_intrin(
+		  __make_vector<_UChar>(1, 2, 4, 8, 16, 32, 64, 128, 1, 2, 4, 8,
+					16, 32, 64, 128));
+		return __intrin_bitcast<_UV>(
+		  __vector_bitcast<_Up>(
+		    _mm_shuffle_epi8(__to_intrin(
+				       __vector_type_t<_ULLong, 2>{__k}),
+				     _mm_setr_epi8(0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
+						   1, 1, 1, 1, 1, 1))
+		    & __bitmask)
+		  != 0);
+	      }
+	    // else fall through
+	  }
+	else if constexpr (sizeof(_UI) == 32)
+	  {
+	    if constexpr (__have_avx512bw_vl)
+	      return __vector_bitcast<_Up>(_mm256_movm_epi8(__k));
+	    else if constexpr (__have_avx512bw)
+	      return __vector_bitcast<_Up>(__lo256(_mm512_movm_epi8(__k)));
+	    else if constexpr (__have_avx512f)
+	      {
+		auto __as16bits = // 0 16 1 17 ... 15 31
+		  _mm512_srli_epi32(_mm512_maskz_mov_epi32(__k, ~__m512i()), 16)
+		  | _mm512_slli_epi32(_mm512_maskz_mov_epi32(__k >> 16,
+							     ~__m512i()),
+				      16);
+		auto __0_16_1_17 = __xzyw(_mm256_packs_epi16(
+		  __lo256(__as16bits),
+		  __hi256(__as16bits)) // 0 16 1 17 2 18 3 19 8 24 9 25 ...
+		);
+		// deinterleave:
+		return __vector_bitcast<_Up>(__xzyw(_mm256_shuffle_epi8(
+		  __0_16_1_17, // 0 16 1 17 2 ...
+		  _mm256_setr_epi8(0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11,
+				   13, 15, 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5,
+				   7, 9, 11, 13,
+				   15)))); // 0-7 16-23 8-15 24-31 -> xzyw
+					   // 0-3  8-11 16-19 24-27
+					   // 4-7 12-15 20-23 28-31
+	      }
+	    else if constexpr (__have_avx2)
+	      {
+		const auto __bitmask = _mm256_broadcastsi128_si256(__to_intrin(
+		  __make_vector<_UChar>(1, 2, 4, 8, 16, 32, 64, 128, 1, 2, 4, 8,
+					16, 32, 64, 128)));
+		return __vector_bitcast<_Up>(
+		  __vector_bitcast<_Up>(
+		    _mm256_shuffle_epi8(
+		      _mm256_broadcastsi128_si256(
+			__to_intrin(__vector_type_t<_ULLong, 2>{__k})),
+		      _mm256_setr_epi8(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1,
+				       1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3,
+				       3, 3, 3, 3))
+		    & __bitmask)
+		  != 0);
+	      }
+	    // else fall through
+	  }
+	else if constexpr (sizeof(_UI) == 64)
+	  return __vector_bitcast<_Up>(_mm512_movm_epi8(__k));
+	if constexpr (std::min(_ToN, _Np) <= 4)
+	  {
+	    if constexpr (_Np > 7) // avoid overflow
+	      __x &= _SanitizedBitMask<_Np>(0x0f);
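+	    // The multiplication adds copies of the mask shifted left by 0,
+	    // 7, 14, and 21 bits, moving mask bit i to bit 8 * i. The AND
+	    // keeps one bit per byte; multiplying by 0xff fills each byte.
+	    const _UInt __char_mask
+	      = ((_UInt(__x.to_ulong()) * 0x00204081U) & 0x01010101ULL) * 0xff;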
+	    __vector_type_t<_Up, _ToN> __r = {};
+	    __builtin_memcpy(&__r, &__char_mask,
+			     std::min(sizeof(__r), sizeof(__char_mask)));
+	    return __r;
+	  }
+	else if constexpr (std::min(_ToN, _Np) <= 7)
+	  {
+	    if constexpr (_Np > 7) // avoid overflow
+	      __x &= _SanitizedBitMask<_Np>(0x7f);
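+	    // As in the 4-element case above, but with seven copies of the
+	    // mask (shifts 0, 7, ..., 42), moving mask bit i to bit 8 * i.
+	    const _ULLong __char_mask
+	      = ((__x.to_ulong() * 0x40810204081ULL) & 0x0101010101010101ULL)
+		* 0xff;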
+	    __vector_type_t<_Up, _ToN> __r = {};
+	    __builtin_memcpy(&__r, &__char_mask,
+			     std::min(sizeof(__r), sizeof(__char_mask)));
+	    return __r;
+	  }
+      }
+    else if constexpr (sizeof(_Up) == 2)
+      {
+	if constexpr (sizeof(_UI) == 16)
+	  {
+	    if constexpr (__have_avx512bw_vl)
+	      return __intrin_bitcast<_UV>(_mm_movm_epi16(__k));
+	    else if constexpr (__have_avx512bw)
+	      return __intrin_bitcast<_UV>(__lo128(_mm512_movm_epi16(__k)));
+	    else if constexpr (__have_avx512f)
+	      {
+		__m256i __as32bits;
+		if constexpr (__have_avx512vl)
+		  __as32bits = _mm256_maskz_mov_epi32(__k, ~__m256i());
+		else
+		  __as32bits = __lo256(_mm512_maskz_mov_epi32(__k, ~__m512i()));
+		return __intrin_bitcast<_UV>(
+		  _mm_packs_epi32(__lo128(__as32bits), __hi128(__as32bits)));
+	      }
+	    // else fall through
+	  }
+	else if constexpr (sizeof(_UI) == 32)
+	  {
+	    if constexpr (__have_avx512bw_vl)
+	      return __vector_bitcast<_Up>(_mm256_movm_epi16(__k));
+	    else if constexpr (__have_avx512bw)
+	      return __vector_bitcast<_Up>(__lo256(_mm512_movm_epi16(__k)));
+	    else if constexpr (__have_avx512f)
+	      {
+		auto __as32bits = _mm512_maskz_mov_epi32(__k, ~__m512i());
+		return __vector_bitcast<_Up>(
+		  __xzyw(_mm256_packs_epi32(__lo256(__as32bits),
+					    __hi256(__as32bits))));
+	      }
+	    // else fall through
+	  }
+	else if constexpr (sizeof(_UI) == 64)
+	  return __vector_bitcast<_Up>(_mm512_movm_epi16(__k));
+      }
+    else if constexpr (sizeof(_Up) == 4)
+      {
+	if constexpr (sizeof(_UI) == 16)
+	  {
+	    if constexpr (__have_avx512dq_vl)
+	      return __intrin_bitcast<_UV>(_mm_movm_epi32(__k));
+	    else if constexpr (__have_avx512dq)
+	      return __intrin_bitcast<_UV>(__lo128(_mm512_movm_epi32(__k)));
+	    else if constexpr (__have_avx512vl)
+	      return __intrin_bitcast<_UV>(
+		_mm_maskz_mov_epi32(__k, ~__m128i()));
+	    else if constexpr (__have_avx512f)
+	      return __intrin_bitcast<_UV>(
+		__lo128(_mm512_maskz_mov_epi32(__k, ~__m512i())));
+	    // else fall through
+	  }
+	else if constexpr (sizeof(_UI) == 32)
+	  {
+	    if constexpr (__have_avx512dq_vl)
+	      return __vector_bitcast<_Up>(_mm256_movm_epi32(__k));
+	    else if constexpr (__have_avx512dq)
+	      return __vector_bitcast<_Up>(__lo256(_mm512_movm_epi32(__k)));
+	    else if constexpr (__have_avx512vl)
+	      return __vector_bitcast<_Up>(
+		_mm256_maskz_mov_epi32(__k, ~__m256i()));
+	    else if constexpr (__have_avx512f)
+	      return __vector_bitcast<_Up>(
+		__lo256(_mm512_maskz_mov_epi32(__k, ~__m512i())));
+	    // else fall through
+	  }
+	else if constexpr (sizeof(_UI) == 64)
+	  return __vector_bitcast<_Up>(
+	    __have_avx512dq ? _mm512_movm_epi32(__k)
+			    : _mm512_maskz_mov_epi32(__k, ~__m512i()));
+      }
+    else if constexpr (sizeof(_Up) == 8)
+      {
+	if constexpr (sizeof(_UI) == 16)
+	  {
+	    if constexpr (__have_avx512dq_vl)
+	      return __vector_bitcast<_Up>(_mm_movm_epi64(__k));
+	    else if constexpr (__have_avx512dq)
+	      return __vector_bitcast<_Up>(__lo128(_mm512_movm_epi64(__k)));
+	    else if constexpr (__have_avx512vl)
+	      return __vector_bitcast<_Up>(
+		_mm_maskz_mov_epi64(__k, ~__m128i()));
+	    else if constexpr (__have_avx512f)
+	      return __vector_bitcast<_Up>(
+		__lo128(_mm512_maskz_mov_epi64(__k, ~__m512i())));
+	    // else fall through
+	  }
+	else if constexpr (sizeof(_UI) == 32)
+	  {
+	    if constexpr (__have_avx512dq_vl)
+	      return __vector_bitcast<_Up>(_mm256_movm_epi64(__k));
+	    else if constexpr (__have_avx512dq)
+	      return __vector_bitcast<_Up>(__lo256(_mm512_movm_epi64(__k)));
+	    else if constexpr (__have_avx512vl)
+	      return __vector_bitcast<_Up>(
+		_mm256_maskz_mov_epi64(__k, ~__m256i()));
+	    else if constexpr (__have_avx512f)
+	      return __vector_bitcast<_Up>(
+		__lo256(_mm512_maskz_mov_epi64(__k, ~__m512i())));
+	    // else fall through
+	  }
+	else if constexpr (sizeof(_UI) == 64)
+	  return __vector_bitcast<_Up>(
+	    __have_avx512dq ? _mm512_movm_epi64(__k)
+			    : _mm512_maskz_mov_epi64(__k, ~__m512i()));
+      }
+
+    using _UpUInt = std::make_unsigned_t<__int_for_sizeof_t<_Up>>;
+    using _V = __vector_type_t<_UpUInt, _ToN>;
+    constexpr size_t __bits_per_element = sizeof(_Up) * CHAR_BIT;
+    if constexpr (_ToN == 2)
+      {
+	return __vector_bitcast<_Up>(_V{_UpUInt(-__x[0]), _UpUInt(-__x[1])});
+      }
+    else if constexpr (!__have_avx2 && __have_avx && sizeof(_V) == 32)
+      {
+	if constexpr (sizeof(_Up) == 4)
+	  return __vector_bitcast<_Up>(_mm256_cmp_ps(
+	    _mm256_and_ps(_mm256_castsi256_ps(_mm256_set1_epi32(__k)),
+			  _mm256_castsi256_ps(_mm256_setr_epi32(
+			    0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80))),
+	    _mm256_setzero_ps(), _CMP_NEQ_UQ));
+	else if constexpr (sizeof(_Up) == 8)
+	  return __vector_bitcast<_Up>(_mm256_cmp_pd(
+	    _mm256_and_pd(_mm256_castsi256_pd(_mm256_set1_epi64x(__k)),
+			  _mm256_castsi256_pd(
+			    _mm256_setr_epi64x(0x01, 0x02, 0x04, 0x08))),
+	    _mm256_setzero_pd(), _CMP_NEQ_UQ));
+	else
+	  __assert_unreachable<_Up>();
+      }
+    else if constexpr (__bits_per_element >= _ToN)
+      {
+	constexpr auto __bitmask
+	  = __generate_vector<__vector_type_t<_UpUInt, _ToN>>(
+	    [](auto __i) constexpr->_UpUInt {
+	      return __i < _ToN ? 1ull << __i : 0;
+	    });
+	const auto __bits = __vector_broadcast<_ToN, _UpUInt>(__k) & __bitmask;
+	if constexpr (__bits_per_element > _ToN)
+	  return __vector_bitcast<_Up>(
+	    __vector_bitcast<__int_for_sizeof_t<_Up>>(__bits) > 0);
+	else
+	  return __vector_bitcast<_Up>(__bits != 0);
+      }
+    else
+      {
+	const _V __tmp
+	  = __generate_vector<_V>([&](auto __i) constexpr {
+	      return static_cast<_UpUInt>(
+		__k >> (__bits_per_element * (__i / __bits_per_element)));
+	    })
+	    & __generate_vector<_V>([](auto __i) constexpr {
+		return static_cast<_UpUInt>(1ull << (__i % __bits_per_element));
+	      }); // mask bit index
+	return __vector_bitcast<_Up>(__tmp != _V());
+      }
+  }
+
+  // }}}
+  // __to_maskvector(_SimdWrapper) {{{
+  template <typename _Up, size_t _UpN = 0, typename _Tp, size_t _Np,
+	    size_t _ToN = _UpN == 0 ? _Np : _UpN>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Up, _ToN>
+  __to_maskvector(_SimdWrapper<_Tp, _Np> __x)
+  {
+    using _TW = _SimdWrapper<_Tp, _Np>;
+    using _UW = _SimdWrapper<_Up, _ToN>;
+    using _UI = __intrinsic_type_t<_Up, _ToN>;
+    if constexpr (sizeof(_Up) == sizeof(_Tp) && sizeof(_TW) == sizeof(_UW))
+      if constexpr (_ToN <= _Np)
+	return __wrapper_bitcast<_Up, _ToN>(__x);
+      else
+	return simd_abi::deduce_t<_Up, _ToN>::__masked(
+	  __wrapper_bitcast<_Up, _ToN>(__x));
+    else if constexpr (is_same_v<_Tp, bool>) // bits -> vector
+      return __to_maskvector<_Up, _ToN>(
+	_BitMask<_Np>(__x._M_data)._M_sanitized());
+    else
+      { // vector -> vector {{{
+	if (__x._M_is_constprop() || __builtin_is_constant_evaluated())
+	  {
+	    const auto __y = __vector_bitcast<__int_for_sizeof_t<_Tp>>(__x);
+	    using _Ip = __int_for_sizeof_t<_Up>;
+	    return __vector_bitcast<_Up>(
+	      __generate_from_n_evaluations<std::min(_ToN, _Np),
+					    __vector_type_t<_Ip, _ToN>>(
+		[&](auto __i) -> _Ip { return __y[__i.value]; }));
+	  }
+	using _To = __vector_type_t<_Up, _ToN>;
+	[[maybe_unused]] constexpr size_t _FromN = _Np;
+	constexpr int _FromBytes = sizeof(_Tp);
+	constexpr int _ToBytes = sizeof(_Up);
+	const auto __k = __x._M_data;
+
+	if constexpr (_FromBytes == _ToBytes)
+	  return __intrin_bitcast<_To>(__k);
+	else if constexpr (sizeof(_UI) == 16 && sizeof(__k) == 16)
+	  { // SSE -> SSE {{{
+	    if constexpr (_FromBytes == 4 && _ToBytes == 8)
+	      return __intrin_bitcast<_To>(__interleave128_lo(__k, __k));
+	    else if constexpr (_FromBytes == 2 && _ToBytes == 8)
+	      {
+		const auto __y
+		  = __vector_bitcast<int>(__interleave128_lo(__k, __k));
+		return __intrin_bitcast<_To>(__interleave128_lo(__y, __y));
+	      }
+	    else if constexpr (_FromBytes == 1 && _ToBytes == 8)
+	      {
+		auto __y
+		  = __vector_bitcast<short>(__interleave128_lo(__k, __k));
+		auto __z = __vector_bitcast<int>(__interleave128_lo(__y, __y));
+		return __intrin_bitcast<_To>(__interleave128_lo(__z, __z));
+	      }
+	    else if constexpr (_FromBytes == 8 && _ToBytes == 4 && __have_sse2)
+	      return __intrin_bitcast<_To>(
+		_mm_packs_epi32(__vector_bitcast<_LLong>(__k), __m128i()));
+	    else if constexpr (_FromBytes == 8 && _ToBytes == 4)
+	      return __vector_shuffle<1, 3, 6, 7>(__vector_bitcast<_Up>(__k),
+						  _UI());
+	    else if constexpr (_FromBytes == 2 && _ToBytes == 4)
+	      return __intrin_bitcast<_To>(__interleave128_lo(__k, __k));
+	    else if constexpr (_FromBytes == 1 && _ToBytes == 4)
+	      {
+		const auto __y
+		  = __vector_bitcast<short>(__interleave128_lo(__k, __k));
+		return __intrin_bitcast<_To>(__interleave128_lo(__y, __y));
+	      }
+	    else if constexpr (_FromBytes == 8 && _ToBytes == 2)
+	      {
+		if constexpr (__have_sse2 && !__have_ssse3)
+		  return __intrin_bitcast<_To>(_mm_packs_epi32(
+		    _mm_packs_epi32(__vector_bitcast<_LLong>(__k), __m128i()),
+		    __m128i()));
+		else
+		  return __intrin_bitcast<_To>(
+		    __vector_permute<3, 7, -1, -1, -1, -1, -1, -1>(
+		      __vector_bitcast<_Up>(__k)));
+	      }
+	    else if constexpr (_FromBytes == 4 && _ToBytes == 2)
+	      return __intrin_bitcast<_To>(
+		_mm_packs_epi32(__vector_bitcast<_LLong>(__k), __m128i()));
+	    else if constexpr (_FromBytes == 1 && _ToBytes == 2)
+	      return __intrin_bitcast<_To>(__interleave128_lo(__k, __k));
+	    else if constexpr (_FromBytes == 8 && _ToBytes == 1 && __have_ssse3)
+	      return __intrin_bitcast<_To>(
+		_mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+				 _mm_setr_epi8(7, 15, -1, -1, -1, -1, -1, -1,
+					       -1, -1, -1, -1, -1, -1, -1,
+					       -1)));
+	    else if constexpr (_FromBytes == 8 && _ToBytes == 1)
+	      {
+		auto __y
+		  = _mm_packs_epi32(__vector_bitcast<_LLong>(__k), __m128i());
+		__y = _mm_packs_epi32(__y, __m128i());
+		return __intrin_bitcast<_To>(_mm_packs_epi16(__y, __m128i()));
+	      }
+	    else if constexpr (_FromBytes == 4 && _ToBytes == 1 && __have_ssse3)
+	      return __intrin_bitcast<_To>(
+		_mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+				 _mm_setr_epi8(3, 7, 11, 15, -1, -1, -1, -1, -1,
+					       -1, -1, -1, -1, -1, -1, -1)));
+	    else if constexpr (_FromBytes == 4 && _ToBytes == 1)
+	      {
+		const auto __y
+		  = _mm_packs_epi32(__vector_bitcast<_LLong>(__k), __m128i());
+		return __intrin_bitcast<_To>(_mm_packs_epi16(__y, __m128i()));
+	      }
+	    else if constexpr (_FromBytes == 2 && _ToBytes == 1)
+	      return __intrin_bitcast<_To>(
+		_mm_packs_epi16(__vector_bitcast<_LLong>(__k), __m128i()));
+	    else
+	      __assert_unreachable<_Tp>();
+	  } // }}}
+	else if constexpr (sizeof(_UI) == 32 && sizeof(__k) == 32)
+	  { // AVX -> AVX {{{
+	    if constexpr (_FromBytes == _ToBytes)
+	      __assert_unreachable<_Tp>();
+	    else if constexpr (_FromBytes == _ToBytes * 2)
+	      {
+		const auto __y = __vector_bitcast<_LLong>(__k);
+		return __intrin_bitcast<_To>(_mm256_castsi128_si256(
+		  _mm_packs_epi16(__lo128(__y), __hi128(__y))));
+	      }
+	    else if constexpr (_FromBytes == _ToBytes * 4)
+	      {
+		const auto __y = __vector_bitcast<_LLong>(__k);
+		return __intrin_bitcast<_To>(_mm256_castsi128_si256(
+		  _mm_packs_epi16(_mm_packs_epi16(__lo128(__y), __hi128(__y)),
+				  __m128i())));
+	      }
+	    else if constexpr (_FromBytes == _ToBytes * 8)
+	      {
+		const auto __y = __vector_bitcast<_LLong>(__k);
+		return __intrin_bitcast<_To>(_mm256_castsi128_si256(
+		  _mm_shuffle_epi8(_mm_packs_epi16(__lo128(__y), __hi128(__y)),
+				   _mm_setr_epi8(3, 7, 11, 15, -1, -1, -1, -1,
+						 -1, -1, -1, -1, -1, -1, -1,
+						 -1))));
+	      }
+	    else if constexpr (_FromBytes * 2 == _ToBytes)
+	      {
+		auto __y = __xzyw(__to_intrin(__k));
+		if constexpr (std::is_floating_point_v<_Tp>)
+		  return __intrin_bitcast<_To>(_mm256_unpacklo_ps(__y, __y));
+		else
+		  return __intrin_bitcast<_To>(_mm256_unpacklo_epi8(__y, __y));
+	      }
+	    else if constexpr (_FromBytes * 4 == _ToBytes)
+	      {
+		auto __y
+		  = _mm_unpacklo_epi8(__lo128(__vector_bitcast<_LLong>(__k)),
+				      __lo128(__vector_bitcast<_LLong>(
+					__k))); // drops 3/4 of input
+		return __intrin_bitcast<_To>(
+		  __concat(_mm_unpacklo_epi16(__y, __y),
+			   _mm_unpackhi_epi16(__y, __y)));
+	      }
+	    else if constexpr (_FromBytes == 1 && _ToBytes == 8)
+	      {
+		auto __y
+		  = _mm_unpacklo_epi8(__lo128(__vector_bitcast<_LLong>(__k)),
+				      __lo128(__vector_bitcast<_LLong>(
+					__k))); // drops 3/4 of input
+		__y = _mm_unpacklo_epi16(__y,
+					 __y); // drops another 1/2 => 7/8 total
+		return __intrin_bitcast<_To>(
+		  __concat(_mm_unpacklo_epi32(__y, __y),
+			   _mm_unpackhi_epi32(__y, __y)));
+	      }
+	    else
+	      __assert_unreachable<_Tp>();
+	  } // }}}
+	else if constexpr (sizeof(_UI) == 32 && sizeof(__k) == 16)
+	  { // SSE -> AVX {{{
+	    if constexpr (_FromBytes == _ToBytes)
+	      return __intrin_bitcast<_To>(
+		__intrinsic_type_t<_Tp, 32 / sizeof(_Tp)>(
+		  __zero_extend(__to_intrin(__k))));
+	    else if constexpr (_FromBytes * 2 == _ToBytes)
+	      { // keep all
+		return __intrin_bitcast<_To>(
+		  __concat(_mm_unpacklo_epi8(__vector_bitcast<_LLong>(__k),
+					     __vector_bitcast<_LLong>(__k)),
+			   _mm_unpackhi_epi8(__vector_bitcast<_LLong>(__k),
+					     __vector_bitcast<_LLong>(__k))));
+	      }
+	    else if constexpr (_FromBytes * 4 == _ToBytes)
+	      {
+		if constexpr (__have_avx2)
+		  {
+		    return __intrin_bitcast<_To>(_mm256_shuffle_epi8(
+		      __concat(__vector_bitcast<_LLong>(__k),
+			       __vector_bitcast<_LLong>(__k)),
+		      _mm256_setr_epi8(0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3,
+				       3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6,
+				       7, 7, 7, 7)));
+		  }
+		else
+		  {
+		    return __intrin_bitcast<_To>(__concat(
+		      _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+				       _mm_setr_epi8(0, 0, 0, 0, 1, 1, 1, 1, 2,
+						     2, 2, 2, 3, 3, 3, 3)),
+		      _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+				       _mm_setr_epi8(4, 4, 4, 4, 5, 5, 5, 5, 6,
+						     6, 6, 6, 7, 7, 7, 7))));
+		  }
+	      }
+	    else if constexpr (_FromBytes * 8 == _ToBytes)
+	      {
+		if constexpr (__have_avx2)
+		  {
+		    return __intrin_bitcast<_To>(_mm256_shuffle_epi8(
+		      __concat(__vector_bitcast<_LLong>(__k),
+			       __vector_bitcast<_LLong>(__k)),
+		      _mm256_setr_epi8(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1,
+				       1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3,
+				       3, 3, 3, 3)));
+		  }
+		else
+		  {
+		    return __intrin_bitcast<_To>(__concat(
+		      _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+				       _mm_setr_epi8(0, 0, 0, 0, 0, 0, 0, 0, 1,
+						     1, 1, 1, 1, 1, 1, 1)),
+		      _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+				       _mm_setr_epi8(2, 2, 2, 2, 2, 2, 2, 2, 3,
+						     3, 3, 3, 3, 3, 3, 3))));
+		  }
+	      }
+	    else if constexpr (_FromBytes == _ToBytes * 2)
+	      return __intrin_bitcast<_To>(__m256i(__zero_extend(
+		_mm_packs_epi16(__vector_bitcast<_LLong>(__k), __m128i()))));
+	    else if constexpr (_FromBytes == 8 && _ToBytes == 2)
+	      {
+		return __intrin_bitcast<_To>(__m256i(__zero_extend(
+		  _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+				   _mm_setr_epi8(6, 7, 14, 15, -1, -1, -1, -1,
+						 -1, -1, -1, -1, -1, -1, -1,
+						 -1)))));
+	      }
+	    else if constexpr (_FromBytes == 4 && _ToBytes == 1)
+	      {
+		return __intrin_bitcast<_To>(__m256i(__zero_extend(
+		  _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+				   _mm_setr_epi8(3, 7, 11, 15, -1, -1, -1, -1,
+						 -1, -1, -1, -1, -1, -1, -1,
+						 -1)))));
+	      }
+	    else if constexpr (_FromBytes == 8 && _ToBytes == 1)
+	      {
+		return __intrin_bitcast<_To>(__m256i(__zero_extend(
+		  _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+				   _mm_setr_epi8(7, 15, -1, -1, -1, -1, -1, -1,
+						 -1, -1, -1, -1, -1, -1, -1,
+						 -1)))));
+	      }
+	    else
+	      static_assert(!std::is_same_v<_Tp, _Tp>, "should be unreachable");
+	  } // }}}
+	else if constexpr (sizeof(_UI) == 16 && sizeof(__k) == 32)
+	  { // AVX -> SSE {{{
+	    if constexpr (_FromBytes == _ToBytes)
+	      { // keep low 1/2
+		return __intrin_bitcast<_To>(__lo128(__k));
+	      }
+	    else if constexpr (_FromBytes == _ToBytes * 2)
+	      { // keep all
+		auto __y = __vector_bitcast<_LLong>(__k);
+		return __intrin_bitcast<_To>(
+		  _mm_packs_epi16(__lo128(__y), __hi128(__y)));
+	      }
+	    else if constexpr (_FromBytes == _ToBytes * 4)
+	      { // add 1/2 undef
+		auto __y = __vector_bitcast<_LLong>(__k);
+		return __intrin_bitcast<_To>(
+		  _mm_packs_epi16(_mm_packs_epi16(__lo128(__y), __hi128(__y)),
+				  __m128i()));
+	      }
+	    else if constexpr (_FromBytes == 8 && _ToBytes == 1)
+	      { // add 3/4 undef
+		auto __y = __vector_bitcast<_LLong>(__k);
+		return __intrin_bitcast<_To>(
+		  _mm_shuffle_epi8(_mm_packs_epi16(__lo128(__y), __hi128(__y)),
+				   _mm_setr_epi8(3, 7, 11, 15, -1, -1, -1, -1,
+						 -1, -1, -1, -1, -1, -1, -1,
+						 -1)));
+	      }
+	    else if constexpr (_FromBytes * 2 == _ToBytes)
+	      { // keep low 1/4
+		auto __y = __lo128(__vector_bitcast<_LLong>(__k));
+		return __intrin_bitcast<_To>(_mm_unpacklo_epi8(__y, __y));
+	      }
+	    else if constexpr (_FromBytes * 4 == _ToBytes)
+	      { // keep low 1/8
+		auto __y = __lo128(__vector_bitcast<_LLong>(__k));
+		__y = _mm_unpacklo_epi8(__y, __y);
+		return __intrin_bitcast<_To>(_mm_unpacklo_epi8(__y, __y));
+	      }
+	    else if constexpr (_FromBytes * 8 == _ToBytes)
+	      { // keep low 1/16
+		auto __y = __lo128(__vector_bitcast<_LLong>(__k));
+		__y = _mm_unpacklo_epi8(__y, __y);
+		__y = _mm_unpacklo_epi8(__y, __y);
+		return __intrin_bitcast<_To>(_mm_unpacklo_epi8(__y, __y));
+	      }
+	    else
+	      static_assert(!std::is_same_v<_Tp, _Tp>, "should be unreachable");
+	  } // }}}
+	else
+	  return _Base::template __to_maskvector<_Up, _ToN>(__x);
+	/*
+	if constexpr (_FromBytes > _ToBytes) {
+	    const _To     __y      = __vector_bitcast<_Up>(__k);
+	    return [&] <std::size_t... _Is> (std::index_sequence<_Is...>) {
+	      constexpr int _Stride = _FromBytes / _ToBytes;
+	      return _To{__y[(_Is + 1) * _Stride - 1]...};
+	    }(std::make_index_sequence<std::min(_ToN, _FromN)>());
+	} else {
+	    // {0, 0, 1, 1} (_Dups = 2, _Is<4>)
+	    // {0, 0, 0, 0, 1, 1, 1, 1} (_Dups = 4, _Is<8>)
+	    // {0, 0, 1, 1, 2, 2, 3, 3} (_Dups = 2, _Is<8>)
+	    // ...
+	    return [&] <std::size_t... _Is> (std::index_sequence<_Is...>) {
+	      constexpr int __dup = _ToBytes / _FromBytes;
+	      return __intrin_bitcast<_To>(_From{__k[_Is / __dup]...});
+	    }(std::make_index_sequence<_FromN>());
+	}
+	*/
+      } // }}}
+  }
+
+  // }}}
+  // __to_bits {{{
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SanitizedBitMask<_Np>
+  __to_bits(_SimdWrapper<_Tp, _Np> __x)
+  {
+    if constexpr (is_same_v<_Tp, bool>)
+      return _BitMask<_Np>(__x._M_data)._M_sanitized();
+    else
+      {
+	if (__builtin_is_constant_evaluated()
+	    || __builtin_constant_p(__x._M_data))
+	  {
+	    using _I = __int_for_sizeof_t<_Tp>;
+	    const auto __bools = -__vector_bitcast<_I>(__x);
+	    _ULLong __k = 0;
+	    __execute_n_times<_Np>([&](auto __i) {
+	      __k |= (_ULLong(__bools[int(__i)]) << __i);
+	    });
+	    if (__builtin_is_constant_evaluated() || __builtin_constant_p(__k))
+	      return __k;
+	  }
+	const auto __xi = __to_intrin(__x);
+	if constexpr (is_floating_point_v<_Tp>)
+	  if constexpr (sizeof(_Tp) == 4) // float
+	    if constexpr (sizeof(__xi) == 16)
+	      return _BitMask<_Np>(_mm_movemask_ps(__xi));
+	    else if constexpr (sizeof(__xi) == 32)
+	      return _BitMask<_Np>(_mm256_movemask_ps(__xi));
+	    else if constexpr (__have_avx512dq)
+	      return _BitMask<_Np>(
+		_mm512_movepi32_mask(reinterpret_cast<__m512i>(__xi)));
+	    else // true elements are NaNs (all bits set), thus unordered
+	      return _BitMask<_Np>(
+		_mm512_cmp_ps_mask(__xi, __xi, _CMP_UNORD_Q));
+	  else // implies double
+	    if constexpr (sizeof(__xi) == 16)
+	      return _BitMask<_Np>(_mm_movemask_pd(__xi));
+	    else if constexpr (sizeof(__xi) == 32)
+	      return _BitMask<_Np>(_mm256_movemask_pd(__xi));
+	    else if constexpr (__have_avx512dq)
+	      return _BitMask<_Np>(
+		_mm512_movepi64_mask(reinterpret_cast<__m512i>(__xi)));
+	    else
+	      return _BitMask<_Np>(
+		_mm512_cmp_pd_mask(__xi, __xi, _CMP_UNORD_Q));
+
+	else if constexpr (sizeof(_Tp) == 1)
+	  if constexpr (sizeof(__xi) == 16)
+	    if constexpr (__have_avx512bw_vl)
+	      return _BitMask<_Np>(_mm_movepi8_mask(__xi));
+	    else // implies SSE2
+	      return _BitMask<_Np>(_mm_movemask_epi8(__xi));
+	  else if constexpr (sizeof(__xi) == 32)
+	    if constexpr (__have_avx512bw_vl)
+	      return _BitMask<_Np>(_mm256_movepi8_mask(__xi));
+	    else // implies AVX2
+	      return _BitMask<_Np>(_mm256_movemask_epi8(__xi));
+	  else // implies AVX512BW
+	    return _BitMask<_Np>(_mm512_movepi8_mask(__xi));
+
+	else if constexpr (sizeof(_Tp) == 2)
+	  if constexpr (sizeof(__xi) == 16)
+	    if constexpr (__have_avx512bw_vl)
+	      return _BitMask<_Np>(_mm_movepi16_mask(__xi));
+	    else if constexpr (__have_avx512bw)
+	      return _BitMask<_Np>(_mm512_movepi16_mask(__zero_extend(__xi)));
+	    else // implies SSE2
+	      return _BitMask<_Np>(
+		_mm_movemask_epi8(_mm_packs_epi16(__xi, __m128i())));
+	  else if constexpr (sizeof(__xi) == 32)
+	    if constexpr (__have_avx512bw_vl)
+	      return _BitMask<_Np>(_mm256_movepi16_mask(__xi));
+	    else if constexpr (__have_avx512bw)
+	      return _BitMask<_Np>(_mm512_movepi16_mask(__zero_extend(__xi)));
+	    else // implies SSE2
+	      return _BitMask<_Np>(_mm_movemask_epi8(
+		_mm_packs_epi16(__lo128(__xi), __hi128(__xi))));
+	  else // implies AVX512BW
+	    return _BitMask<_Np>(_mm512_movepi16_mask(__xi));
+
+	else if constexpr (sizeof(_Tp) == 4)
+	  if constexpr (sizeof(__xi) == 16)
+	    if constexpr (__have_avx512dq_vl)
+	      return _BitMask<_Np>(_mm_movepi32_mask(__xi));
+	    else if constexpr (__have_avx512vl)
+	      return _BitMask<_Np>(_mm_cmplt_epi32_mask(__xi, __m128i()));
+	    else if constexpr (__have_avx512dq)
+	      return _BitMask<_Np>(_mm512_movepi32_mask(__zero_extend(__xi)));
+	    else if constexpr (__have_avx512f)
+	      return _BitMask<_Np>(
+		_mm512_cmplt_epi32_mask(__zero_extend(__xi), __m512i()));
+	    else // implies SSE
+	      return _BitMask<_Np>(
+		_mm_movemask_ps(reinterpret_cast<__m128>(__xi)));
+	  else if constexpr (sizeof(__xi) == 32)
+	    if constexpr (__have_avx512dq_vl)
+	      return _BitMask<_Np>(_mm256_movepi32_mask(__xi));
+	    else if constexpr (__have_avx512dq)
+	      return _BitMask<_Np>(_mm512_movepi32_mask(__zero_extend(__xi)));
+	    else if constexpr (__have_avx512vl)
+	      return _BitMask<_Np>(_mm256_cmplt_epi32_mask(__xi, __m256i()));
+	    else if constexpr (__have_avx512f)
+	      return _BitMask<_Np>(
+		_mm512_cmplt_epi32_mask(__zero_extend(__xi), __m512i()));
+	    else // implies AVX
+	      return _BitMask<_Np>(
+		_mm256_movemask_ps(reinterpret_cast<__m256>(__xi)));
+	  else // implies AVX512??
+	    if constexpr (__have_avx512dq)
+	      return _BitMask<_Np>(_mm512_movepi32_mask(__xi));
+	    else // implies AVX512F
+	      return _BitMask<_Np>(_mm512_cmplt_epi32_mask(__xi, __m512i()));
+
+	else if constexpr (sizeof(_Tp) == 8)
+	  if constexpr (sizeof(__xi) == 16)
+	    if constexpr (__have_avx512dq_vl)
+	      return _BitMask<_Np>(_mm_movepi64_mask(__xi));
+	    else if constexpr (__have_avx512dq)
+	      return _BitMask<_Np>(_mm512_movepi64_mask(__zero_extend(__xi)));
+	    else if constexpr (__have_avx512vl)
+	      return _BitMask<_Np>(_mm_cmplt_epi64_mask(__xi, __m128i()));
+	    else if constexpr (__have_avx512f)
+	      return _BitMask<_Np>(
+		_mm512_cmplt_epi64_mask(__zero_extend(__xi), __m512i()));
+	    else // implies SSE2
+	      return _BitMask<_Np>(
+		_mm_movemask_pd(reinterpret_cast<__m128d>(__xi)));
+	  else if constexpr (sizeof(__xi) == 32)
+	    if constexpr (__have_avx512dq_vl)
+	      return _BitMask<_Np>(_mm256_movepi64_mask(__xi));
+	    else if constexpr (__have_avx512dq)
+	      return _BitMask<_Np>(_mm512_movepi64_mask(__zero_extend(__xi)));
+	    else if constexpr (__have_avx512vl)
+	      return _BitMask<_Np>(_mm256_cmplt_epi64_mask(__xi, __m256i()));
+	    else if constexpr (__have_avx512f)
+	      return _BitMask<_Np>(
+		_mm512_cmplt_epi64_mask(__zero_extend(__xi), __m512i()));
+	    else // implies AVX
+	      return _BitMask<_Np>(
+		_mm256_movemask_pd(reinterpret_cast<__m256d>(__xi)));
+	  else // implies AVX512??
+	    if constexpr (__have_avx512dq)
+	      return _BitMask<_Np>(_mm512_movepi64_mask(__xi));
+	    else // implies AVX512F
+	      return _BitMask<_Np>(_mm512_cmplt_epi64_mask(__xi, __m512i()));
+
+	else
+	  __assert_unreachable<_Tp>();
+      }
+  }
+  // }}}
+};
+
+// }}}
+// _MaskImplX86 {{{
+template <typename _Abi>
+struct _MaskImplX86 : _MaskImplX86Mixin, _MaskImplBuiltin<_Abi>
+{
+  using _MaskImplX86Mixin::__to_bits;
+  using _MaskImplX86Mixin::__to_maskvector;
+  using _MaskImplBuiltin<_Abi>::__convert;
+
+  // member types {{{
+  template <typename _Tp>
+  using _SimdMember = typename _Abi::template __traits<_Tp>::_SimdMember;
+  template <typename _Tp>
+  using _MaskMember = typename _Abi::template __traits<_Tp>::_MaskMember;
+  template <typename _Tp> static constexpr size_t size = simd_size_v<_Tp, _Abi>;
+  using _Base = _MaskImplBuiltin<_Abi>;
+
+  // }}}
+  // __broadcast {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+  __broadcast(bool __x)
+  {
+    if constexpr (__is_avx512_abi<_Abi>())
+      return __x ? _Abi::__masked(_MaskMember<_Tp>(-1)) : _MaskMember<_Tp>();
+    else
+      return _Base::template __broadcast<_Tp>(__x);
+  }
+
+  // }}}
+  // __load {{{
+  template <typename _Tp, typename _Flags>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+  __load(const bool* __bool_mem)
+  {
+    const void* __mem = __bool_mem;
+    if constexpr (is_same_v<_Flags, vector_aligned_tag>)
+      __mem
+	= __builtin_assume_aligned(__mem,
+				   memory_alignment_v<simd_mask<_Tp, _Abi>>);
+    else if constexpr (!is_same_v<_Flags, element_aligned_tag>)
+      __mem = __builtin_assume_aligned(__mem, _Flags::_S_alignment);
+
+    if constexpr (__have_avx512bw)
+      {
+	const auto __to_vec_or_bits = [](auto __bits) -> decltype(auto) {
+	  if constexpr (__is_avx512_abi<_Abi>())
+	    return __bits;
+	  else
+	    return __to_maskvector<_Tp>(
+	      _BitMask<size<_Tp>>(__bits)._M_sanitized());
+	};
+
+	if constexpr (size<_Tp> <= 16 && __have_avx512vl)
+	  {
+	    __m128i __a = {};
+	    __builtin_memcpy(&__a, __mem, size<_Tp>);
+	    return __to_vec_or_bits(_mm_test_epi8_mask(__a, __a));
+	  }
+	else if constexpr (size<_Tp> <= 32 && __have_avx512vl)
+	  {
+	    __m256i __a = {};
+	    __builtin_memcpy(&__a, __mem, size<_Tp>);
+	    return __to_vec_or_bits(_mm256_test_epi8_mask(__a, __a));
+	  }
+	else if constexpr (size<_Tp> <= 64)
+	  {
+	    __m512i __a = {};
+	    __builtin_memcpy(&__a, __mem, size<_Tp>);
+	    return __to_vec_or_bits(_mm512_test_epi8_mask(__a, __a));
+	  }
+      }
+    else if constexpr (__is_avx512_abi<_Abi>())
+      {
+	if constexpr (size<_Tp> <= 8)
+	  {
+	    __m128i __a = {};
+	    __builtin_memcpy(&__a, __mem, size<_Tp>);
+	    const auto __b = _mm512_cvtepi8_epi64(__a);
+	    return _mm512_test_epi64_mask(__b, __b);
+	  }
+	else if constexpr (size<_Tp> <= 16)
+	  {
+	    __m128i __a = {};
+	    __builtin_memcpy(&__a, __mem, size<_Tp>);
+	    const auto __b = _mm512_cvtepi8_epi32(__a);
+	    return _mm512_test_epi32_mask(__b, __b);
+	  }
+	else if constexpr (size<_Tp> <= 32)
+	  {
+	    __m128i __a = {};
+	    __builtin_memcpy(&__a, __mem, 16);
+	    const auto __b = _mm512_cvtepi8_epi32(__a);
+	    __builtin_memcpy(&__a, __mem + 16, size<_Tp> - 16);
+	    const auto __c = _mm512_cvtepi8_epi32(__a);
+	    return _mm512_test_epi32_mask(__b, __b)
+		   | (_mm512_test_epi32_mask(__c, __c) << 16);
+	  }
+	else if constexpr (size<_Tp> <= 64)
+	  {
+	    __m128i __a = {};
+	    __builtin_memcpy(&__a, __mem, 16);
+	    const auto __b = _mm512_cvtepi8_epi32(__a);
+	    __builtin_memcpy(&__a, __mem + 16, 16);
+	    const auto __c = _mm512_cvtepi8_epi32(__a);
+	    if constexpr (size<_Tp> <= 48)
+	      {
+		__builtin_memcpy(&__a, __mem + 32, size<_Tp> - 32);
+		const auto __d = _mm512_cvtepi8_epi32(__a);
+		return _mm512_test_epi32_mask(__b, __b)
+		       | (_ULLong(_mm512_test_epi32_mask(__c, __c)) << 16)
+		       | (_ULLong(_mm512_test_epi32_mask(__d, __d)) << 32);
+	      }
+	    else
+	      {
+		__builtin_memcpy(&__a, __mem + 32, 16);
+		const auto __d = _mm512_cvtepi8_epi32(__a);
+		__builtin_memcpy(&__a, __mem + 48, size<_Tp> - 48);
+		const auto __e = _mm512_cvtepi8_epi32(__a);
+		return _mm512_test_epi32_mask(__b, __b)
+		       | (_ULLong(_mm512_test_epi32_mask(__c, __c)) << 16)
+		       | (_ULLong(_mm512_test_epi32_mask(__d, __d)) << 32)
+		       | (_ULLong(_mm512_test_epi32_mask(__e, __e)) << 48);
+	      }
+	  }
+	else
+	  __assert_unreachable<_Flags>();
+      }
+    else if constexpr (sizeof(_Tp) == 8 && size<_Tp> == 2)
+      return __vector_bitcast<_Tp>(
+	__vector_type16_t<int>{-int(__bool_mem[0]), -int(__bool_mem[0]),
+			       -int(__bool_mem[1]), -int(__bool_mem[1])});
+    else if constexpr (sizeof(_Tp) == 8 && size<_Tp> <= 4 && __have_avx)
+      {
+	int __bool4 = 0;
+	__builtin_memcpy(&__bool4, __mem, size<_Tp>);
+	const auto __k
+	  = __to_intrin((__vector_broadcast<4>(__bool4)
+			 & __make_vector<int>(0x1, 0x100, 0x10000,
+					      size<_Tp> == 4 ? 0x1000000 : 0))
+			!= 0);
+	return __vector_bitcast<_Tp>(
+	  __concat(_mm_unpacklo_epi32(__k, __k), _mm_unpackhi_epi32(__k, __k)));
+      }
+    else if constexpr (sizeof(_Tp) == 4 && size<_Tp> <= 4)
+      {
+	int __bools = 0;
+	__builtin_memcpy(&__bools, __mem, size<_Tp>);
+	if constexpr (__have_sse2)
+	  {
+	    __m128i __k = _mm_cvtsi32_si128(__bools);
+	    __k = _mm_cmpgt_epi16(_mm_unpacklo_epi8(__k, __k), __m128i());
+	    return __vector_bitcast<_Tp, size<_Tp>>(
+	      _mm_unpacklo_epi16(__k, __k));
+	  }
+	else
+	  {
+	    __m128 __k = _mm_cvtpi8_ps(_mm_cvtsi32_si64(__bools));
+	    _mm_empty();
+	    return __vector_bitcast<_Tp, size<_Tp>>(
+	      _mm_cmpgt_ps(__k, __m128()));
+	  }
+      }
+    else if constexpr (sizeof(_Tp) == 4 && size<_Tp> <= 8)
+      {
+	__m128i __k = {};
+	__builtin_memcpy(&__k, __mem, size<_Tp>);
+	__k = _mm_cmpgt_epi16(_mm_unpacklo_epi8(__k, __k), __m128i());
+	return __vector_bitcast<_Tp>(
+	  __concat(_mm_unpacklo_epi16(__k, __k), _mm_unpackhi_epi16(__k, __k)));
+      }
+    else if constexpr (sizeof(_Tp) == 2 && size<_Tp> <= 16)
+      {
+	__m128i __k = {};
+	__builtin_memcpy(&__k, __mem, size<_Tp>);
+	__k = _mm_cmpgt_epi8(__k, __m128i());
+	if constexpr (size<_Tp> <= 8)
+	  return __vector_bitcast<_Tp, size<_Tp>>(_mm_unpacklo_epi8(__k, __k));
+	else
+	  return __concat(_mm_unpacklo_epi8(__k, __k),
+			  _mm_unpackhi_epi8(__k, __k));
+      }
+    else
+      return _Base::template __load<_Tp, _Flags>(__bool_mem);
+  }
+
+  // }}}
+  // __from_bitmask{{{
+  template <size_t _Np, typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+  __from_bitmask(_SanitizedBitMask<_Np> __bits, _TypeTag<_Tp>)
+  {
+    if constexpr (__is_avx512_abi<_Abi>())
+      return __bits._M_to_bits();
+    else
+      return __to_maskvector<_Tp, size<_Tp>>(__bits);
+  }
+
+  // }}}
+  // __masked_load {{{2
+  template <typename _Tp, size_t _Np, typename _Fp>
+  static inline _SimdWrapper<_Tp, _Np>
+  __masked_load(_SimdWrapper<_Tp, _Np> __merge, _SimdWrapper<_Tp, _Np> __mask,
+		const bool* __mem, _Fp) noexcept
+  {
+    if constexpr (__is_avx512_abi<_Abi>())
+      {
+	if constexpr (__have_avx512bw_vl)
+	  {
+	    if constexpr (_Np <= 16)
+	      {
+		const auto __a = _mm_mask_loadu_epi8(__m128i(), __mask, __mem);
+		return (__merge & ~__mask) | _mm_test_epi8_mask(__a, __a);
+	      }
+	    else if constexpr (_Np <= 32)
+	      {
+		const auto __a
+		  = _mm256_mask_loadu_epi8(__m256i(), __mask, __mem);
+		return (__merge & ~__mask) | _mm256_test_epi8_mask(__a, __a);
+	      }
+	    else if constexpr (_Np <= 64)
+	      {
+		const auto __a
+		  = _mm512_mask_loadu_epi8(__m512i(), __mask, __mem);
+		return (__merge & ~__mask) | _mm512_test_epi8_mask(__a, __a);
+	      }
+	    else
+	      __assert_unreachable<_Tp>();
+	  }
+	else
+	  {
+	    _BitOps::__bit_iteration(__mask, [&](auto __i) {
+	      __merge.__set(__i, __mem[__i]);
+	    });
+	    return __merge;
+	  }
+      }
+    else if constexpr (__have_avx512bw_vl && _Np == 32 && sizeof(_Tp) == 1)
+      {
+	const auto __k = __to_bits(__mask)._M_to_bits();
+	__merge
+	  = _mm256_mask_sub_epi8(__to_intrin(__merge), __k, __m256i(),
+				 _mm256_mask_loadu_epi8(__m256i(), __k, __mem));
+      }
+    else if constexpr (__have_avx512bw_vl && _Np == 16 && sizeof(_Tp) == 1)
+      {
+	const auto __k = __to_bits(__mask)._M_to_bits();
+	__merge
+	  = _mm_mask_sub_epi8(__vector_bitcast<_LLong>(__merge), __k, __m128i(),
+			      _mm_mask_loadu_epi8(__m128i(), __k, __mem));
+      }
+    else if constexpr (__have_avx512bw_vl && _Np == 16 && sizeof(_Tp) == 2)
+      {
+	const auto __k = __to_bits(__mask)._M_to_bits();
+	__merge = _mm256_mask_sub_epi16(
+	  __vector_bitcast<_LLong>(__merge), __k, __m256i(),
+	  _mm256_cvtepi8_epi16(_mm_mask_loadu_epi8(__m128i(), __k, __mem)));
+      }
+    else if constexpr (__have_avx512bw_vl && _Np == 8 && sizeof(_Tp) == 2)
+      {
+	const auto __k = __to_bits(__mask)._M_to_bits();
+	__merge = _mm_mask_sub_epi16(
+	  __vector_bitcast<_LLong>(__merge), __k, __m128i(),
+	  _mm_cvtepi8_epi16(_mm_mask_loadu_epi8(__m128i(), __k, __mem)));
+      }
+    else if constexpr (__have_avx512bw_vl && _Np == 8 && sizeof(_Tp) == 4)
+      {
+	const auto __k = __to_bits(__mask)._M_to_bits();
+	__merge = __vector_bitcast<_Tp>(_mm256_mask_sub_epi32(
+	  __vector_bitcast<_LLong>(__merge), __k, __m256i(),
+	  _mm256_cvtepi8_epi32(_mm_mask_loadu_epi8(__m128i(), __k, __mem))));
+      }
+    else if constexpr (__have_avx512bw_vl && _Np == 4 && sizeof(_Tp) == 4)
+      {
+	const auto __k = __to_bits(__mask)._M_to_bits();
+	__merge = __vector_bitcast<_Tp>(_mm_mask_sub_epi32(
+	  __vector_bitcast<_LLong>(__merge), __k, __m128i(),
+	  _mm_cvtepi8_epi32(_mm_mask_loadu_epi8(__m128i(), __k, __mem))));
+      }
+    else if constexpr (__have_avx512bw_vl && _Np == 4 && sizeof(_Tp) == 8)
+      {
+	const auto __k = __to_bits(__mask)._M_to_bits();
+	__merge = __vector_bitcast<_Tp>(_mm256_mask_sub_epi64(
+	  __vector_bitcast<_LLong>(__merge), __k, __m256i(),
+	  _mm256_cvtepi8_epi64(_mm_mask_loadu_epi8(__m128i(), __k, __mem))));
+      }
+    else if constexpr (__have_avx512bw_vl && _Np == 2 && sizeof(_Tp) == 8)
+      {
+	const auto __k = __to_bits(__mask)._M_to_bits();
+	__merge = __vector_bitcast<_Tp>(_mm_mask_sub_epi64(
+	  __vector_bitcast<_LLong>(__merge), __k, __m128i(),
+	  _mm_cvtepi8_epi64(_mm_mask_loadu_epi8(__m128i(), __k, __mem))));
+      }
+    else
+      {
+	return _Base::__masked_load(__merge, __mask, __mem, _Fp{});
+      }
+    return __merge;
+  }
+
+  // __store {{{2
+  template <typename _Tp, size_t _Np, typename _Fp>
+  _GLIBCXX_SIMD_INTRINSIC static void __store(_SimdWrapper<_Tp, _Np> __v,
+					      bool* __mem, _Fp) noexcept
+  {
+    if constexpr (__is_avx512_abi<_Abi>())
+      {
+	if constexpr (__have_avx512bw_vl)
+	  _CommonImplX86::__store<_Np>(
+	    __vector_bitcast<char>([](auto __data) {
+	      if constexpr (_Np <= 16)
+		return _mm_maskz_set1_epi8(__data, 1);
+	      else if constexpr (_Np <= 32)
+		return _mm256_maskz_set1_epi8(__data, 1);
+	      else
+		return _mm512_maskz_set1_epi8(__data, 1);
+	    }(__v._M_data)),
+	    __mem, _Fp());
+	else if constexpr (_Np <= 8)
+	  _CommonImplX86::__store<_Np>(
+	    __vector_bitcast<char>(
+#if defined __x86_64__
+	      __make_wrapper<_ULLong>(
+		_pdep_u64(__v._M_data, 0x0101010101010101ULL), 0ull)
+#else
+	      __make_wrapper<_UInt>(_pdep_u32(__v._M_data, 0x01010101U),
+				    _pdep_u32(__v._M_data >> 4, 0x01010101U))
+#endif
+		),
+	    __mem, _Fp());
+	else if constexpr (_Np <= 16)
+	  _mm512_mask_cvtepi32_storeu_epi8(__mem, 0xffffu >> (16 - _Np),
+					   _mm512_maskz_set1_epi32(__v._M_data,
+								   1));
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else if constexpr (__is_sse_abi<_Abi>()) //{{{
+      {
+	if constexpr (_Np == 2 && sizeof(_Tp) == 8)
+	  {
+	    const auto __k = __vector_bitcast<int>(__v);
+	    __mem[0] = -__k[1];
+	    __mem[1] = -__k[3];
+	  }
+	else if constexpr (_Np <= 4 && sizeof(_Tp) == 4)
+	  {
+	    if constexpr (__have_sse2)
+	      {
+		const unsigned __bool4
+		  = __vector_bitcast<_UInt>(
+		      _mm_packs_epi16(_mm_packs_epi32(__intrin_bitcast<__m128i>(
+							__to_intrin(__v)),
+						      __m128i()),
+				      __m128i()))[0]
+		    & 0x01010101u;
+		__builtin_memcpy(__mem, &__bool4, _Np);
+	      }
+	    else if constexpr (__have_mmx)
+	      {
+		const __m64 __k
+		  = _mm_cvtps_pi8(__and(__to_intrin(__v), _mm_set1_ps(1.f)));
+		__builtin_memcpy(__mem, &__k, _Np);
+		_mm_empty();
+	      }
+	    else
+	      return _Base::__store(__v, __mem, _Fp());
+	  }
+	else if constexpr (_Np <= 8 && sizeof(_Tp) == 2)
+	  {
+	    _CommonImplX86::__store<_Np>(
+	      __vector_bitcast<char>(_mm_packs_epi16(
+		__to_intrin(__vector_bitcast<_UShort>(__v) >> 15), __m128i())),
+	      __mem, _Fp());
+	  }
+	else if constexpr (_Np <= 16 && sizeof(_Tp) == 1)
+	  _CommonImplX86::__store<_Np>(__v._M_data & 1, __mem, _Fp());
+	else
+	  __assert_unreachable<_Tp>();
+      }                                      // }}}
+    else if constexpr (__is_avx_abi<_Abi>()) // {{{
+      {
+	if constexpr (_Np <= 4 && sizeof(_Tp) == 8)
+	  {
+	    auto __k = __intrin_bitcast<__m256i>(__to_intrin(__v));
+	    int __bool4;
+	    if constexpr (__have_avx2)
+	      __bool4 = _mm256_movemask_epi8(__k);
+	    else
+	      __bool4 = (_mm_movemask_epi8(__lo128(__k))
+			 | (_mm_movemask_epi8(__hi128(__k)) << 16));
+	    __bool4 &= 0x01010101;
+	    __builtin_memcpy(__mem, &__bool4, _Np);
+	  }
+	else if constexpr (_Np <= 8 && sizeof(_Tp) == 4)
+	  {
+	    const auto __k = __intrin_bitcast<__m256i>(__to_intrin(__v));
+	    const auto __k2
+	      = _mm_srli_epi16(_mm_packs_epi16(__lo128(__k), __hi128(__k)), 15);
+	    const auto __k3
+	      = __vector_bitcast<char>(_mm_packs_epi16(__k2, __m128i()));
+	    _CommonImplX86::__store<_Np>(__k3, __mem, _Fp());
+	  }
+	else if constexpr (_Np <= 16 && sizeof(_Tp) == 2)
+	  {
+	    if constexpr (__have_avx2)
+	      {
+		const auto __x = _mm256_srli_epi16(__to_intrin(__v), 15);
+		const auto __bools = __vector_bitcast<char>(
+		  _mm_packs_epi16(__lo128(__x), __hi128(__x)));
+		_CommonImplX86::__store<_Np>(__bools, __mem, _Fp());
+	      }
+	    else
+	      {
+		const auto __bools
+		  = 1
+		    & __vector_bitcast<_UChar>(
+		      _mm_packs_epi16(__lo128(__to_intrin(__v)),
+				      __hi128(__to_intrin(__v))));
+		_CommonImplX86::__store<_Np>(__bools, __mem, _Fp());
+	      }
+	  }
+	else if constexpr (_Np <= 32 && sizeof(_Tp) == 1)
+	  _CommonImplX86::__store<_Np>(1 & __v._M_data, __mem, _Fp());
+	else
+	  __assert_unreachable<_Tp>();
+      } // }}}
+    else
+      __assert_unreachable<_Tp>();
+  }
+
+  // __masked_store {{{2
+  template <typename _Tp, size_t _Np, typename _Fp>
+  static inline void __masked_store(const _SimdWrapper<_Tp, _Np> __v,
+				    bool* __mem, _Fp,
+				    const _SimdWrapper<_Tp, _Np> __k) noexcept
+  {
+    if constexpr (__is_avx512_abi<_Abi>())
+      {
+	static_assert(is_same_v<_Tp, bool>);
+	if constexpr (_Np <= 16 && __have_avx512bw_vl)
+	  _mm_mask_storeu_epi8(__mem, __k, _mm_maskz_set1_epi8(__v, 1));
+	else if constexpr (_Np <= 16)
+	  _mm512_mask_cvtepi32_storeu_epi8(__mem, __k,
+					   _mm512_maskz_set1_epi32(__v, 1));
+	else if constexpr (_Np <= 32 && __have_avx512bw_vl)
+	  _mm256_mask_storeu_epi8(__mem, __k, _mm256_maskz_set1_epi8(__v, 1));
+	else if constexpr (_Np <= 32 && __have_avx512bw)
+	  _mm256_mask_storeu_epi8(__mem, __k,
+				  __lo256(_mm512_maskz_set1_epi8(__v, 1)));
+	else if constexpr (_Np <= 64 && __have_avx512bw)
+	  _mm512_mask_storeu_epi8(__mem, __k, _mm512_maskz_set1_epi8(__v, 1));
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      _Base::__masked_store(__v, __mem, _Fp(), __k);
+  }
+
+  // logical and bitwise operators {{{2
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __logical_and(const _SimdWrapper<_Tp, _Np>& __x,
+		const _SimdWrapper<_Tp, _Np>& __y)
+  {
+    if constexpr (std::is_same_v<_Tp, bool>)
+      {
+	if constexpr (__have_avx512dq && _Np <= 8)
+	  return _kand_mask8(__x._M_data, __y._M_data);
+	else if constexpr (_Np <= 16)
+	  return _kand_mask16(__x._M_data, __y._M_data);
+	else if constexpr (__have_avx512bw && _Np <= 32)
+	  return _kand_mask32(__x._M_data, __y._M_data);
+	else if constexpr (__have_avx512bw && _Np <= 64)
+	  return _kand_mask64(__x._M_data, __y._M_data);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      return _Base::__logical_and(__x, __y);
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __logical_or(const _SimdWrapper<_Tp, _Np>& __x,
+	       const _SimdWrapper<_Tp, _Np>& __y)
+  {
+    if constexpr (std::is_same_v<_Tp, bool>)
+      {
+	if constexpr (__have_avx512dq && _Np <= 8)
+	  return _kor_mask8(__x._M_data, __y._M_data);
+	else if constexpr (_Np <= 16)
+	  return _kor_mask16(__x._M_data, __y._M_data);
+	else if constexpr (__have_avx512bw && _Np <= 32)
+	  return _kor_mask32(__x._M_data, __y._M_data);
+	else if constexpr (__have_avx512bw && _Np <= 64)
+	  return _kor_mask64(__x._M_data, __y._M_data);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      return _Base::__logical_or(__x, __y);
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __bit_not(const _SimdWrapper<_Tp, _Np>& __x)
+  {
+    if constexpr (std::is_same_v<_Tp, bool>)
+      {
+	if constexpr (__have_avx512dq && _Np <= 8)
+	  return _kandn_mask8(__x._M_data,
+			      _Abi::template __implicit_mask_n<_Np>());
+	else if constexpr (_Np <= 16)
+	  return _kandn_mask16(__x._M_data,
+			       _Abi::template __implicit_mask_n<_Np>());
+	else if constexpr (__have_avx512bw && _Np <= 32)
+	  return _kandn_mask32(__x._M_data,
+			       _Abi::template __implicit_mask_n<_Np>());
+	else if constexpr (__have_avx512bw && _Np <= 64)
+	  return _kandn_mask64(__x._M_data,
+			       _Abi::template __implicit_mask_n<_Np>());
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      return _Base::__bit_not(__x);
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __bit_and(const _SimdWrapper<_Tp, _Np>& __x,
+	    const _SimdWrapper<_Tp, _Np>& __y)
+  {
+    if constexpr (std::is_same_v<_Tp, bool>)
+      {
+	if constexpr (__have_avx512dq && _Np <= 8)
+	  return _kand_mask8(__x._M_data, __y._M_data);
+	else if constexpr (_Np <= 16)
+	  return _kand_mask16(__x._M_data, __y._M_data);
+	else if constexpr (__have_avx512bw && _Np <= 32)
+	  return _kand_mask32(__x._M_data, __y._M_data);
+	else if constexpr (__have_avx512bw && _Np <= 64)
+	  return _kand_mask64(__x._M_data, __y._M_data);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      return _Base::__bit_and(__x, __y);
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __bit_or(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
+  {
+    if constexpr (std::is_same_v<_Tp, bool>)
+      {
+	if constexpr (__have_avx512dq && _Np <= 8)
+	  return _kor_mask8(__x._M_data, __y._M_data);
+	else if constexpr (_Np <= 16)
+	  return _kor_mask16(__x._M_data, __y._M_data);
+	else if constexpr (__have_avx512bw && _Np <= 32)
+	  return _kor_mask32(__x._M_data, __y._M_data);
+	else if constexpr (__have_avx512bw && _Np <= 64)
+	  return _kor_mask64(__x._M_data, __y._M_data);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      return _Base::__bit_or(__x, __y);
+  }
+
+  template <typename _Tp, size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+  __bit_xor(const _SimdWrapper<_Tp, _Np>& __x,
+	    const _SimdWrapper<_Tp, _Np>& __y)
+  {
+    if constexpr (std::is_same_v<_Tp, bool>)
+      {
+	if constexpr (__have_avx512dq && _Np <= 8)
+	  return _kxor_mask8(__x._M_data, __y._M_data);
+	else if constexpr (_Np <= 16)
+	  return _kxor_mask16(__x._M_data, __y._M_data);
+	else if constexpr (__have_avx512bw && _Np <= 32)
+	  return _kxor_mask32(__x._M_data, __y._M_data);
+	else if constexpr (__have_avx512bw && _Np <= 64)
+	  return _kxor_mask64(__x._M_data, __y._M_data);
+	else
+	  __assert_unreachable<_Tp>();
+      }
+    else
+      return _Base::__bit_xor(__x, __y);
+  }
+
+  //}}}2
+  // __masked_assign{{{
+  template <size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_assign(_SimdWrapper<bool, _Np> __k, _SimdWrapper<bool, _Np>& __lhs,
+		  _SimdWrapper<bool, _Np> __rhs)
+  {
+    __lhs._M_data
+      = (~__k._M_data & __lhs._M_data) | (__k._M_data & __rhs._M_data);
+  }
+
+  template <size_t _Np>
+  _GLIBCXX_SIMD_INTRINSIC static void
+  __masked_assign(_SimdWrapper<bool, _Np> __k, _SimdWrapper<bool, _Np>& __lhs,
+		  bool __rhs)
+  {
+    if (__rhs)
+      __lhs._M_data = __k._M_data | __lhs._M_data;
+    else
+      __lhs._M_data = ~__k._M_data & __lhs._M_data;
+  }
+
+  using _MaskImplBuiltin<_Abi>::__masked_assign;
+
+  //}}}
+  // __all_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __all_of(simd_mask<_Tp, _Abi> __k)
+  {
+    if constexpr (__is_sse_abi<_Abi>() || __is_avx_abi<_Abi>())
+      {
+	constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+	if constexpr (__have_sse4_1)
+	  return 0
+		 != __testc(__as_vector(__k),
+			    _Abi::template __implicit_mask<_Tp>());
+	else if constexpr (std::is_same_v<_Tp, float>)
+	  return (_mm_movemask_ps(__to_intrin(__k._M_data)) & ((1 << _Np) - 1))
+		 == (1 << _Np) - 1;
+	else if constexpr (std::is_same_v<_Tp, double>)
+	  return (_mm_movemask_pd(__to_intrin(__k._M_data)) & ((1 << _Np) - 1))
+		 == (1 << _Np) - 1;
+	else
+	  return (_mm_movemask_epi8(__to_intrin(__k._M_data))
+		  & ((1 << (_Np * sizeof(_Tp))) - 1))
+		 == (1 << (_Np * sizeof(_Tp))) - 1;
+      }
+    else if constexpr (__is_avx512_abi<_Abi>())
+      {
+	constexpr auto _Mask = _Abi::template __implicit_mask<_Tp>();
+	const auto __kk = __k._M_data._M_data;
+	if constexpr (sizeof(__kk) == 1)
+	  {
+	    if constexpr (__have_avx512dq)
+	      return _kortestc_mask8_u8(__kk, _Mask == 0xff ? __kk
+							    : __mmask8(~_Mask));
+	    else
+	      return _kortestc_mask16_u8(__kk, __mmask16(~_Mask));
+	  }
+	else if constexpr (sizeof(__kk) == 2)
+	  return _kortestc_mask16_u8(__kk, _Mask == 0xffff ? __kk
+							   : __mmask16(~_Mask));
+	else if constexpr (sizeof(__kk) == 4 && __have_avx512bw)
+	  return _kortestc_mask32_u8(__kk, _Mask == 0xffffffffU
+					     ? __kk
+					     : __mmask32(~_Mask));
+	else if constexpr (sizeof(__kk) == 8 && __have_avx512bw)
+	  return _kortestc_mask64_u8(__kk, _Mask == 0xffffffffffffffffULL
+					     ? __kk
+					     : __mmask64(~_Mask));
+	else
+	  __assert_unreachable<_Tp>();
+      }
+  }
+
+  // }}}
+  // __any_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __any_of(simd_mask<_Tp, _Abi> __k)
+  {
+    if constexpr (__is_sse_abi<_Abi>() || __is_avx_abi<_Abi>())
+      {
+	constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+	if constexpr (__have_sse4_1)
+	  {
+	    if constexpr (_Abi::_S_is_partial || sizeof(__k) < 16)
+	      return 0
+		     == __testz(__as_vector(__k),
+				_Abi::template __implicit_mask<_Tp>());
+	    else
+	      return 0 == __testz(__as_vector(__k), __as_vector(__k));
+	  }
+	else if constexpr (std::is_same_v<_Tp, float>)
+	  return (_mm_movemask_ps(__to_intrin(__k._M_data)) & ((1 << _Np) - 1))
+		 != 0;
+	else if constexpr (std::is_same_v<_Tp, double>)
+	  return (_mm_movemask_pd(__to_intrin(__k._M_data)) & ((1 << _Np) - 1))
+		 != 0;
+	else
+	  return (_mm_movemask_epi8(__to_intrin(__k._M_data))
+		  & ((1 << (_Np * sizeof(_Tp))) - 1))
+		 != 0;
+      }
+    else if constexpr (__is_avx512_abi<_Abi>())
+      return (__k._M_data._M_data & _Abi::template __implicit_mask<_Tp>()) != 0;
+  }
+
+  // }}}
+  // __none_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __none_of(simd_mask<_Tp, _Abi> __k)
+  {
+    if constexpr (__is_sse_abi<_Abi>() || __is_avx_abi<_Abi>())
+      {
+	constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+	if constexpr (__have_sse4_1)
+	  {
+	    if constexpr (_Abi::_S_is_partial || sizeof(__k) < 16)
+	      return 0
+		     != __testz(__as_vector(__k),
+				_Abi::template __implicit_mask<_Tp>());
+	    else
+	      return 0 != __testz(__as_vector(__k), __as_vector(__k));
+	  }
+	else if constexpr (std::is_same_v<_Tp, float>)
+	  return (__movemask(__to_intrin(__k._M_data)) & ((1 << _Np) - 1)) == 0;
+	else if constexpr (std::is_same_v<_Tp, double>)
+	  return (__movemask(__to_intrin(__k._M_data)) & ((1 << _Np) - 1)) == 0;
+	else
+	  return (__movemask(__to_intrin(__k._M_data))
+		  & int((1ull << (_Np * sizeof(_Tp))) - 1))
+		 == 0;
+      }
+    else if constexpr (__is_avx512_abi<_Abi>())
+      return (__k._M_data._M_data & _Abi::template __implicit_mask<_Tp>()) == 0;
+  }
+
+  // }}}
+  // __some_of {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static bool __some_of(simd_mask<_Tp, _Abi> __k)
+  {
+    if constexpr (__is_sse_abi<_Abi>() || __is_avx_abi<_Abi>())
+      {
+	constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+	if constexpr (__have_sse4_1)
+	  return 0
+		 != __testnzc(__as_vector(__k),
+			      _Abi::template __implicit_mask<_Tp>());
+	else if constexpr (std::is_same_v<_Tp, float>)
+	  {
+	    constexpr int __allbits = (1 << _Np) - 1;
+	    const auto __tmp
+	      = _mm_movemask_ps(__to_intrin(__k._M_data)) & __allbits;
+	    return __tmp > 0 && __tmp < __allbits;
+	  }
+	else if constexpr (std::is_same_v<_Tp, double>)
+	  {
+	    constexpr int __allbits = (1 << _Np) - 1;
+	    const auto __tmp
+	      = _mm_movemask_pd(__to_intrin(__k._M_data)) & __allbits;
+	    return __tmp > 0 && __tmp < __allbits;
+	  }
+	else
+	  {
+	    constexpr int __allbits = (1 << (_Np * sizeof(_Tp))) - 1;
+	    const auto __tmp
+	      = _mm_movemask_epi8(__to_intrin(__k._M_data)) & __allbits;
+	    return __tmp > 0 && __tmp < __allbits;
+	  }
+      }
+    else if constexpr (__is_avx512_abi<_Abi>())
+      return __any_of(__k) && !__all_of(__k);
+    else
+      __assert_unreachable<_Tp>();
+  }
+
+  // }}}
+  // __popcount {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static int __popcount(simd_mask<_Tp, _Abi> __k)
+  {
+    constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+    const auto __kk = _Abi::__masked(__k._M_data)._M_data;
+    if constexpr (__is_avx512_abi<_Abi>())
+      {
+	if constexpr (_Np > 32)
+	  return __builtin_popcountll(__kk);
+	else
+	  return __builtin_popcount(__kk);
+      }
+    else
+      {
+	if constexpr (__have_popcnt)
+	  {
+	    int __bits = __movemask(__to_intrin(__vector_bitcast<_Tp>(__kk)));
+	    const int __count = __builtin_popcount(__bits);
+	    return std::is_integral_v<_Tp> ? __count / sizeof(_Tp) : __count;
+	  }
+	else if constexpr (_Np == 2 && sizeof(_Tp) == 8)
+	  {
+	    const int __mask = _mm_movemask_pd(__auto_bitcast(__kk));
+	    return __mask - (__mask >> 1);
+	  }
+	else if constexpr (_Np <= 4 && sizeof(_Tp) == 8)
+	  {
+	    auto __x = -(__lo128(__kk) + __hi128(__kk));
+	    return __x[0] + __x[1];
+	  }
+	else if constexpr (_Np <= 4 && sizeof(_Tp) == 4)
+	  {
+	    if constexpr (__have_sse2)
+	      {
+		__m128i __x = __intrin_bitcast<__m128i>(__to_intrin(__kk));
+		__x = _mm_add_epi32(__x,
+				    _mm_shuffle_epi32(__x,
+						      _MM_SHUFFLE(0, 1, 2, 3)));
+		__x = _mm_add_epi32(
+		  __x, _mm_shufflelo_epi16(__x, _MM_SHUFFLE(1, 0, 3, 2)));
+		return -_mm_cvtsi128_si32(__x);
+	      }
+	    else
+	      return __builtin_popcount(_mm_movemask_ps(__auto_bitcast(__kk)));
+	  }
+	else if constexpr (_Np <= 8 && sizeof(_Tp) == 2)
+	  {
+	    auto __x = __to_intrin(__kk);
+	    __x
+	      = _mm_add_epi16(__x,
+			      _mm_shuffle_epi32(__x, _MM_SHUFFLE(0, 1, 2, 3)));
+	    __x = _mm_add_epi16(__x,
+				_mm_shufflelo_epi16(__x,
+						    _MM_SHUFFLE(0, 1, 2, 3)));
+	    __x = _mm_add_epi16(__x,
+				_mm_shufflelo_epi16(__x,
+						    _MM_SHUFFLE(2, 3, 0, 1)));
+	    return -short(_mm_extract_epi16(__x, 0));
+	  }
+	else if constexpr (_Np <= 16 && sizeof(_Tp) == 1)
+	  {
+	    auto __x = __to_intrin(__kk);
+	    __x = _mm_add_epi8(__x,
+			       _mm_shuffle_epi32(__x, _MM_SHUFFLE(0, 1, 2, 3)));
+	    __x
+	      = _mm_add_epi8(__x,
+			     _mm_shufflelo_epi16(__x, _MM_SHUFFLE(0, 1, 2, 3)));
+	    __x
+	      = _mm_add_epi8(__x,
+			     _mm_shufflelo_epi16(__x, _MM_SHUFFLE(2, 3, 0, 1)));
+	    auto __y = -__vector_bitcast<_UChar>(__x);
+	    if constexpr (__have_sse4_1)
+	      return __y[0] + __y[1];
+	    else
+	      {
+		unsigned __z = _mm_extract_epi16(__to_intrin(__y), 0);
+		return (__z & 0xff) + (__z >> 8);
+	      }
+	  }
+	else if constexpr (sizeof(__kk) == 32)
+	  {
+	    // The following works only as long as the implementations above use
+	    // a summation
+	    using _I = __int_for_sizeof_t<_Tp>;
+	    const auto __as_int = __vector_bitcast<_I>(__kk);
+	    return _MaskImplX86<simd_abi::__sse>::__popcount(
+	      simd_mask<_I, simd_abi::__sse>(__private_init,
+					     __lo128(__as_int)
+					       + __hi128(__as_int)));
+	  }
+	else
+	  __assert_unreachable<_Tp>();
+      }
+  }
+
+  // }}}
+  // __find_first_set {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static int __find_first_set(simd_mask<_Tp, _Abi> __k)
+  {
+    if constexpr (__is_avx512_abi<_Abi>())
+      if constexpr (size<_Tp> <= 32)
+	return _tzcnt_u32(__k._M_data._M_data);
+      else
+	return _BitOps::__firstbit(__k._M_data._M_data);
+    else
+      return _Base::__find_first_set(__k);
+  }
+
+  // }}}
+  // __find_last_set {{{
+  template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC static int __find_last_set(simd_mask<_Tp, _Abi> __k)
+  {
+    if constexpr (__is_avx512_abi<_Abi>())
+      if constexpr (size<_Tp> <= 32)
+	return 31 - _lzcnt_u32(__k._M_data._M_data);
+      else
+	return _BitOps::__lastbit(__k._M_data._M_data);
+    else
+      return _Base::__find_last_set(__k);
+  }
+
+  // }}}
+};
+
+// }}}
+
+_GLIBCXX_SIMD_END_NAMESPACE
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_X86_H_
+
+// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
diff --git a/libstdc++-v3/include/experimental/bits/simd_x86_conversions.h b/libstdc++-v3/include/experimental/bits/simd_x86_conversions.h
new file mode 100644
index 00000000000..f72d7809680
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_x86_conversions.h
@@ -0,0 +1,1993 @@
+// x86 specific conversion optimizations -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_X86_CONVERSIONS_H
+#define _GLIBCXX_EXPERIMENTAL_SIMD_X86_CONVERSIONS_H
+
+#if __cplusplus >= 201703L
+
+// work around PR85827
+// 1-arg __convert_x86 {{{1
+template <typename _To, typename _V, typename _Traits>
+_GLIBCXX_SIMD_INTRINSIC _To
+__convert_x86(_V __v)
+{
+  static_assert(__is_vector_type_v<_V>);
+  using _Tp = typename _Traits::value_type;
+  constexpr size_t _Np = _Traits::_S_width;
+  [[maybe_unused]] const auto __intrin = __to_intrin(__v);
+  using _Up = typename _VectorTraits<_To>::value_type;
+  constexpr size_t _M = _VectorTraits<_To>::_S_width;
+
+  // [xyz]_to_[xyz] {{{2
+  [[maybe_unused]] constexpr bool __x_to_x
+    = sizeof(__v) <= 16 && sizeof(_To) <= 16;
+  [[maybe_unused]] constexpr bool __x_to_y
+    = sizeof(__v) <= 16 && sizeof(_To) == 32;
+  [[maybe_unused]] constexpr bool __x_to_z
+    = sizeof(__v) <= 16 && sizeof(_To) == 64;
+  [[maybe_unused]] constexpr bool __y_to_x
+    = sizeof(__v) == 32 && sizeof(_To) <= 16;
+  [[maybe_unused]] constexpr bool __y_to_y
+    = sizeof(__v) == 32 && sizeof(_To) == 32;
+  [[maybe_unused]] constexpr bool __y_to_z
+    = sizeof(__v) == 32 && sizeof(_To) == 64;
+  [[maybe_unused]] constexpr bool __z_to_x
+    = sizeof(__v) == 64 && sizeof(_To) <= 16;
+  [[maybe_unused]] constexpr bool __z_to_y
+    = sizeof(__v) == 64 && sizeof(_To) == 32;
+  [[maybe_unused]] constexpr bool __z_to_z
+    = sizeof(__v) == 64 && sizeof(_To) == 64;
+
+  // iX_to_iX {{{2
+  [[maybe_unused]] constexpr bool __i_to_i
+    = is_integral_v<_Up> && is_integral_v<_Tp>;
+  [[maybe_unused]] constexpr bool __i8_to_i16
+    = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 2;
+  [[maybe_unused]] constexpr bool __i8_to_i32
+    = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __i8_to_i64
+    = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __i16_to_i8
+    = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 1;
+  [[maybe_unused]] constexpr bool __i16_to_i32
+    = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __i16_to_i64
+    = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __i32_to_i8
+    = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 1;
+  [[maybe_unused]] constexpr bool __i32_to_i16
+    = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 2;
+  [[maybe_unused]] constexpr bool __i32_to_i64
+    = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __i64_to_i8
+    = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 1;
+  [[maybe_unused]] constexpr bool __i64_to_i16
+    = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 2;
+  [[maybe_unused]] constexpr bool __i64_to_i32
+    = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 4;
+
+  // [fsu]X_to_[fsu]X {{{2
+  // ibw = integral && byte or word, i.e. char and short with any signedness
+  [[maybe_unused]] constexpr bool __s64_to_f32
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 8
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __s32_to_f32
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __s16_to_f32
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 2
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __s8_to_f32
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 1
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __u64_to_f32
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 8
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __u32_to_f32
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __u16_to_f32
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 2
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __u8_to_f32
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 1
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __s64_to_f64
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 8
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __s32_to_f64
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __u64_to_f64
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 8
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __u32_to_f64
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __f32_to_s64
+    = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 8
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f32_to_s32
+    = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 4
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f32_to_u64
+    = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 8
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f32_to_u32
+    = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 4
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f64_to_s64
+    = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 8
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __f64_to_s32
+    = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 4
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __f64_to_u64
+    = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 8
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __f64_to_u32
+    = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 4
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __ibw_to_f32
+    = is_integral_v<_Tp> && sizeof(_Tp) <= 2
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __ibw_to_f64
+    = is_integral_v<_Tp> && sizeof(_Tp) <= 2
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __f32_to_ibw
+    = is_integral_v<_Up> && sizeof(_Up) <= 2
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f64_to_ibw
+    = is_integral_v<_Up> && sizeof(_Up) <= 2
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __f32_to_f64
+    = is_floating_point_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __f64_to_f32
+    = is_floating_point_v<_Tp> && sizeof(_Tp) == 8
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+
+  if constexpr (__i_to_i && __y_to_x && !__have_avx2) //{{{2
+    return __convert_x86<_To>(__lo128(__v), __hi128(__v));
+  else if constexpr (__i_to_i && __x_to_y && !__have_avx2) //{{{2
+    return __concat(__convert_x86<__vector_type_t<_Up, _M / 2>>(__v),
+		    __convert_x86<__vector_type_t<_Up, _M / 2>>(
+		      __extract_part<1, _Np / _M * 2>(__v)));
+  else if constexpr (__i_to_i) //{{{2
+    {
+      static_assert(__x_to_x || __have_avx2,
+		    "integral conversions with ymm registers require AVX2");
+      static_assert(__have_avx512bw
+		      || ((sizeof(_Tp) >= 4 || sizeof(__v) < 64)
+			  && (sizeof(_Up) >= 4 || sizeof(_To) < 64)),
+		    "8/16-bit integers in zmm registers require AVX512BW");
+      static_assert((sizeof(__v) < 64 && sizeof(_To) < 64) || __have_avx512f,
+		    "integral conversions with zmm registers require AVX512F");
+    }
+  if constexpr (is_floating_point_v<_Tp> == is_floating_point_v<_Up> && //{{{2
+		sizeof(_Tp) == sizeof(_Up))
+    {
+      // conversion uses simple bit reinterpretation (or no conversion at all)
+      if constexpr (_Np >= _M)
+	return __intrin_bitcast<_To>(__v);
+      else
+	return __zero_extend(__vector_bitcast<_Up>(__v));
+    }
+  else if constexpr (_Np < _M && sizeof(_To) > 16) // zero extend (e.g. xmm -> ymm){{{2
+    return __zero_extend(
+      __convert_x86<__vector_type_t<
+	_Up, (16 / sizeof(_Up) > _Np) ? 16 / sizeof(_Up) : _Np>>(__v));
+  else if constexpr (_Np > _M && sizeof(__v) > 16) // partial input (e.g. ymm -> xmm){{{2
+    return __convert_x86<_To>(__extract_part<0, _Np / _M>(__v));
+  else if constexpr (__i64_to_i32) //{{{2
+    {
+      if constexpr (__x_to_x && __have_avx512vl)
+	return __intrin_bitcast<_To>(_mm_cvtepi64_epi32(__intrin));
+      else if constexpr (__x_to_x)
+	return __auto_bitcast(
+	  _mm_shuffle_ps(__vector_bitcast<float>(__v), __m128(), 8));
+      else if constexpr (__y_to_x && __have_avx512vl)
+	return __intrin_bitcast<_To>(_mm256_cvtepi64_epi32(__intrin));
+      else if constexpr (__y_to_x && __have_avx512f)
+	return __intrin_bitcast<_To>(
+	  __lo128(_mm512_cvtepi64_epi32(__auto_bitcast(__v))));
+      else if constexpr (__y_to_x)
+	return __intrin_bitcast<_To>(
+	  __lo128(_mm256_permute4x64_epi64(_mm256_shuffle_epi32(__intrin, 8),
+					   0 + 4 * 2)));
+      else if constexpr (__z_to_y)
+	return __intrin_bitcast<_To>(_mm512_cvtepi64_epi32(__intrin));
+    }
+  else if constexpr (__i64_to_i16) //{{{2
+    {
+      if constexpr (__x_to_x && __have_avx512vl)
+	return __intrin_bitcast<_To>(_mm_cvtepi64_epi16(__intrin));
+      else if constexpr (__x_to_x && __have_avx512f)
+	return __intrin_bitcast<_To>(
+	  __lo128(_mm512_cvtepi64_epi16(__auto_bitcast(__v))));
+      else if constexpr (__x_to_x && __have_ssse3)
+	{
+	  return __intrin_bitcast<_To>(
+	    _mm_shuffle_epi8(__intrin,
+			     _mm_setr_epi8(0, 1, 8, 9, -0x80, -0x80, -0x80,
+					   -0x80, -0x80, -0x80, -0x80, -0x80,
+					   -0x80, -0x80, -0x80, -0x80)));
+	  // without SSSE3, __x_to_x uses the generic fallback below
+	}
+      else if constexpr (__y_to_x && __have_avx512vl)
+	return __intrin_bitcast<_To>(_mm256_cvtepi64_epi16(__intrin));
+      else if constexpr (__y_to_x && __have_avx512f)
+	return __intrin_bitcast<_To>(
+	  __lo128(_mm512_cvtepi64_epi16(__auto_bitcast(__v))));
+      else if constexpr (__y_to_x)
+	{
+	  const auto __a = _mm256_shuffle_epi8(
+	    __intrin,
+	    _mm256_setr_epi8(0, 1, 8, 9, -0x80, -0x80, -0x80, -0x80, -0x80,
+			     -0x80, -0x80, -0x80, -0x80, -0x80, -0x80, -0x80,
+			     -0x80, -0x80, -0x80, -0x80, 0, 1, 8, 9, -0x80,
+			     -0x80, -0x80, -0x80, -0x80, -0x80, -0x80, -0x80));
+	  return __intrin_bitcast<_To>(__lo128(__a) | __hi128(__a));
+	}
+      else if constexpr (__z_to_x)
+	return __intrin_bitcast<_To>(_mm512_cvtepi64_epi16(__intrin));
+    }
+  else if constexpr (__i64_to_i8) //{{{2
+    {
+      if constexpr (__x_to_x && __have_avx512vl)
+	return __intrin_bitcast<_To>(_mm_cvtepi64_epi8(__intrin));
+      else if constexpr (__x_to_x && __have_avx512f)
+	return __intrin_bitcast<_To>(
+	  __lo128(_mm512_cvtepi64_epi8(__zero_extend(__intrin))));
+      else if constexpr (__y_to_x && __have_avx512vl)
+	return __intrin_bitcast<_To>(_mm256_cvtepi64_epi8(__intrin));
+      else if constexpr (__y_to_x && __have_avx512f)
+	return __intrin_bitcast<_To>(
+	  _mm512_cvtepi64_epi8(__zero_extend(__intrin)));
+      else if constexpr (__z_to_x)
+	return __intrin_bitcast<_To>(_mm512_cvtepi64_epi8(__intrin));
+    }
+  else if constexpr (__i32_to_i64) //{{{2
+    {
+      if constexpr (__have_sse4_1 && __x_to_x)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm_cvtepi32_epi64(__intrin)
+				       : _mm_cvtepu32_epi64(__intrin));
+      else if constexpr (__x_to_x)
+	{
+	  return __intrin_bitcast<_To>(
+	    _mm_unpacklo_epi32(__intrin, is_signed_v<_Tp>
+					   ? _mm_srai_epi32(__intrin, 31)
+					   : __m128i()));
+	}
+      else if constexpr (__x_to_y)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm256_cvtepi32_epi64(__intrin)
+				       : _mm256_cvtepu32_epi64(__intrin));
+      else if constexpr (__y_to_z)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm512_cvtepi32_epi64(__intrin)
+				       : _mm512_cvtepu32_epi64(__intrin));
+    }
+  else if constexpr (__i32_to_i16) //{{{2
+    {
+      if constexpr (__x_to_x && __have_avx512vl)
+	return __intrin_bitcast<_To>(_mm_cvtepi32_epi16(__intrin));
+      else if constexpr (__x_to_x && __have_avx512f)
+	return __intrin_bitcast<_To>(
+	  __lo128(_mm512_cvtepi32_epi16(__auto_bitcast(__v))));
+      else if constexpr (__x_to_x && __have_ssse3)
+	return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+	  __intrin, _mm_setr_epi8(0, 1, 4, 5, 8, 9, 12, 13, -0x80, -0x80, -0x80,
+				  -0x80, -0x80, -0x80, -0x80, -0x80)));
+      else if constexpr (__x_to_x)
+	{
+	  auto __a = _mm_unpacklo_epi16(__intrin, __m128i()); // 0o.o 1o.o
+	  auto __b = _mm_unpackhi_epi16(__intrin, __m128i()); // 2o.o 3o.o
+	  auto __c = _mm_unpacklo_epi16(__a, __b);            // 02oo ..oo
+	  auto __d = _mm_unpackhi_epi16(__a, __b);            // 13oo ..oo
+	  return __intrin_bitcast<_To>(
+	    _mm_unpacklo_epi16(__c, __d)); // 0123 oooo
+	}
+      else if constexpr (__y_to_x && __have_avx512vl)
+	return __intrin_bitcast<_To>(_mm256_cvtepi32_epi16(__intrin));
+      else if constexpr (__y_to_x && __have_avx512f)
+	return __intrin_bitcast<_To>(
+	  __lo128(_mm512_cvtepi32_epi16(__auto_bitcast(__v))));
+      else if constexpr (__y_to_x)
+	{
+	  auto __a = _mm256_shuffle_epi8(
+	    __intrin,
+	    _mm256_setr_epi8(0, 1, 4, 5, 8, 9, 12, 13, -0x80, -0x80, -0x80,
+			     -0x80, -0x80, -0x80, -0x80, -0x80, 0, 1, 4, 5, 8,
+			     9, 12, 13, -0x80, -0x80, -0x80, -0x80, -0x80,
+			     -0x80, -0x80, -0x80));
+	  return __intrin_bitcast<_To>(__lo128(
+	    _mm256_permute4x64_epi64(__a,
+				     0xf8))); // __a[0] __a[2] | __a[3] __a[3]
+	}
+      else if constexpr (__z_to_y)
+	return __intrin_bitcast<_To>(_mm512_cvtepi32_epi16(__intrin));
+    }
+  else if constexpr (__i32_to_i8) //{{{2
+    {
+      if constexpr (__x_to_x && __have_avx512vl)
+	return __intrin_bitcast<_To>(_mm_cvtepi32_epi8(__intrin));
+      else if constexpr (__x_to_x && __have_avx512f)
+	return __intrin_bitcast<_To>(
+	  __lo128(_mm512_cvtepi32_epi8(__zero_extend(__intrin))));
+      else if constexpr (__x_to_x && __have_ssse3)
+	{
+	  return __intrin_bitcast<_To>(
+	    _mm_shuffle_epi8(__intrin,
+			     _mm_setr_epi8(0, 4, 8, 12, -0x80, -0x80, -0x80,
+					   -0x80, -0x80, -0x80, -0x80, -0x80,
+					   -0x80, -0x80, -0x80, -0x80)));
+	}
+      else if constexpr (__x_to_x)
+	{
+	  const auto __a
+	    = _mm_unpacklo_epi8(__intrin, __intrin); // 0... .... 1... ....
+	  const auto __b
+	    = _mm_unpackhi_epi8(__intrin, __intrin);    // 2... .... 3... ....
+	  const auto __c = _mm_unpacklo_epi8(__a, __b); // 02.. .... .... ....
+	  const auto __d = _mm_unpackhi_epi8(__a, __b); // 13.. .... .... ....
+	  const auto __e = _mm_unpacklo_epi8(__c, __d); // 0123 .... .... ....
+	  return __intrin_bitcast<_To>(__e & _mm_cvtsi32_si128(-1));
+	}
+      else if constexpr (__y_to_x && __have_avx512vl)
+	return __intrin_bitcast<_To>(_mm256_cvtepi32_epi8(__intrin));
+      else if constexpr (__y_to_x && __have_avx512f)
+	return __intrin_bitcast<_To>(
+	  _mm512_cvtepi32_epi8(__zero_extend(__intrin)));
+      else if constexpr (__z_to_x)
+	return __intrin_bitcast<_To>(_mm512_cvtepi32_epi8(__intrin));
+    }
+  else if constexpr (__i16_to_i64) //{{{2
+    {
+      if constexpr (__x_to_x && __have_sse4_1)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm_cvtepi16_epi64(__intrin)
+				       : _mm_cvtepu16_epi64(__intrin));
+      else if constexpr (__x_to_x && is_signed_v<_Tp>)
+	{
+	  auto __x = _mm_srai_epi16(__intrin, 15);
+	  auto __y = _mm_unpacklo_epi16(__intrin, __x);
+	  __x = _mm_unpacklo_epi16(__x, __x);
+	  return __intrin_bitcast<_To>(_mm_unpacklo_epi32(__y, __x));
+	}
+      else if constexpr (__x_to_x)
+	return __intrin_bitcast<_To>(
+	  _mm_unpacklo_epi32(_mm_unpacklo_epi16(__intrin, __m128i()),
+			     __m128i()));
+      else if constexpr (__x_to_y)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm256_cvtepi16_epi64(__intrin)
+				       : _mm256_cvtepu16_epi64(__intrin));
+      else if constexpr (__x_to_z)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm512_cvtepi16_epi64(__intrin)
+				       : _mm512_cvtepu16_epi64(__intrin));
+    }
+  else if constexpr (__i16_to_i32) //{{{2
+    {
+      if constexpr (__x_to_x && __have_sse4_1)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm_cvtepi16_epi32(__intrin)
+				       : _mm_cvtepu16_epi32(__intrin));
+      else if constexpr (__x_to_x && is_signed_v<_Tp>)
+	return __intrin_bitcast<_To>(
+	  _mm_srai_epi32(_mm_unpacklo_epi16(__intrin, __intrin), 16));
+      else if constexpr (__x_to_x && is_unsigned_v<_Tp>)
+	return __intrin_bitcast<_To>(_mm_unpacklo_epi16(__intrin, __m128i()));
+      else if constexpr (__x_to_y)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm256_cvtepi16_epi32(__intrin)
+				       : _mm256_cvtepu16_epi32(__intrin));
+      else if constexpr (__y_to_z)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm512_cvtepi16_epi32(__intrin)
+				       : _mm512_cvtepu16_epi32(__intrin));
+    }
+  else if constexpr (__i16_to_i8) //{{{2
+    {
+      if constexpr (__x_to_x && __have_avx512bw_vl)
+	return __intrin_bitcast<_To>(_mm_cvtepi16_epi8(__intrin));
+      else if constexpr (__x_to_x && __have_avx512bw)
+	return __intrin_bitcast<_To>(
+	  __lo128(_mm512_cvtepi16_epi8(__zero_extend(__intrin))));
+      else if constexpr (__x_to_x && __have_ssse3)
+	return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+	  __intrin, _mm_setr_epi8(0, 2, 4, 6, 8, 10, 12, 14, -0x80, -0x80,
+				  -0x80, -0x80, -0x80, -0x80, -0x80, -0x80)));
+      else if constexpr (__x_to_x)
+	{
+	  auto __a
+	    = _mm_unpacklo_epi8(__intrin, __intrin); // 00.. 11.. 22.. 33..
+	  auto __b
+	    = _mm_unpackhi_epi8(__intrin, __intrin); // 44.. 55.. 66.. 77..
+	  auto __c = _mm_unpacklo_epi8(__a, __b);    // 0404 .... 1515 ....
+	  auto __d = _mm_unpackhi_epi8(__a, __b);    // 2626 .... 3737 ....
+	  auto __e = _mm_unpacklo_epi8(__c, __d);    // 0246 0246 .... ....
+	  auto __f = _mm_unpackhi_epi8(__c, __d);    // 1357 1357 .... ....
+	  return __intrin_bitcast<_To>(_mm_unpacklo_epi8(__e, __f));
+	}
+      else if constexpr (__y_to_x && __have_avx512bw_vl)
+	return __intrin_bitcast<_To>(_mm256_cvtepi16_epi8(__intrin));
+      else if constexpr (__y_to_x && __have_avx512bw)
+	return __intrin_bitcast<_To>(
+	  __lo128(_mm512_cvtepi16_epi8(__zero_extend(__intrin))));
+      else if constexpr (__y_to_x)
+	{
+	  auto __a = _mm256_shuffle_epi8(
+	    __intrin,
+	    _mm256_setr_epi8(0, 2, 4, 6, 8, 10, 12, 14, -0x80, -0x80, -0x80,
+			     -0x80, -0x80, -0x80, -0x80, -0x80, -0x80, -0x80,
+			     -0x80, -0x80, -0x80, -0x80, -0x80, -0x80, 0, 2, 4,
+			     6, 8, 10, 12, 14));
+	  return __intrin_bitcast<_To>(__lo128(__a) | __hi128(__a));
+	}
+      else if constexpr (__z_to_y && __have_avx512bw)
+	return __intrin_bitcast<_To>(_mm512_cvtepi16_epi8(__intrin));
+      else if constexpr (__z_to_y)
+	__assert_unreachable<_Tp>();
+    }
+  else if constexpr (__i8_to_i64) //{{{2
+    {
+      if constexpr (__x_to_x && __have_sse4_1)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm_cvtepi8_epi64(__intrin)
+				       : _mm_cvtepu8_epi64(__intrin));
+      else if constexpr (__x_to_x && is_signed_v<_Tp>)
+	{
+	  if constexpr (__have_ssse3)
+	    {
+	      auto __dup = _mm_unpacklo_epi8(__intrin, __intrin);
+	      auto __epi16 = _mm_srai_epi16(__dup, 8);
+	      return __intrin_bitcast<_To>(
+		_mm_shuffle_epi8(__epi16,
+				 _mm_setr_epi8(0, 1, 1, 1, 1, 1, 1, 1, 2, 3, 3,
+					       3, 3, 3, 3, 3)));
+	    }
+	  else
+	    {
+	      auto __x = _mm_unpacklo_epi8(__intrin, __intrin);
+	      __x = _mm_unpacklo_epi16(__x, __x);
+	      return __intrin_bitcast<_To>(
+		_mm_unpacklo_epi32(_mm_srai_epi32(__x, 24),
+				   _mm_srai_epi32(__x, 31)));
+	    }
+	}
+      else if constexpr (__x_to_x)
+	{
+	  return __intrin_bitcast<_To>(_mm_unpacklo_epi32(
+	    _mm_unpacklo_epi16(_mm_unpacklo_epi8(__intrin, __m128i()),
+			       __m128i()),
+	    __m128i()));
+	}
+      else if constexpr (__x_to_y)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm256_cvtepi8_epi64(__intrin)
+				       : _mm256_cvtepu8_epi64(__intrin));
+      else if constexpr (__x_to_z)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm512_cvtepi8_epi64(__intrin)
+				       : _mm512_cvtepu8_epi64(__intrin));
+    }
+  else if constexpr (__i8_to_i32) //{{{2
+    {
+      if constexpr (__x_to_x && __have_sse4_1)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm_cvtepi8_epi32(__intrin)
+				       : _mm_cvtepu8_epi32(__intrin));
+      else if constexpr (__x_to_x && is_signed_v<_Tp>)
+	{
+	  const auto __x = _mm_unpacklo_epi8(__intrin, __intrin);
+	  return __intrin_bitcast<_To>(
+	    _mm_srai_epi32(_mm_unpacklo_epi16(__x, __x), 24));
+	}
+      else if constexpr (__x_to_x && is_unsigned_v<_Tp>)
+	return __intrin_bitcast<_To>(
+	  _mm_unpacklo_epi16(_mm_unpacklo_epi8(__intrin, __m128i()),
+			     __m128i()));
+      else if constexpr (__x_to_y)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm256_cvtepi8_epi32(__intrin)
+				       : _mm256_cvtepu8_epi32(__intrin));
+      else if constexpr (__x_to_z)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm512_cvtepi8_epi32(__intrin)
+				       : _mm512_cvtepu8_epi32(__intrin));
+    }
+  else if constexpr (__i8_to_i16) //{{{2
+    {
+      if constexpr (__x_to_x && __have_sse4_1)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm_cvtepi8_epi16(__intrin)
+				       : _mm_cvtepu8_epi16(__intrin));
+      else if constexpr (__x_to_x && is_signed_v<_Tp>)
+	return __intrin_bitcast<_To>(
+	  _mm_srai_epi16(_mm_unpacklo_epi8(__intrin, __intrin), 8));
+      else if constexpr (__x_to_x && is_unsigned_v<_Tp>)
+	return __intrin_bitcast<_To>(_mm_unpacklo_epi8(__intrin, __m128i()));
+      else if constexpr (__x_to_y)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm256_cvtepi8_epi16(__intrin)
+				       : _mm256_cvtepu8_epi16(__intrin));
+      else if constexpr (__y_to_z && __have_avx512bw)
+	return __intrin_bitcast<_To>(is_signed_v<_Tp>
+				       ? _mm512_cvtepi8_epi16(__intrin)
+				       : _mm512_cvtepu8_epi16(__intrin));
+      else if constexpr (__y_to_z)
+	__assert_unreachable<_Tp>();
+    }
+  else if constexpr (__f32_to_s64) //{{{2
+    {
+      if constexpr (__have_avx512dq_vl && __x_to_x)
+	return __intrin_bitcast<_To>(_mm_cvttps_epi64(__intrin));
+      else if constexpr (__have_avx512dq_vl && __x_to_y)
+	return __intrin_bitcast<_To>(_mm256_cvttps_epi64(__intrin));
+      else if constexpr (__have_avx512dq && __y_to_z)
+	return __intrin_bitcast<_To>(_mm512_cvttps_epi64(__intrin));
+      // else use scalar fallback
+    }
+  else if constexpr (__f32_to_u64) //{{{2
+    {
+      if constexpr (__have_avx512dq_vl && __x_to_x)
+	return __intrin_bitcast<_To>(_mm_cvttps_epu64(__intrin));
+      else if constexpr (__have_avx512dq_vl && __x_to_y)
+	return __intrin_bitcast<_To>(_mm256_cvttps_epu64(__intrin));
+      else if constexpr (__have_avx512dq && __y_to_z)
+	return __intrin_bitcast<_To>(_mm512_cvttps_epu64(__intrin));
+      // else use scalar fallback
+    }
+  else if constexpr (__f32_to_s32) //{{{2
+    {
+      if constexpr (__x_to_x || __y_to_y || __z_to_z)
+	{
+	  // go to fallback, it does the right thing
+	}
+      else
+	__assert_unreachable<_Tp>();
+    }
+  else if constexpr (__f32_to_u32) //{{{2
+    {
+      if constexpr (__have_avx512vl && __x_to_x)
+	return __auto_bitcast(_mm_cvttps_epu32(__intrin));
+      else if constexpr (__have_avx512f && __x_to_x)
+	return __auto_bitcast(
+	  __lo128(_mm512_cvttps_epu32(__auto_bitcast(__v))));
+      else if constexpr (__have_avx512vl && __y_to_y)
+	return __vector_bitcast<_Up>(_mm256_cvttps_epu32(__intrin));
+      else if constexpr (__have_avx512f && __y_to_y)
+	return __vector_bitcast<_Up>(
+	  __lo256(_mm512_cvttps_epu32(__auto_bitcast(__v))));
+      else if constexpr (__x_to_x || __y_to_y || __z_to_z)
+	{
+	  // go to fallback, it does the right thing. We can't use the
+	  // _mm_floor_ps - 0x8000'0000 trick for f32->u32 because it would
+	  // discard small input values (only 24 mantissa bits)
+	}
+      else
+	__assert_unreachable<_Tp>();
+    }
+  else if constexpr (__f32_to_ibw) //{{{2
+    return __convert_x86<_To>(__convert_x86<__vector_type_t<int, _Np>>(__v));
+  else if constexpr (__f64_to_s64) //{{{2
+    {
+      if constexpr (__have_avx512dq_vl && __x_to_x)
+	return __intrin_bitcast<_To>(_mm_cvttpd_epi64(__intrin));
+      else if constexpr (__have_avx512dq_vl && __y_to_y)
+	return __intrin_bitcast<_To>(_mm256_cvttpd_epi64(__intrin));
+      else if constexpr (__have_avx512dq && __z_to_z)
+	return __intrin_bitcast<_To>(_mm512_cvttpd_epi64(__intrin));
+      // else use scalar fallback
+    }
+  else if constexpr (__f64_to_u64) //{{{2
+    {
+      if constexpr (__have_avx512dq_vl && __x_to_x)
+	return __intrin_bitcast<_To>(_mm_cvttpd_epu64(__intrin));
+      else if constexpr (__have_avx512dq_vl && __y_to_y)
+	return __intrin_bitcast<_To>(_mm256_cvttpd_epu64(__intrin));
+      else if constexpr (__have_avx512dq && __z_to_z)
+	return __intrin_bitcast<_To>(_mm512_cvttpd_epu64(__intrin));
+      // else use scalar fallback
+    }
+  else if constexpr (__f64_to_s32) //{{{2
+    {
+      if constexpr (__x_to_x)
+	return __intrin_bitcast<_To>(_mm_cvttpd_epi32(__intrin));
+      else if constexpr (__y_to_x)
+	return __intrin_bitcast<_To>(_mm256_cvttpd_epi32(__intrin));
+      else if constexpr (__z_to_y)
+	return __intrin_bitcast<_To>(_mm512_cvttpd_epi32(__intrin));
+    }
+  else if constexpr (__f64_to_u32) //{{{2
+    {
+      if constexpr (__have_avx512vl && __x_to_x)
+	return __intrin_bitcast<_To>(_mm_cvttpd_epu32(__intrin));
+      else if constexpr (__have_sse4_1 && __x_to_x)
+	return __vector_bitcast<_Up, _M>(
+		 _mm_cvttpd_epi32(_mm_floor_pd(__intrin) - 0x8000'0000u))
+	       ^ 0x8000'0000u;
+      else if constexpr (__x_to_x)
+	{
+	  // use scalar fallback: it's only 2 values to convert, can't get much
+	  // better than scalar decomposition
+	}
+      else if constexpr (__have_avx512vl && __y_to_x)
+	return __intrin_bitcast<_To>(_mm256_cvttpd_epu32(__intrin));
+      else if constexpr (__y_to_x)
+	{
+	  return __intrin_bitcast<_To>(
+	    __vector_bitcast<_Up>(
+	      _mm256_cvttpd_epi32(_mm256_floor_pd(__intrin) - 0x8000'0000u))
+	    ^ 0x8000'0000u);
+	}
+      else if constexpr (__z_to_y)
+	return __intrin_bitcast<_To>(_mm512_cvttpd_epu32(__intrin));
+    }
+  else if constexpr (__f64_to_ibw) //{{{2
+    {
+      return __convert_x86<_To>(
+	__convert_x86<__vector_type_t<int, (_Np < 4 ? 4 : _Np)>>(__v));
+    }
+  else if constexpr (__s64_to_f32) //{{{2
+    {
+      if constexpr (__x_to_x && __have_avx512dq_vl)
+	return __intrin_bitcast<_To>(_mm_cvtepi64_ps(__intrin));
+      else if constexpr (__y_to_x && __have_avx512dq_vl)
+	return __intrin_bitcast<_To>(_mm256_cvtepi64_ps(__intrin));
+      else if constexpr (__z_to_y && __have_avx512dq)
+	return __intrin_bitcast<_To>(_mm512_cvtepi64_ps(__intrin));
+      else if constexpr (__z_to_y)
+	return __intrin_bitcast<_To>(
+	  _mm512_cvtpd_ps(__convert_x86<__vector_type_t<double, 8>>(__v)));
+    }
+  else if constexpr (__u64_to_f32) //{{{2
+    {
+      if constexpr (__x_to_x && __have_avx512dq_vl)
+	return __intrin_bitcast<_To>(_mm_cvtepu64_ps(__intrin));
+      else if constexpr (__y_to_x && __have_avx512dq_vl)
+	return __intrin_bitcast<_To>(_mm256_cvtepu64_ps(__intrin));
+      else if constexpr (__z_to_y && __have_avx512dq)
+	return __intrin_bitcast<_To>(_mm512_cvtepu64_ps(__intrin));
+      else if constexpr (__z_to_y)
+	{
+	  return __intrin_bitcast<_To>(
+	    __lo256(_mm512_cvtepu32_ps(__auto_bitcast(
+	      _mm512_cvtepi64_epi32(_mm512_srai_epi64(__intrin, 32)))))
+	      * 0x100000000LL
+	    + __lo256(_mm512_cvtepu32_ps(
+	      __auto_bitcast(_mm512_cvtepi64_epi32(__intrin)))));
+	}
+    }
+  else if constexpr (__s32_to_f32) //{{{2
+    {
+      // use fallback (builtin conversion)
+    }
+  else if constexpr (__u32_to_f32) //{{{2
+    {
+      if constexpr (__x_to_x && __have_avx512vl)
+	{
+	  // use fallback
+	}
+      else if constexpr (__x_to_x && __have_avx512f)
+	return __intrin_bitcast<_To>(
+	  __lo128(_mm512_cvtepu32_ps(__auto_bitcast(__v))));
+      else if constexpr (__x_to_x && (__have_fma || __have_fma4))
+	// work around PR85819
+	return __auto_bitcast(0x10000 * _mm_cvtepi32_ps(__to_intrin(__v >> 16))
+			      + _mm_cvtepi32_ps(__to_intrin(__v & 0xffff)));
+      else if constexpr (__y_to_y && __have_avx512vl)
+	{
+	  // use fallback
+	}
+      else if constexpr (__y_to_y && __have_avx512f)
+	return __intrin_bitcast<_To>(
+	  __lo256(_mm512_cvtepu32_ps(__auto_bitcast(__v))));
+      else if constexpr (__y_to_y)
+	// work around PR85819
+	return 0x10000 * _mm256_cvtepi32_ps(__to_intrin(__v >> 16))
+	       + _mm256_cvtepi32_ps(__to_intrin(__v & 0xffff));
+      // else use fallback (builtin conversion)
+    }
+  else if constexpr (__ibw_to_f32) //{{{2
+    {
+      if constexpr (_M <= 4 || __have_avx2)
+	return __convert_x86<_To>(__convert_x86<__vector_type_t<int, _M>>(__v));
+      else
+	{
+	  static_assert(__x_to_y);
+	  __m128i __a, __b;
+	  if constexpr (__have_sse4_1)
+	    {
+	      __a = sizeof(_Tp) == 2
+		      ? (is_signed_v<_Tp> ? _mm_cvtepi16_epi32(__intrin)
+					  : _mm_cvtepu16_epi32(__intrin))
+		      : (is_signed_v<_Tp> ? _mm_cvtepi8_epi32(__intrin)
+					  : _mm_cvtepu8_epi32(__intrin));
+	      const auto __w
+		= _mm_shuffle_epi32(__intrin, sizeof(_Tp) == 2 ? 0xee : 0xe9);
+	      __b = sizeof(_Tp) == 2
+		      ? (is_signed_v<_Tp> ? _mm_cvtepi16_epi32(__w)
+					  : _mm_cvtepu16_epi32(__w))
+		      : (is_signed_v<_Tp> ? _mm_cvtepi8_epi32(__w)
+					  : _mm_cvtepu8_epi32(__w));
+	    }
+	  else
+	    {
+	      __m128i __tmp;
+	      if constexpr (sizeof(_Tp) == 1)
+		{
+		  __tmp
+		    = is_signed_v<_Tp>
+			? _mm_srai_epi16(_mm_unpacklo_epi8(__intrin, __intrin),
+					 8)
+			: _mm_unpacklo_epi8(__intrin, __m128i());
+		}
+	      else
+		{
+		  static_assert(sizeof(_Tp) == 2);
+		  __tmp = __intrin;
+		}
+	      __a = is_signed_v<_Tp>
+		      ? _mm_srai_epi32(_mm_unpacklo_epi16(__tmp, __tmp), 16)
+		      : _mm_unpacklo_epi16(__tmp, __m128i());
+	      __b = is_signed_v<_Tp>
+		      ? _mm_srai_epi32(_mm_unpackhi_epi16(__tmp, __tmp), 16)
+		      : _mm_unpackhi_epi16(__tmp, __m128i());
+	    }
+	  return __convert_x86<_To>(__vector_bitcast<int>(__a),
+				    __vector_bitcast<int>(__b));
+	}
+    }
+  else if constexpr (__s64_to_f64) //{{{2
+    {
+      if constexpr (__x_to_x && __have_avx512dq_vl)
+	return __intrin_bitcast<_To>(_mm_cvtepi64_pd(__intrin));
+      else if constexpr (__y_to_y && __have_avx512dq_vl)
+	return __intrin_bitcast<_To>(_mm256_cvtepi64_pd(__intrin));
+      else if constexpr (__z_to_z && __have_avx512dq)
+	return __intrin_bitcast<_To>(_mm512_cvtepi64_pd(__intrin));
+      else if constexpr (__z_to_z)
+	{
+	  return __intrin_bitcast<_To>(
+	    _mm512_cvtepi32_pd(_mm512_cvtepi64_epi32(__to_intrin(__v >> 32)))
+	      * 0x100000000LL
+	    + _mm512_cvtepu32_pd(_mm512_cvtepi64_epi32(__intrin)));
+	}
+    }
+  else if constexpr (__u64_to_f64) //{{{2
+    {
+      if constexpr (__x_to_x && __have_avx512dq_vl)
+	return __intrin_bitcast<_To>(_mm_cvtepu64_pd(__intrin));
+      else if constexpr (__y_to_y && __have_avx512dq_vl)
+	return __intrin_bitcast<_To>(_mm256_cvtepu64_pd(__intrin));
+      else if constexpr (__z_to_z && __have_avx512dq)
+	return __intrin_bitcast<_To>(_mm512_cvtepu64_pd(__intrin));
+      else if constexpr (__z_to_z)
+	{
+	  return __intrin_bitcast<_To>(
+	    _mm512_cvtepu32_pd(_mm512_cvtepi64_epi32(__to_intrin(__v >> 32)))
+	      * 0x100000000LL
+	    + _mm512_cvtepu32_pd(_mm512_cvtepi64_epi32(__intrin)));
+	}
+    }
+  else if constexpr (__s32_to_f64) //{{{2
+    {
+      if constexpr (__x_to_x)
+	return __intrin_bitcast<_To>(_mm_cvtepi32_pd(__intrin));
+      else if constexpr (__x_to_y)
+	return __intrin_bitcast<_To>(_mm256_cvtepi32_pd(__intrin));
+      else if constexpr (__y_to_z)
+	return __intrin_bitcast<_To>(_mm512_cvtepi32_pd(__intrin));
+    }
+  else if constexpr (__u32_to_f64) //{{{2
+    {
+      if constexpr (__x_to_x && __have_avx512vl)
+	return __intrin_bitcast<_To>(_mm_cvtepu32_pd(__intrin));
+      else if constexpr (__x_to_x && __have_avx512f)
+	return __intrin_bitcast<_To>(
+	  __lo128(_mm512_cvtepu32_pd(__auto_bitcast(__v))));
+      else if constexpr (__x_to_x)
+	return __intrin_bitcast<_To>(
+	  _mm_cvtepi32_pd(__to_intrin(__v ^ 0x8000'0000u)) + 0x8000'0000u);
+      else if constexpr (__x_to_y && __have_avx512vl)
+	return __intrin_bitcast<_To>(_mm256_cvtepu32_pd(__intrin));
+      else if constexpr (__x_to_y && __have_avx512f)
+	return __intrin_bitcast<_To>(
+	  __lo256(_mm512_cvtepu32_pd(__auto_bitcast(__v))));
+      else if constexpr (__x_to_y)
+	return __intrin_bitcast<_To>(
+	  _mm256_cvtepi32_pd(__to_intrin(__v ^ 0x8000'0000u)) + 0x8000'0000u);
+      else if constexpr (__y_to_z)
+	return __intrin_bitcast<_To>(_mm512_cvtepu32_pd(__intrin));
+    }
+  else if constexpr (__ibw_to_f64) //{{{2
+    {
+      return __convert_x86<_To>(
+	__convert_x86<__vector_type_t<int, std::max(size_t(4), _M)>>(__v));
+    }
+  else if constexpr (__f32_to_f64) //{{{2
+    {
+      if constexpr (__x_to_x)
+	return __intrin_bitcast<_To>(_mm_cvtps_pd(__intrin));
+      else if constexpr (__x_to_y)
+	return __intrin_bitcast<_To>(_mm256_cvtps_pd(__intrin));
+      else if constexpr (__y_to_z)
+	return __intrin_bitcast<_To>(_mm512_cvtps_pd(__intrin));
+    }
+  else if constexpr (__f64_to_f32) //{{{2
+    {
+      if constexpr (__x_to_x)
+	return __intrin_bitcast<_To>(_mm_cvtpd_ps(__intrin));
+      else if constexpr (__y_to_x)
+	return __intrin_bitcast<_To>(_mm256_cvtpd_ps(__intrin));
+      else if constexpr (__z_to_y)
+	return __intrin_bitcast<_To>(_mm512_cvtpd_ps(__intrin));
+    }
+  else //{{{2
+    __assert_unreachable<_Tp>();
+
+  // fallback:{{{2
+  return __vector_convert<_To>(__v, make_index_sequence<std::min(_M, _Np)>());
+  //}}}
+} // }}}
+// 2-arg __convert_x86 {{{1
+template <typename _To, typename _V, typename _Traits>
+_GLIBCXX_SIMD_INTRINSIC _To
+__convert_x86(_V __v0, _V __v1)
+{
+  static_assert(__is_vector_type_v<_V>);
+  using _Tp = typename _Traits::value_type;
+  constexpr size_t _Np = _Traits::_S_width;
+  [[maybe_unused]] const auto __i0 = __to_intrin(__v0);
+  [[maybe_unused]] const auto __i1 = __to_intrin(__v1);
+  using _Up = typename _VectorTraits<_To>::value_type;
+  constexpr size_t _M = _VectorTraits<_To>::_S_width;
+
+  static_assert(2 * _Np <= _M, "__v1 would be discarded; use the one-argument "
+			       "__convert_x86 overload instead");
+
+  // [xyz]_to_[xyz] {{{2
+  [[maybe_unused]] constexpr bool __x_to_x
+    = sizeof(__v0) <= 16 && sizeof(_To) <= 16;
+  [[maybe_unused]] constexpr bool __x_to_y
+    = sizeof(__v0) <= 16 && sizeof(_To) == 32;
+  [[maybe_unused]] constexpr bool __x_to_z
+    = sizeof(__v0) <= 16 && sizeof(_To) == 64;
+  [[maybe_unused]] constexpr bool __y_to_x
+    = sizeof(__v0) == 32 && sizeof(_To) <= 16;
+  [[maybe_unused]] constexpr bool __y_to_y
+    = sizeof(__v0) == 32 && sizeof(_To) == 32;
+  [[maybe_unused]] constexpr bool __y_to_z
+    = sizeof(__v0) == 32 && sizeof(_To) == 64;
+  [[maybe_unused]] constexpr bool __z_to_x
+    = sizeof(__v0) == 64 && sizeof(_To) <= 16;
+  [[maybe_unused]] constexpr bool __z_to_y
+    = sizeof(__v0) == 64 && sizeof(_To) == 32;
+  [[maybe_unused]] constexpr bool __z_to_z
+    = sizeof(__v0) == 64 && sizeof(_To) == 64;
+
+  // iX_to_iX {{{2
+  [[maybe_unused]] constexpr bool __i_to_i
+    = std::is_integral_v<_Up> && std::is_integral_v<_Tp>;
+  [[maybe_unused]] constexpr bool __i8_to_i16
+    = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 2;
+  [[maybe_unused]] constexpr bool __i8_to_i32
+    = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __i8_to_i64
+    = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __i16_to_i8
+    = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 1;
+  [[maybe_unused]] constexpr bool __i16_to_i32
+    = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __i16_to_i64
+    = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __i32_to_i8
+    = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 1;
+  [[maybe_unused]] constexpr bool __i32_to_i16
+    = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 2;
+  [[maybe_unused]] constexpr bool __i32_to_i64
+    = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __i64_to_i8
+    = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 1;
+  [[maybe_unused]] constexpr bool __i64_to_i16
+    = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 2;
+  [[maybe_unused]] constexpr bool __i64_to_i32
+    = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 4;
+
+  // [fsu]X_to_[fsu]X {{{2
+  // ibw = integral && byte or word, i.e. char and short with any signedness
+  [[maybe_unused]] constexpr bool __i64_to_f32
+    = is_integral_v<_Tp> && sizeof(_Tp) == 8
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __s32_to_f32
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __s16_to_f32
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 2
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __s8_to_f32
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 1
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __u32_to_f32
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __u16_to_f32
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 2
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __u8_to_f32
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 1
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __s64_to_f64
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 8
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __s32_to_f64
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __s16_to_f64
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 2
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __s8_to_f64
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 1
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __u64_to_f64
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 8
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __u32_to_f64
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __u16_to_f64
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 2
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __u8_to_f64
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 1
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __f32_to_s64
+    = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 8
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f32_to_s32
+    = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 4
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f32_to_u64
+    = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 8
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f32_to_u32
+    = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 4
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f64_to_s64
+    = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 8
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __f64_to_s32
+    = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 4
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __f64_to_u64
+    = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 8
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __f64_to_u32
+    = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 4
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __f32_to_ibw
+    = is_integral_v<_Up> && sizeof(_Up) <= 2
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f64_to_ibw
+    = is_integral_v<_Up> && sizeof(_Up) <= 2
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __f32_to_f64
+    = is_floating_point_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __f64_to_f32
+    = is_floating_point_v<_Tp> && sizeof(_Tp) == 8
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+
+  if constexpr (__i_to_i && __y_to_x && !__have_avx2)
+    { //{{{2
+      // <long long, 4>, <long long, 4> => <short, 8>
+      return __convert_x86<_To>(__lo128(__v0), __hi128(__v0), __lo128(__v1),
+				__hi128(__v1));
+    }
+  else if constexpr (__i_to_i)
+    { // assert ISA {{{2
+      static_assert(__x_to_x || __have_avx2,
+		    "integral conversions with ymm registers require AVX2");
+      static_assert(__have_avx512bw
+		      || ((sizeof(_Tp) >= 4 || sizeof(__v0) < 64)
+			  && (sizeof(_Up) >= 4 || sizeof(_To) < 64)),
+		    "8/16-bit integers in zmm registers require AVX512BW");
+      static_assert((sizeof(__v0) < 64 && sizeof(_To) < 64) || __have_avx512f,
+		    "integral conversions with zmm registers require AVX512F");
+    }
+  // concat => use 1-arg __convert_x86 {{{2
+  if constexpr ((sizeof(__v0) == 16 && __have_avx2)
+		|| (sizeof(__v0) == 16 && __have_avx
+		    && std::is_floating_point_v<_Tp>)
+		|| (sizeof(__v0) == 32 && __have_avx512f
+		    && (sizeof(_Tp) >= 4 || __have_avx512bw)))
+    {
+      // The ISA can handle wider input registers, so concat and use one-arg
+      // implementation. This reduces code duplication considerably.
+      return __convert_x86<_To>(__concat(__v0, __v1));
+    }
+  else
+    { //{{{2
+      // conversion using bit reinterpretation (or no conversion at all) should
+      // all go through the concat branch above:
+      static_assert(!(
+	std::is_floating_point_v<
+	  _Tp> == std::is_floating_point_v<_Up> && sizeof(_Tp) == sizeof(_Up)));
+      if constexpr (2 * _Np < _M && sizeof(_To) > 16)
+	{ // handle all zero extension{{{2
+	  constexpr size_t Min = 16 / sizeof(_Up);
+	  return __zero_extend(
+	    __convert_x86<
+	      __vector_type_t<_Up, (Min > 2 * _Np) ? Min : 2 * _Np>>(__v0,
+								     __v1));
+	}
+      else if constexpr (__i64_to_i32)
+	{ //{{{2
+	  if constexpr (__x_to_x)
+	    return __auto_bitcast(
+	      _mm_shuffle_ps(__auto_bitcast(__v0), __auto_bitcast(__v1), 0x88));
+	  else if constexpr (__y_to_y)
+	    {
+	      // AVX512F is not available (would concat otherwise)
+	      return __auto_bitcast(
+		__xzyw(_mm256_shuffle_ps(__auto_bitcast(__v0),
+					 __auto_bitcast(__v1), 0x88)));
+	      // alternative:
+	      // const auto v0_abxxcdxx = _mm256_shuffle_epi32(__v0, 8);
+	      // const auto v1_efxxghxx = _mm256_shuffle_epi32(__v1, 8);
+	      // const auto v_abefcdgh = _mm256_unpacklo_epi64(v0_abxxcdxx, v1_efxxghxx);
+	      // return _mm256_permute4x64_epi64(v_abefcdgh,
+	      //   0x01 * 0 + 0x04 * 2 + 0x10 * 1 + 0x40 * 3);  // abcdefgh
+	    }
+	  else if constexpr (__z_to_z)
+	    return __intrin_bitcast<_To>(__concat(_mm512_cvtepi64_epi32(__i0),
+						  _mm512_cvtepi64_epi32(__i1)));
+	}
+      else if constexpr (__i64_to_i16)
+	{ //{{{2
+	  if constexpr (__x_to_x)
+	    {
+	      // AVX2 is not available (would concat otherwise)
+	      if constexpr (__have_sse4_1)
+		{
+		  return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+		    _mm_blend_epi16(__i0, _mm_slli_si128(__i1, 4), 0x44),
+		    _mm_setr_epi8(0, 1, 8, 9, 4, 5, 12, 13, -0x80, -0x80, -0x80,
+				  -0x80, -0x80, -0x80, -0x80, -0x80)));
+		}
+	      else
+		{
+		  return __vector_type_t<_Up, _M>{_Up(__v0[0]), _Up(__v0[1]),
+						  _Up(__v1[0]), _Up(__v1[1])};
+		}
+	    }
+	  else if constexpr (__y_to_x)
+	    {
+	      auto __a
+		= _mm256_unpacklo_epi16(__i0, __i1); // 04.. .... 26.. ....
+	      auto __b
+		= _mm256_unpackhi_epi16(__i0, __i1);      // 15.. .... 37.. ....
+	      auto __c = _mm256_unpacklo_epi16(__a, __b); // 0145 .... 2367 ....
+	      return __intrin_bitcast<_To>(
+		_mm_unpacklo_epi32(__lo128(__c), __hi128(__c))); // 0123 4567
+	    }
+	  else if constexpr (__z_to_y)
+	    return __intrin_bitcast<_To>(__concat(_mm512_cvtepi64_epi16(__i0),
+						  _mm512_cvtepi64_epi16(__i1)));
+	}
+      else if constexpr (__i64_to_i8)
+	{ //{{{2
+	  if constexpr (__x_to_x && __have_sse4_1)
+	    {
+	      return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+		_mm_blend_epi16(__i0, _mm_slli_si128(__i1, 4), 0x44),
+		_mm_setr_epi8(0, 8, 4, 12, -0x80, -0x80, -0x80, -0x80, -0x80,
+			      -0x80, -0x80, -0x80, -0x80, -0x80, -0x80,
+			      -0x80)));
+	    }
+	  else if constexpr (__x_to_x && __have_ssse3)
+	    {
+	      return __intrin_bitcast<_To>(_mm_unpacklo_epi16(
+		_mm_shuffle_epi8(__i0, _mm_setr_epi8(0, 8, -0x80, -0x80, -0x80,
+						     -0x80, -0x80, -0x80, -0x80,
+						     -0x80, -0x80, -0x80, -0x80,
+						     -0x80, -0x80, -0x80)),
+		_mm_shuffle_epi8(__i1, _mm_setr_epi8(0, 8, -0x80, -0x80, -0x80,
+						     -0x80, -0x80, -0x80, -0x80,
+						     -0x80, -0x80, -0x80, -0x80,
+						     -0x80, -0x80, -0x80))));
+	    }
+	  else if constexpr (__x_to_x)
+	    {
+	      return __vector_type_t<_Up, _M>{_Up(__v0[0]), _Up(__v0[1]),
+					      _Up(__v1[0]), _Up(__v1[1])};
+	    }
+	  else if constexpr (__y_to_x)
+	    {
+	      const auto __a = _mm256_shuffle_epi8(
+		_mm256_blend_epi32(__i0, _mm256_slli_epi64(__i1, 32), 0xAA),
+		_mm256_setr_epi8(0, 8, -0x80, -0x80, 4, 12, -0x80, -0x80, -0x80,
+				 -0x80, -0x80, -0x80, -0x80, -0x80, -0x80,
+				 -0x80, -0x80, -0x80, 0, 8, -0x80, -0x80, 4, 12,
+				 -0x80, -0x80, -0x80, -0x80, -0x80, -0x80,
+				 -0x80, -0x80));
+	      return __intrin_bitcast<_To>(__lo128(__a) | __hi128(__a));
+	    } // __z_to_x uses concat fallback
+	}
+      else if constexpr (__i32_to_i16)
+	{ //{{{2
+	  if constexpr (__x_to_x)
+	    {
+	      // AVX2 is not available (would concat otherwise)
+	      if constexpr (__have_sse4_1)
+		{
+		  return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+		    _mm_blend_epi16(__i0, _mm_slli_si128(__i1, 2), 0xaa),
+		    _mm_setr_epi8(0, 1, 4, 5, 8, 9, 12, 13, 2, 3, 6, 7, 10, 11,
+				  14, 15)));
+		}
+	      else if constexpr (__have_ssse3)
+		{
+		  return __intrin_bitcast<_To>(
+		    _mm_hadd_epi16(__to_intrin(__v0 << 16),
+				   __to_intrin(__v1 << 16)));
+		  /*
+		  return _mm_unpacklo_epi64(
+		    _mm_shuffle_epi8(__i0, _mm_setr_epi8(0, 1, 4, 5, 8, 9, 12,
+		      13, 8, 9, 12, 13, 12, 13, 14, 15)),
+		    _mm_shuffle_epi8(__i1, _mm_setr_epi8(0, 1, 4, 5, 8, 9, 12,
+		      13, 8, 9, 12, 13, 12, 13, 14, 15)));
+		  */
+		}
+	      else
+		{
+		  auto __a = _mm_unpacklo_epi16(__i0, __i1); // 04.. 15..
+		  auto __b = _mm_unpackhi_epi16(__i0, __i1); // 26.. 37..
+		  auto __c = _mm_unpacklo_epi16(__a, __b);   // 0246 ....
+		  auto __d = _mm_unpackhi_epi16(__a, __b);   // 1357 ....
+		  return __intrin_bitcast<_To>(
+		    _mm_unpacklo_epi16(__c, __d)); // 0123 4567
+		}
+	    }
+	  else if constexpr (__y_to_y)
+	    {
+	      const auto __shuf
+		= _mm256_setr_epi8(0, 1, 4, 5, 8, 9, 12, 13, -0x80, -0x80,
+				   -0x80, -0x80, -0x80, -0x80, -0x80, -0x80, 0,
+				   1, 4, 5, 8, 9, 12, 13, -0x80, -0x80, -0x80,
+				   -0x80, -0x80, -0x80, -0x80, -0x80);
+	      auto __a = _mm256_shuffle_epi8(__i0, __shuf);
+	      auto __b = _mm256_shuffle_epi8(__i1, __shuf);
+	      return __intrin_bitcast<_To>(
+		__xzyw(_mm256_unpacklo_epi64(__a, __b)));
+	    } // __z_to_z uses concat fallback
+	}
+      else if constexpr (__i32_to_i8)
+	{ //{{{2
+	  if constexpr (__x_to_x && __have_ssse3)
+	    {
+	      const auto __shufmask
+		= _mm_setr_epi8(0, 4, 8, 12, -0x80, -0x80, -0x80, -0x80, -0x80,
+				-0x80, -0x80, -0x80, -0x80, -0x80, -0x80,
+				-0x80);
+	      return __intrin_bitcast<_To>(
+		_mm_unpacklo_epi32(_mm_shuffle_epi8(__i0, __shufmask),
+				   _mm_shuffle_epi8(__i1, __shufmask)));
+	    }
+	  else if constexpr (__x_to_x)
+	    {
+	      auto __a = _mm_unpacklo_epi8(__i0, __i1); // 04.. .... 15.. ....
+	      auto __b = _mm_unpackhi_epi8(__i0, __i1); // 26.. .... 37.. ....
+	      auto __c = _mm_unpacklo_epi8(__a, __b);   // 0246 .... .... ....
+	      auto __d = _mm_unpackhi_epi8(__a, __b);   // 1357 .... .... ....
+	      auto __e = _mm_unpacklo_epi8(__c, __d);   // 0123 4567 .... ....
+	      return __intrin_bitcast<_To>(__e & __m128i{-1, 0});
+	    }
+	  else if constexpr (__y_to_x)
+	    {
+	      const auto __a = _mm256_shuffle_epi8(
+		_mm256_blend_epi16(__i0, _mm256_slli_epi32(__i1, 16), 0xAA),
+		_mm256_setr_epi8(0, 4, 8, 12, -0x80, -0x80, -0x80, -0x80, 2, 6,
+				 10, 14, -0x80, -0x80, -0x80, -0x80, -0x80,
+				 -0x80, -0x80, -0x80, 0, 4, 8, 12, -0x80, -0x80,
+				 -0x80, -0x80, 2, 6, 10, 14));
+	      return __intrin_bitcast<_To>(__lo128(__a) | __hi128(__a));
+	    } // __z_to_y uses concat fallback
+	}
+      else if constexpr (__i16_to_i8)
+	{ //{{{2
+	  if constexpr (__x_to_x && __have_ssse3)
+	    {
+	      const auto __shuf = reinterpret_cast<__m128i>(
+		__vector_type_t<_UChar, 16>{0, 2, 4, 6, 8, 10, 12, 14, 0x80,
+					    0x80, 0x80, 0x80, 0x80, 0x80, 0x80,
+					    0x80});
+	      return __intrin_bitcast<_To>(
+		_mm_unpacklo_epi64(_mm_shuffle_epi8(__i0, __shuf),
+				   _mm_shuffle_epi8(__i1, __shuf)));
+	    }
+	  else if constexpr (__x_to_x)
+	    {
+	      auto __a = _mm_unpacklo_epi8(__i0, __i1); // 08.. 19.. 2A.. 3B..
+	      auto __b = _mm_unpackhi_epi8(__i0, __i1); // 4C.. 5D.. 6E.. 7F..
+	      auto __c = _mm_unpacklo_epi8(__a, __b);   // 048C .... 159D ....
+	      auto __d = _mm_unpackhi_epi8(__a, __b);   // 26AE .... 37BF ....
+	      auto __e = _mm_unpacklo_epi8(__c, __d);   // 0246 8ACE .... ....
+	      auto __f = _mm_unpackhi_epi8(__c, __d);   // 1357 9BDF .... ....
+	      return __intrin_bitcast<_To>(_mm_unpacklo_epi8(__e, __f));
+	    }
+	  else if constexpr (__y_to_y)
+	    {
+	      return __intrin_bitcast<_To>(__xzyw(_mm256_shuffle_epi8(
+		(__to_intrin(__v0) & _mm256_set1_epi32(0x00ff00ff))
+		  | _mm256_slli_epi16(__i1, 8),
+		_mm256_setr_epi8(0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11,
+				 13, 15, 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7,
+				 9, 11, 13, 15))));
+	    } // __z_to_z uses concat fallback
+	}
+      else if constexpr (__i64_to_f32)
+	{ //{{{2
+	  if constexpr (__x_to_x)
+	    return __make_wrapper<float>(__v0[0], __v0[1], __v1[0], __v1[1]);
+	  else if constexpr (__y_to_y)
+	    {
+	      static_assert(__y_to_y && __have_avx2);
+	      const auto __a = _mm256_unpacklo_epi32(__i0, __i1);  // aeAE cgCG
+	      const auto __b = _mm256_unpackhi_epi32(__i0, __i1);  // bfBF dhDH
+	      const auto __lo32 = _mm256_unpacklo_epi32(__a, __b); // abef cdgh
+	      const auto __hi32
+		= __vector_bitcast<conditional_t<is_signed_v<_Tp>, int, _UInt>>(
+		  _mm256_unpackhi_epi32(__a, __b)); // ABEF CDGH
+	      const auto __hi
+		= 0x100000000LL
+		  * __convert_x86<__vector_type_t<float, 8>>(__hi32);
+	      const auto __mid
+		= 0x10000 * _mm256_cvtepi32_ps(_mm256_srli_epi32(__lo32, 16));
+	      const auto __lo
+		= _mm256_cvtepi32_ps(_mm256_set1_epi32(0x0000ffffu) & __lo32);
+	      return __xzyw((__hi + __mid) + __lo);
+	    }
+	  else if constexpr (__z_to_z && __have_avx512dq)
+	    {
+	      return std::is_signed_v<_Tp> ? __concat(_mm512_cvtepi64_ps(__i0),
+						      _mm512_cvtepi64_ps(__i1))
+					   : __concat(_mm512_cvtepu64_ps(__i0),
+						      _mm512_cvtepu64_ps(__i1));
+	    }
+	  else if constexpr (__z_to_z && std::is_signed_v<_Tp>)
+	    {
+	      const __m512 __hi32 = _mm512_cvtepi32_ps(
+		__concat(_mm512_cvtepi64_epi32(__to_intrin(__v0 >> 32)),
+			 _mm512_cvtepi64_epi32(__to_intrin(__v1 >> 32))));
+	      const __m512i __lo32 = __concat(_mm512_cvtepi64_epi32(__i0),
+					      _mm512_cvtepi64_epi32(__i1));
+	      // split the low 32 bits, because if __hi32 is a small negative
+	      // number, the 24-bit mantissa may lose important information if
+	      // any of the high 8 bits of __lo32 is set, leading to
+	      // catastrophic cancellation in the FMA
+	      const __m512 __hi16
+		= _mm512_cvtepu32_ps(_mm512_set1_epi32(0xffff0000u) & __lo32);
+	      const __m512 __lo16
+		= _mm512_cvtepi32_ps(_mm512_set1_epi32(0x0000ffffu) & __lo32);
+	      return (__hi32 * 0x100000000LL + __hi16) + __lo16;
+	    }
+	  else if constexpr (__z_to_z && std::is_unsigned_v<_Tp>)
+	    {
+	      return __intrin_bitcast<_To>(
+		_mm512_cvtepu32_ps(
+		  __concat(_mm512_cvtepi64_epi32(_mm512_srai_epi64(__i0, 32)),
+			   _mm512_cvtepi64_epi32(_mm512_srai_epi64(__i1, 32))))
+		  * 0x100000000LL
+		+ _mm512_cvtepu32_ps(__concat(_mm512_cvtepi64_epi32(__i0),
+					      _mm512_cvtepi64_epi32(__i1))));
+	    }
+	}
+      else if constexpr (__f64_to_s32)
+	{ //{{{2
+	  // use concat fallback
+	}
+      else if constexpr (__f64_to_u32)
+	{ //{{{2
+	  if constexpr (__x_to_x && __have_sse4_1)
+	    {
+	      return __vector_bitcast<_Up, _M>(_mm_unpacklo_epi64(
+		       _mm_cvttpd_epi32(_mm_floor_pd(__i0) - 0x8000'0000u),
+		       _mm_cvttpd_epi32(_mm_floor_pd(__i1) - 0x8000'0000u)))
+		     ^ 0x8000'0000u;
+	      // without SSE4.1 just use the scalar fallback, it's only four
+	      // values
+	    }
+	  else if constexpr (__y_to_y)
+	    {
+	      return __vector_bitcast<_Up>(
+		       __concat(_mm256_cvttpd_epi32(_mm256_floor_pd(__i0)
+						    - 0x8000'0000u),
+				_mm256_cvttpd_epi32(_mm256_floor_pd(__i1)
+						    - 0x8000'0000u)))
+		     ^ 0x8000'0000u;
+	    } // __z_to_z uses fallback
+	}
+      else if constexpr (__f64_to_ibw)
+	{ //{{{2
+	  // one-arg __f64_to_ibw goes via _SimdWrapper<int, ?>. The fallback
+	  // would go via two independent conversions to _SimdWrapper<_To> and
+	  // subsequent interleaving. This is better, because f64->i32 allows
+	  // combining __v0 and __v1 into one register:
+	  // if constexpr (__z_to_x || __y_to_x) {
+	  return __convert_x86<_To>(
+	    __convert_x86<__vector_type_t<int, _Np * 2>>(__v0, __v1));
+	  //}
+	}
+      else if constexpr (__f32_to_ibw)
+	{ //{{{2
+	  return __convert_x86<_To>(
+	    __convert_x86<__vector_type_t<int, _Np>>(__v0),
+	    __convert_x86<__vector_type_t<int, _Np>>(__v1));
+	  //}}}
+	}
+
+      // fallback: {{{2
+      if constexpr (sizeof(_To) >= 32)
+	// if _To is ymm or zmm, then _SimdWrapper<_Up, _M / 2> is xmm or ymm
+	return __concat(__convert_x86<__vector_type_t<_Up, _M / 2>>(__v0),
+			__convert_x86<__vector_type_t<_Up, _M / 2>>(__v1));
+      else if constexpr (sizeof(_To) == 16)
+	{
+	  const auto __lo = __to_intrin(__convert_x86<_To>(__v0));
+	  const auto __hi = __to_intrin(__convert_x86<_To>(__v1));
+	  if constexpr (sizeof(_Up) * _Np == 8)
+	    {
+	      if constexpr (is_floating_point_v<_Up>)
+		return __auto_bitcast(
+		  _mm_unpacklo_pd(__vector_bitcast<double>(__lo),
+				  __vector_bitcast<double>(__hi)));
+	      else
+		return __intrin_bitcast<_To>(_mm_unpacklo_epi64(__lo, __hi));
+	    }
+	  else if constexpr (sizeof(_Up) * _Np == 4)
+	    {
+	      if constexpr (is_floating_point_v<_Up>)
+		return __auto_bitcast(
+		  _mm_unpacklo_ps(__vector_bitcast<float>(__lo),
+				  __vector_bitcast<float>(__hi)));
+	      else
+		return __intrin_bitcast<_To>(_mm_unpacklo_epi32(__lo, __hi));
+	    }
+	  else if constexpr (sizeof(_Up) * _Np == 2)
+	    return __intrin_bitcast<_To>(_mm_unpacklo_epi16(__lo, __hi));
+	  else
+	    __assert_unreachable<_Tp>();
+	}
+      else
+	return __vector_convert<_To>(__v0, __v1, make_index_sequence<_Np>());
+      //}}}
+    }
+} //}}}1
+// 4-arg __convert_x86 {{{1
+template <typename _To, typename _V, typename _Traits>
+_GLIBCXX_SIMD_INTRINSIC _To
+__convert_x86(_V __v0, _V __v1, _V __v2, _V __v3)
+{
+  static_assert(__is_vector_type_v<_V>);
+  using _Tp = typename _Traits::value_type;
+  constexpr size_t _Np = _Traits::_S_width;
+  [[maybe_unused]] const auto __i0 = __to_intrin(__v0);
+  [[maybe_unused]] const auto __i1 = __to_intrin(__v1);
+  [[maybe_unused]] const auto __i2 = __to_intrin(__v2);
+  [[maybe_unused]] const auto __i3 = __to_intrin(__v3);
+  using _Up = typename _VectorTraits<_To>::value_type;
+  constexpr size_t _M = _VectorTraits<_To>::_S_width;
+
+  static_assert(4 * _Np <= _M,
+		"__v2/__v3 would be discarded; use the two/one-argument "
+		"__convert_x86 overload instead");
+
+  // [xyz]_to_[xyz] {{{2
+  [[maybe_unused]] constexpr bool __x_to_x
+    = sizeof(__v0) <= 16 && sizeof(_To) <= 16;
+  [[maybe_unused]] constexpr bool __x_to_y
+    = sizeof(__v0) <= 16 && sizeof(_To) == 32;
+  [[maybe_unused]] constexpr bool __x_to_z
+    = sizeof(__v0) <= 16 && sizeof(_To) == 64;
+  [[maybe_unused]] constexpr bool __y_to_x
+    = sizeof(__v0) == 32 && sizeof(_To) <= 16;
+  [[maybe_unused]] constexpr bool __y_to_y
+    = sizeof(__v0) == 32 && sizeof(_To) == 32;
+  [[maybe_unused]] constexpr bool __y_to_z
+    = sizeof(__v0) == 32 && sizeof(_To) == 64;
+  [[maybe_unused]] constexpr bool __z_to_x
+    = sizeof(__v0) == 64 && sizeof(_To) <= 16;
+  [[maybe_unused]] constexpr bool __z_to_y
+    = sizeof(__v0) == 64 && sizeof(_To) == 32;
+  [[maybe_unused]] constexpr bool __z_to_z
+    = sizeof(__v0) == 64 && sizeof(_To) == 64;
+
+  // iX_to_iX {{{2
+  [[maybe_unused]] constexpr bool __i_to_i
+    = std::is_integral_v<_Up> && std::is_integral_v<_Tp>;
+  [[maybe_unused]] constexpr bool __i8_to_i16
+    = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 2;
+  [[maybe_unused]] constexpr bool __i8_to_i32
+    = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __i8_to_i64
+    = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __i16_to_i8
+    = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 1;
+  [[maybe_unused]] constexpr bool __i16_to_i32
+    = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __i16_to_i64
+    = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __i32_to_i8
+    = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 1;
+  [[maybe_unused]] constexpr bool __i32_to_i16
+    = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 2;
+  [[maybe_unused]] constexpr bool __i32_to_i64
+    = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __i64_to_i8
+    = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 1;
+  [[maybe_unused]] constexpr bool __i64_to_i16
+    = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 2;
+  [[maybe_unused]] constexpr bool __i64_to_i32
+    = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 4;
+
+  // [fsu]X_to_[fsu]X {{{2
+  // ibw = integral && byte or word, i.e. char and short with any signedness
+  [[maybe_unused]] constexpr bool __i64_to_f32
+    = is_integral_v<_Tp> && sizeof(_Tp) == 8
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __s32_to_f32
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __s16_to_f32
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 2
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __s8_to_f32
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 1
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __u32_to_f32
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __u16_to_f32
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 2
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __u8_to_f32
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 1
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+  [[maybe_unused]] constexpr bool __s64_to_f64
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 8
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __s32_to_f64
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __s16_to_f64
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 2
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __s8_to_f64
+    = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 1
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __u64_to_f64
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 8
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __u32_to_f64
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __u16_to_f64
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 2
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __u8_to_f64
+    = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 1
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __f32_to_s64
+    = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 8
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f32_to_s32
+    = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 4
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f32_to_u64
+    = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 8
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f32_to_u32
+    = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 4
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f64_to_s64
+    = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 8
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __f64_to_s32
+    = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 4
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __f64_to_u64
+    = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 8
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __f64_to_u32
+    = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 4
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __f32_to_ibw
+    = is_integral_v<_Up> && sizeof(_Up) <= 2
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+  [[maybe_unused]] constexpr bool __f64_to_ibw
+    = is_integral_v<_Up> && sizeof(_Up) <= 2
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+  [[maybe_unused]] constexpr bool __f32_to_f64
+    = is_floating_point_v<_Tp> && sizeof(_Tp) == 4
+      && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+  [[maybe_unused]] constexpr bool __f64_to_f32
+    = is_floating_point_v<_Tp> && sizeof(_Tp) == 8
+      && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+
+  if constexpr (__i_to_i && __y_to_x && !__have_avx2)
+    { //{{{2
+      // <long long, 4>, <long long, 4>, <long long, 4>, <long long, 4>
+      // => <char, 16>
+      return __convert_x86<_To>(__lo128(__v0), __hi128(__v0), __lo128(__v1),
+				__hi128(__v1), __lo128(__v2), __hi128(__v2),
+				__lo128(__v3), __hi128(__v3));
+    }
+  else if constexpr (__i_to_i)
+    { // assert ISA {{{2
+      static_assert(__x_to_x || __have_avx2,
+		    "integral conversions with ymm registers require AVX2");
+      static_assert(__have_avx512bw
+		      || ((sizeof(_Tp) >= 4 || sizeof(__v0) < 64)
+			  && (sizeof(_Up) >= 4 || sizeof(_To) < 64)),
+		    "8/16-bit integers in zmm registers require AVX512BW");
+      static_assert((sizeof(__v0) < 64 && sizeof(_To) < 64) || __have_avx512f,
+		    "integral conversions with zmm registers require AVX512F");
+    }
+  // concat => use 2-arg __convert_x86 {{{2
+  if constexpr ((sizeof(__v0) == 16 && __have_avx2)
+		|| (sizeof(__v0) == 16 && __have_avx
+		    && std::is_floating_point_v<_Tp>)
+		|| (sizeof(__v0) == 32 && __have_avx512f))
+    {
+      // The ISA can handle wider input registers, so concat and use two-arg
+      // implementation. This reduces code duplication considerably.
+      return __convert_x86<_To>(__concat(__v0, __v1), __concat(__v2, __v3));
+    }
+  else
+    { //{{{2
+      // conversion using bit reinterpretation (or no conversion at all) should
+      // all go through the concat branch above:
+      static_assert(!(
+	std::is_floating_point_v<
+	  _Tp> == std::is_floating_point_v<_Up> && sizeof(_Tp) == sizeof(_Up)));
+      if constexpr (4 * _Np < _M && sizeof(_To) > 16)
+	{ // handle all zero extension{{{2
+	  constexpr size_t Min = 16 / sizeof(_Up);
+	  return __zero_extend(
+	    __convert_x86<
+	      __vector_type_t<_Up, (Min > 4 * _Np) ? Min : 4 * _Np>>(__v0, __v1,
+								     __v2,
+								     __v3));
+	}
+      else if constexpr (__i64_to_i16)
+	{ //{{{2
+	  if constexpr (__x_to_x && __have_sse4_1)
+	    {
+	      return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+		_mm_blend_epi16(_mm_blend_epi16(__i0, _mm_slli_si128(__i1, 2),
+						0x22),
+				_mm_blend_epi16(_mm_slli_si128(__i2, 4),
+						_mm_slli_si128(__i3, 6), 0x88),
+				0xcc),
+		_mm_setr_epi8(0, 1, 8, 9, 2, 3, 10, 11, 4, 5, 12, 13, 6, 7, 14,
+			      15)));
+	    }
+	  else if constexpr (__y_to_y && __have_avx2)
+	    {
+	      return __intrin_bitcast<_To>(_mm256_shuffle_epi8(
+		__xzyw(_mm256_blend_epi16(
+		  __auto_bitcast(
+		    _mm256_shuffle_ps(__vector_bitcast<float>(__v0),
+				      __vector_bitcast<float>(__v2),
+				      0x88)), // 0.1. 8.9. 2.3. A.B.
+		  __to_intrin(__vector_bitcast<int>(_mm256_shuffle_ps(
+				__vector_bitcast<float>(__v1),
+				__vector_bitcast<float>(__v3), 0x88))
+			      << 16), // .4.5 .C.D .6.7 .E.F
+		  0xaa)               // 0415 8C9D 2637 AEBF
+		       ),             // 0415 2637 8C9D AEBF
+		_mm256_setr_epi8(0, 1, 4, 5, 8, 9, 12, 13, 2, 3, 6, 7, 10, 11,
+				 14, 15, 0, 1, 4, 5, 8, 9, 12, 13, 2, 3, 6, 7,
+				 10, 11, 14, 15)));
+	      /*
+	      auto __a = _mm256_unpacklo_epi16(__v0, __v1); // 04.. .... 26.. ....
+	      auto __b = _mm256_unpackhi_epi16(__v0, __v1); // 15.. .... 37.. ....
+	      auto __c = _mm256_unpacklo_epi16(__v2, __v3); // 8C.. .... AE.. ....
+	      auto __d = _mm256_unpackhi_epi16(__v2, __v3); // 9D.. .... BF.. ....
+	      auto __e = _mm256_unpacklo_epi16(__a, __b);   // 0145 .... 2367 ....
+	      auto __f = _mm256_unpacklo_epi16(__c, __d);   // 89CD .... ABEF ....
+	      auto __g = _mm256_unpacklo_epi64(__e, __f);   // 0145 89CD 2367 ABEF
+	      return __concat(
+		_mm_unpacklo_epi32(__lo128(__g), __hi128(__g)),
+		_mm_unpackhi_epi32(__lo128(__g), __hi128(__g)));
+	      // 0123 4567 89AB CDEF
+	      */
+	    } // else use fallback
+	}
+      else if constexpr (__i64_to_i8)
+	{ //{{{2
+	  if constexpr (__x_to_x)
+	    {
+	      // TODO: use fallback for now
+	    }
+	  else if constexpr (__y_to_x)
+	    {
+	      auto __a = _mm256_srli_epi32(_mm256_slli_epi32(__i0, 24), 24)
+			 | _mm256_srli_epi32(_mm256_slli_epi32(__i1, 24), 16)
+			 | _mm256_srli_epi32(_mm256_slli_epi32(__i2, 24), 8)
+			 | _mm256_slli_epi32(
+			   __i3, 24); // 048C .... 159D .... 26AE .... 37BF ....
+	      /*return _mm_shuffle_epi8(
+		  _mm_blend_epi32(__lo128(__a) << 32, __hi128(__a), 0x5),
+		  _mm_setr_epi8(4, 12, 0, 8, 5, 13, 1, 9, 6, 14, 2, 10, 7, 15,
+		 3, 11));*/
+	      auto __b = _mm256_unpackhi_epi64(
+		__a, __a); // 159D .... 159D .... 37BF .... 37BF ....
+	      auto __c = _mm256_unpacklo_epi8(
+		__a, __b); // 0145 89CD .... .... 2367 ABEF .... ....
+	      return __intrin_bitcast<_To>(
+		_mm_unpacklo_epi16(__lo128(__c),
+				   __hi128(__c))); // 0123 4567 89AB CDEF
+	    }
+	}
+      else if constexpr (__i32_to_i8)
+	{ //{{{2
+	  if constexpr (__x_to_x)
+	    {
+	      if constexpr (__have_ssse3)
+		{
+		  const auto __x0 = __vector_bitcast<_UInt>(__v0) & 0xff;
+		  const auto __x1 = (__vector_bitcast<_UInt>(__v1) & 0xff) << 8;
+		  const auto __x2 = (__vector_bitcast<_UInt>(__v2) & 0xff)
+				    << 16;
+		  const auto __x3 = __vector_bitcast<_UInt>(__v3) << 24;
+		  return __intrin_bitcast<_To>(
+		    _mm_shuffle_epi8(__to_intrin(__x0 | __x1 | __x2 | __x3),
+				     _mm_setr_epi8(0, 4, 8, 12, 1, 5, 9, 13, 2,
+						   6, 10, 14, 3, 7, 11, 15)));
+		}
+	      else
+		{
+		  auto __a
+		    = _mm_unpacklo_epi8(__i0, __i2); // 08.. .... 19.. ....
+		  auto __b
+		    = _mm_unpackhi_epi8(__i0, __i2); // 2A.. .... 3B.. ....
+		  auto __c
+		    = _mm_unpacklo_epi8(__i1, __i3); // 4C.. .... 5D.. ....
+		  auto __d
+		    = _mm_unpackhi_epi8(__i1, __i3);      // 6E.. .... 7F.. ....
+		  auto __e = _mm_unpacklo_epi8(__a, __c); // 048C .... .... ....
+		  auto __f = _mm_unpackhi_epi8(__a, __c); // 159D .... .... ....
+		  auto __g = _mm_unpacklo_epi8(__b, __d); // 26AE .... .... ....
+		  auto __h = _mm_unpackhi_epi8(__b, __d); // 37BF .... .... ....
+		  return __intrin_bitcast<_To>(_mm_unpacklo_epi8(
+		    _mm_unpacklo_epi8(__e, __g), // 0246 8ACE .... ....
+		    _mm_unpacklo_epi8(__f, __h)  // 1357 9BDF .... ....
+		    ));                          // 0123 4567 89AB CDEF
+		}
+	    }
+	  else if constexpr (__y_to_y)
+	    {
+	      const auto __a = _mm256_shuffle_epi8(
+		__to_intrin((__vector_bitcast<_UShort>(_mm256_blend_epi16(
+			       __i0, _mm256_slli_epi32(__i1, 16), 0xAA))
+			     & 0xff)
+			    | (__vector_bitcast<_UShort>(_mm256_blend_epi16(
+				 __i2, _mm256_slli_epi32(__i3, 16), 0xAA))
+			       << 8)),
+		_mm256_setr_epi8(0, 4, 8, 12, 2, 6, 10, 14, 1, 5, 9, 13, 3, 7,
+				 11, 15, 0, 4, 8, 12, 2, 6, 10, 14, 1, 5, 9, 13,
+				 3, 7, 11, 15));
+	      return __intrin_bitcast<_To>(_mm256_permutevar8x32_epi32(
+		__a, _mm256_setr_epi32(0, 4, 1, 5, 2, 6, 3, 7)));
+	    }
+	}
+      else if constexpr (__i64_to_f32)
+	{ //{{{2
+	  // this branch is only relevant with AVX and w/o AVX2 (i.e. no ymm
+	  // integers)
+	  if constexpr (__x_to_y)
+	    {
+	      return __make_wrapper<float>(__v0[0], __v0[1], __v1[0], __v1[1],
+					   __v2[0], __v2[1], __v3[0], __v3[1]);
+
+	      const auto __a = _mm_unpacklo_epi32(__i0, __i1);   // acAC
+	      const auto __b = _mm_unpackhi_epi32(__i0, __i1);   // bdBD
+	      const auto __c = _mm_unpacklo_epi32(__i2, __i3);   // egEG
+	      const auto __d = _mm_unpackhi_epi32(__i2, __i3);   // fhFH
+	      const auto __lo32a = _mm_unpacklo_epi32(__a, __b); // abcd
+	      const auto __lo32b = _mm_unpacklo_epi32(__c, __d); // efgh
+	      const auto __hi32
+		= __vector_bitcast<conditional_t<is_signed_v<_Tp>, int, _UInt>>(
+		  __concat(_mm_unpackhi_epi32(__a, __b),
+			   _mm_unpackhi_epi32(__c, __d))); // ABCD EFGH
+	      const auto __hi
+		= 0x100000000LL
+		  * __convert_x86<__vector_type_t<float, 8>>(__hi32);
+	      const auto __mid
+		= 0x10000
+		  * _mm256_cvtepi32_ps(__concat(_mm_srli_epi32(__lo32a, 16),
+						_mm_srli_epi32(__lo32b, 16)));
+	      const auto __lo = _mm256_cvtepi32_ps(
+		__concat(_mm_set1_epi32(0x0000ffffu) & __lo32a,
+			 _mm_set1_epi32(0x0000ffffu) & __lo32b));
+	      return (__hi + __mid) + __lo;
+	    }
+	}
+      else if constexpr (__f64_to_ibw)
+	{ //{{{2
+	  return __convert_x86<_To>(
+	    __convert_x86<__vector_type_t<int, _Np * 2>>(__v0, __v1),
+	    __convert_x86<__vector_type_t<int, _Np * 2>>(__v2, __v3));
+	}
+      else if constexpr (__f32_to_ibw)
+	{ //{{{2
+	  return __convert_x86<_To>(
+	    __convert_x86<__vector_type_t<int, _Np>>(__v0),
+	    __convert_x86<__vector_type_t<int, _Np>>(__v1),
+	    __convert_x86<__vector_type_t<int, _Np>>(__v2),
+	    __convert_x86<__vector_type_t<int, _Np>>(__v3));
+	} //}}}
+
+      // fallback: {{{2
+      if constexpr (sizeof(_To) >= 32)
+	// if _To is ymm or zmm, then _SimdWrapper<_Up, _M / 2> is xmm or ymm
+	return __concat(__convert_x86<__vector_type_t<_Up, _M / 2>>(__v0, __v1),
+			__convert_x86<__vector_type_t<_Up, _M / 2>>(__v2,
+								    __v3));
+      else if constexpr (sizeof(_To) == 16)
+	{
+	  const auto __lo = __to_intrin(__convert_x86<_To>(__v0, __v1));
+	  const auto __hi = __to_intrin(__convert_x86<_To>(__v2, __v3));
+	  if constexpr (sizeof(_Up) * _Np * 2 == 8)
+	    {
+	      if constexpr (is_floating_point_v<_Up>)
+		return __auto_bitcast(_mm_unpacklo_pd(__lo, __hi));
+	      else
+		return __intrin_bitcast<_To>(_mm_unpacklo_epi64(__lo, __hi));
+	    }
+	  else if constexpr (sizeof(_Up) * _Np * 2 == 4)
+	    {
+	      if constexpr (is_floating_point_v<_Up>)
+		return __auto_bitcast(_mm_unpacklo_ps(__lo, __hi));
+	      else
+		return __intrin_bitcast<_To>(_mm_unpacklo_epi32(__lo, __hi));
+	    }
+	  else
+	    __assert_unreachable<_Tp>();
+	}
+      else
+	return __vector_convert<_To>(__v0, __v1, __v2, __v3,
+				     make_index_sequence<_Np>());
+      //}}}2
+    }
+} //}}}
+// 8-arg __convert_x86 {{{1
+template <typename _To, typename _V, typename _Traits>
+_GLIBCXX_SIMD_INTRINSIC _To
+__convert_x86(_V __v0, _V __v1, _V __v2, _V __v3, _V __v4, _V __v5, _V __v6,
+	      _V __v7)
+{
+  static_assert(__is_vector_type_v<_V>);
+  using _Tp = typename _Traits::value_type;
+  constexpr size_t _Np = _Traits::_S_width;
+  [[maybe_unused]] const auto __i0 = __to_intrin(__v0);
+  [[maybe_unused]] const auto __i1 = __to_intrin(__v1);
+  [[maybe_unused]] const auto __i2 = __to_intrin(__v2);
+  [[maybe_unused]] const auto __i3 = __to_intrin(__v3);
+  [[maybe_unused]] const auto __i4 = __to_intrin(__v4);
+  [[maybe_unused]] const auto __i5 = __to_intrin(__v5);
+  [[maybe_unused]] const auto __i6 = __to_intrin(__v6);
+  [[maybe_unused]] const auto __i7 = __to_intrin(__v7);
+  using _Up = typename _VectorTraits<_To>::value_type;
+  constexpr size_t _M = _VectorTraits<_To>::_S_width;
+
+  static_assert(8 * _Np <= _M,
+		"__v4-__v7 would be discarded; use the four/two/one-argument "
+		"__convert_x86 overload instead");
+
+  // [xyz]_to_[xyz] {{{2
+  [[maybe_unused]] constexpr bool __x_to_x
+    = sizeof(__v0) <= 16 && sizeof(_To) <= 16;
+  [[maybe_unused]] constexpr bool __x_to_y
+    = sizeof(__v0) <= 16 && sizeof(_To) == 32;
+  [[maybe_unused]] constexpr bool __x_to_z
+    = sizeof(__v0) <= 16 && sizeof(_To) == 64;
+  [[maybe_unused]] constexpr bool __y_to_x
+    = sizeof(__v0) == 32 && sizeof(_To) <= 16;
+  [[maybe_unused]] constexpr bool __y_to_y
+    = sizeof(__v0) == 32 && sizeof(_To) == 32;
+  [[maybe_unused]] constexpr bool __y_to_z
+    = sizeof(__v0) == 32 && sizeof(_To) == 64;
+  [[maybe_unused]] constexpr bool __z_to_x
+    = sizeof(__v0) == 64 && sizeof(_To) <= 16;
+  [[maybe_unused]] constexpr bool __z_to_y
+    = sizeof(__v0) == 64 && sizeof(_To) == 32;
+  [[maybe_unused]] constexpr bool __z_to_z
+    = sizeof(__v0) == 64 && sizeof(_To) == 64;
+
+  // [if]X_to_i8 {{{2
+  [[maybe_unused]] constexpr bool __i_to_i
+    = std::is_integral_v<_Up> && std::is_integral_v<_Tp>;
+  [[maybe_unused]] constexpr bool __i64_to_i8
+    = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 1;
+  [[maybe_unused]] constexpr bool __f64_to_i8
+    = is_integral_v<_Up> && sizeof(_Up) == 1
+      && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+
+  if constexpr (__i_to_i) // assert ISA {{{2
+    {
+      static_assert(__x_to_x || __have_avx2,
+		    "integral conversions with ymm registers require AVX2");
+      static_assert(__have_avx512bw
+		      || ((sizeof(_Tp) >= 4 || sizeof(__v0) < 64)
+			  && (sizeof(_Up) >= 4 || sizeof(_To) < 64)),
+		    "8/16-bit integers in zmm registers require AVX512BW");
+      static_assert((sizeof(__v0) < 64 && sizeof(_To) < 64) || __have_avx512f,
+		    "integral conversions with zmm registers require AVX512F");
+    }
+  // concat => use 4-arg __convert_x86 {{{2
+  if constexpr ((sizeof(__v0) == 16 && __have_avx2)
+		|| (sizeof(__v0) == 16 && __have_avx
+		    && std::is_floating_point_v<_Tp>)
+		|| (sizeof(__v0) == 32 && __have_avx512f))
+    {
+      // The ISA can handle wider input registers, so concat and use four-arg
+      // implementation. This reduces code duplication considerably.
+      return __convert_x86<_To>(__concat(__v0, __v1), __concat(__v2, __v3),
+				__concat(__v4, __v5), __concat(__v6, __v7));
+    }
+  else //{{{2
+    {
+      // conversion using bit reinterpretation (or no conversion at all) should
+      // all go through the concat branch above:
+      static_assert(!(
+	std::is_floating_point_v<
+	  _Tp> == std::is_floating_point_v<_Up> && sizeof(_Tp) == sizeof(_Up)));
+      static_assert(!(8 * _Np < _M && sizeof(_To) > 16),
+		    "zero extension should be impossible");
+      if constexpr (__i64_to_i8) //{{{2
+	{
+	  if constexpr (__x_to_x && __have_ssse3)
+	    {
+	      // unsure whether this is better than the variant below
+	      return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+		__to_intrin((((__v0 & 0xff) | ((__v1 & 0xff) << 8))
+			     | (((__v2 & 0xff) << 16) | ((__v3 & 0xff) << 24)))
+			    | ((((__v4 & 0xff) << 32) | ((__v5 & 0xff) << 40))
+			       | (((__v6 & 0xff) << 48) | (__v7 << 56)))),
+		_mm_setr_epi8(0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7,
+			      15)));
+	    }
+	  else if constexpr (__x_to_x)
+	    {
+	      const auto __a = _mm_unpacklo_epi8(__i0, __i1); // ac
+	      const auto __b = _mm_unpackhi_epi8(__i0, __i1); // bd
+	      const auto __c = _mm_unpacklo_epi8(__i2, __i3); // eg
+	      const auto __d = _mm_unpackhi_epi8(__i2, __i3); // fh
+	      const auto __e = _mm_unpacklo_epi8(__i4, __i5); // ik
+	      const auto __f = _mm_unpackhi_epi8(__i4, __i5); // jl
+	      const auto __g = _mm_unpacklo_epi8(__i6, __i7); // mo
+	      const auto __h = _mm_unpackhi_epi8(__i6, __i7); // np
+	      return __intrin_bitcast<_To>(_mm_unpacklo_epi64(
+		_mm_unpacklo_epi32(_mm_unpacklo_epi8(__a, __b),  // abcd
+				   _mm_unpacklo_epi8(__c, __d)), // efgh
+		_mm_unpacklo_epi32(_mm_unpacklo_epi8(__e, __f),  // ijkl
+				   _mm_unpacklo_epi8(__g, __h))  // mnop
+		));
+	    }
+	  else if constexpr (__y_to_y)
+	    {
+	      auto __a = // 048C GKOS 159D HLPT 26AE IMQU 37BF JNRV
+		__to_intrin((((__v0 & 0xff) | ((__v1 & 0xff) << 8))
+			     | (((__v2 & 0xff) << 16) | ((__v3 & 0xff) << 24)))
+			    | ((((__v4 & 0xff) << 32) | ((__v5 & 0xff) << 40))
+			       | (((__v6 & 0xff) << 48) | ((__v7 << 56)))));
+	      /*
+	      auto __b = _mm256_unpackhi_epi64(__a, __a);
+	      // 159D HLPT 159D HLPT 37BF JNRV 37BF JNRV
+	      auto __c = _mm256_unpacklo_epi8(__a, __b);
+	      // 0145 89CD GHKL OPST 2367 ABEF IJMN QRUV
+	      auto __d = __xzyw(__c); // 0145 89CD 2367 ABEF GHKL OPST IJMN QRUV
+	      return _mm256_shuffle_epi8(__d, _mm256_setr_epi8(
+		  0, 1, 8, 9, 2, 3, 10, 11, 4, 5, 12, 13, 6, 7, 14, 15,
+		  0, 1, 8, 9, 2, 3, 10, 11, 4, 5, 12, 13, 6, 7, 14, 15));
+	      */
+	      auto __b = _mm256_shuffle_epi8( // 0145 89CD GHKL OPST 2367 ABEF
+					      // IJMN QRUV
+		__a, _mm256_setr_epi8(0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6,
+				      14, 7, 15, 0, 8, 1, 9, 2, 10, 3, 11, 4,
+				      12, 5, 13, 6, 14, 7, 15));
+	      auto __c = __xzyw(__b); // 0145 89CD 2367 ABEF GHKL OPST IJMN QRUV
+	      return __intrin_bitcast<_To>(_mm256_shuffle_epi8(
+		__c, _mm256_setr_epi8(0, 1, 8, 9, 2, 3, 10, 11, 4, 5, 12, 13, 6,
+				      7, 14, 15, 0, 1, 8, 9, 2, 3, 10, 11, 4, 5,
+				      12, 13, 6, 7, 14, 15)));
+	    }
+	  else if constexpr (__z_to_z)
+	    {
+	      return __concat(
+		__convert_x86<__vector_type_t<_Up, _M / 2>>(__v0, __v1, __v2,
+							    __v3),
+		__convert_x86<__vector_type_t<_Up, _M / 2>>(__v4, __v5, __v6,
+							    __v7));
+	    }
+	}
+      else if constexpr (__f64_to_i8) //{{{2
+	{
+	  return __convert_x86<_To>(
+	    __convert_x86<__vector_type_t<int, _Np * 2>>(__v0, __v1),
+	    __convert_x86<__vector_type_t<int, _Np * 2>>(__v2, __v3),
+	    __convert_x86<__vector_type_t<int, _Np * 2>>(__v4, __v5),
+	    __convert_x86<__vector_type_t<int, _Np * 2>>(__v6, __v7));
+	}
+      else // unreachable {{{2
+	__assert_unreachable<_Tp>();
+      //}}}
+
+      // fallback: {{{2
+      if constexpr (sizeof(_To) >= 32)
+	// if _To is ymm or zmm, then _SimdWrapper<_Up, _M / 2> is xmm or ymm
+	return __concat(
+	  __convert_x86<__vector_type_t<_Up, _M / 2>>(__v0, __v1, __v2, __v3),
+	  __convert_x86<__vector_type_t<_Up, _M / 2>>(__v4, __v5, __v6, __v7));
+      else if constexpr (sizeof(_To) == 16)
+	{
+	  const auto __lo
+	    = __to_intrin(__convert_x86<_To>(__v0, __v1, __v2, __v3));
+	  const auto __hi
+	    = __to_intrin(__convert_x86<_To>(__v4, __v5, __v6, __v7));
+	  static_assert(sizeof(_Up) == 1 && _Np == 2);
+	  return __intrin_bitcast<_To>(_mm_unpacklo_epi64(__lo, __hi));
+	}
+      else
+	{
+	  __assert_unreachable<_Tp>();
+	  // return __vector_convert<_To>(__v0, __v1, __v2, __v3, __v4, __v5,
+	  // __v6, __v7,
+	  //                             make_index_sequence<_Np>());
+	} //}}}2
+    }
+} //}}}
+// 16-arg __convert_x86 {{{1
+template <typename _To, typename _V, typename _Traits>
+_GLIBCXX_SIMD_INTRINSIC _To
+__convert_x86(_V __v0, _V __v1, _V __v2, _V __v3, _V __v4, _V __v5, _V __v6,
+	      _V __v7, _V __v8, _V __v9, _V __v10, _V __v11, _V __v12, _V __v13,
+	      _V __v14, _V __v15)
+{
+  // concat => use 8-arg __convert_x86 {{{2
+  return __convert_x86<_To>(__concat(__v0, __v1), __concat(__v2, __v3),
+			    __concat(__v4, __v5), __concat(__v6, __v7),
+			    __concat(__v8, __v9), __concat(__v10, __v11),
+			    __concat(__v12, __v13), __concat(__v14, __v15));
+} //}}}
+
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_X86_CONVERSIONS_H
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/include/experimental/simd b/libstdc++-v3/include/experimental/simd
new file mode 100644
index 00000000000..cb875bd0e40
--- /dev/null
+++ b/libstdc++-v3/include/experimental/simd
@@ -0,0 +1,66 @@
+// Components for element-wise operations on data-parallel objects -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+/** @file experimental/simd
+ *  This is a TS C++ Library header.
+ */
+
+//
+// N4773 §9 data-parallel types library
+//
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD
+#define _GLIBCXX_EXPERIMENTAL_SIMD
+
+#define __cpp_lib_experimental_parallel_simd 201803
+
+#pragma GCC diagnostic push
+// Many [[gnu::vector_size(N)]] types might trigger -Wpsabi warnings, which are
+// irrelevant because the functions using them never appear on ABI boundaries.
+#pragma GCC diagnostic ignored "-Wpsabi"
+
+// If __OPTIMIZE__ is not defined some intrinsics are defined as macros, making
+// use of C casts internally. This requires us to disable the warning as it
+// would otherwise yield many false positives.
+#ifndef __OPTIMIZE__
+#pragma GCC diagnostic ignored "-Wold-style-cast"
+#endif
+
+#include "bits/simd_detail.h"
+#include "bits/simd.h"
+#include "bits/simd_fixed_size.h"
+#include "bits/simd_scalar.h"
+#include "bits/simd_builtin.h"
+#include "bits/simd_converter.h"
+#if _GLIBCXX_SIMD_X86INTRIN
+#include "bits/simd_x86.h"
+#elif _GLIBCXX_SIMD_HAVE_NEON
+#include "bits/simd_neon.h"
+#endif
+#include "bits/simd_math.h"
+
+#pragma GCC diagnostic pop
+
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD
+// vim: ft=cpp
diff --git a/libstdc++-v3/testsuite/Makefile.am b/libstdc++-v3/testsuite/Makefile.am
index e19509d2534..9cef1e65e1b 100644
--- a/libstdc++-v3/testsuite/Makefile.am
+++ b/libstdc++-v3/testsuite/Makefile.am
@@ -47,6 +47,7 @@ site.exp: Makefile
 	@echo '## these variables are automatically generated by make ##' >site.tmp
 	@echo '# Do not edit here.  If you wish to override these values' >>site.tmp
 	@echo '# edit the last section' >>site.tmp
+	@echo 'set tool libstdc++' >>site.tmp
 	@echo 'set srcdir $(srcdir)' >>site.tmp
 	@echo "set objdir `pwd`" >>site.tmp
 	@echo 'set build_alias "$(build_alias)"' >>site.tmp
@@ -55,7 +56,6 @@ site.exp: Makefile
 	@echo 'set host_triplet $(host_triplet)' >>site.tmp
 	@echo 'set target_alias "$(target_alias)"' >>site.tmp
 	@echo 'set target_triplet $(target_triplet)' >>site.tmp
-	@echo 'set target_triplet $(target_triplet)' >>site.tmp
 	@echo 'set libiconv "$(LIBICONV)"' >>site.tmp
 	@echo 'set baseline_dir "$(baseline_dir)"' >> site.tmp
 	@echo 'set baseline_subdir_switch "$(baseline_subdir_switch)"' >> site.tmp
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char-constexpr.cc
new file mode 100644
index 00000000000..ffff65ee130
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char-fixed_size.cc
new file mode 100644
index 00000000000..f8dd7d4ef82
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char.cc
new file mode 100644
index 00000000000..8b37d82caaa
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char16_t-constexpr.cc
new file mode 100644
index 00000000000..4c11f64ea4f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..ef375ce9451
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char16_t.cc
new file mode 100644
index 00000000000..6618460cc38
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char32_t-constexpr.cc
new file mode 100644
index 00000000000..e6c5ba261f3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..9b95c98421c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char32_t.cc
new file mode 100644
index 00000000000..47e8fc78e90
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-double-constexpr.cc
new file mode 100644
index 00000000000..4adce678c0f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-double-fixed_size.cc
new file mode 100644
index 00000000000..25bb1f8cb24
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-double.cc b/libstdc++-v3/testsuite/experimental/simd/abs-double.cc
new file mode 100644
index 00000000000..302389530c1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-float-constexpr.cc
new file mode 100644
index 00000000000..317d2b7db52
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-float-fixed_size.cc
new file mode 100644
index 00000000000..e69ea67bc71
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-float.cc b/libstdc++-v3/testsuite/experimental/simd/abs-float.cc
new file mode 100644
index 00000000000..2ba2178d2b0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-int-constexpr.cc
new file mode 100644
index 00000000000..b515b5cb4a1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-int-fixed_size.cc
new file mode 100644
index 00000000000..c41eeb52641
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-int.cc b/libstdc++-v3/testsuite/experimental/simd/abs-int.cc
new file mode 100644
index 00000000000..7299e4af93d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long-constexpr.cc
new file mode 100644
index 00000000000..5d3ac0a8217
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long-fixed_size.cc
new file mode 100644
index 00000000000..a5f27e384e7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long.cc
new file mode 100644
index 00000000000..64719277b7c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long_double-constexpr.cc
new file mode 100644
index 00000000000..bdd51845a38
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long_double-fixed_size.cc
new file mode 100644
index 00000000000..0454574b3db
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long_double.cc
new file mode 100644
index 00000000000..d18f45f8a45
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long_long-constexpr.cc
new file mode 100644
index 00000000000..736d0005a68
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long_long-fixed_size.cc
new file mode 100644
index 00000000000..2fdcfbb077c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long_long.cc
new file mode 100644
index 00000000000..f1dfe10f33f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-short-constexpr.cc
new file mode 100644
index 00000000000..5b95e5def75
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-short-fixed_size.cc
new file mode 100644
index 00000000000..a09d81da561
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-short.cc b/libstdc++-v3/testsuite/experimental/simd/abs-short.cc
new file mode 100644
index 00000000000..d772aea85e2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-signed_char-constexpr.cc
new file mode 100644
index 00000000000..e343396280b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..43146b7b6d5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/abs-signed_char.cc
new file mode 100644
index 00000000000..bfd89fd5a96
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..8d2eba06b13
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..f3afe7b4548
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char.cc
new file mode 100644
index 00000000000..1b00d8bfdd2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..05354330541
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..6aeec8b356b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int.cc
new file mode 100644
index 00000000000..d3581287a0a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..547a3f77c71
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..9bd90e8ea7e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long.cc
new file mode 100644
index 00000000000..79c2fd3f739
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..0976cbe1082
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..ed7adc17ac1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long.cc
new file mode 100644
index 00000000000..60369fe7d23
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..e4078abd7e2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..5d24b6bd858
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short.cc
new file mode 100644
index 00000000000..b7bde5c7487
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..7adee468f76
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..006e31a4a9c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t.cc
new file mode 100644
index 00000000000..1b837f3f005
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char-constexpr.cc
new file mode 100644
index 00000000000..453e5f6c644
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char-fixed_size.cc
new file mode 100644
index 00000000000..5c6ec25040d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char.cc
new file mode 100644
index 00000000000..0b4b81155f6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t-constexpr.cc
new file mode 100644
index 00000000000..f946ac69bed
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..a04511a6cf5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t.cc
new file mode 100644
index 00000000000..3a8dcf9acd3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t-constexpr.cc
new file mode 100644
index 00000000000..ba1cd1b9cbd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..797d60f7822
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t.cc
new file mode 100644
index 00000000000..874e27d70a1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-double-constexpr.cc
new file mode 100644
index 00000000000..3f3e06b4e17
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-double-fixed_size.cc
new file mode 100644
index 00000000000..c8690e22f1b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-double.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-double.cc
new file mode 100644
index 00000000000..bf4accfc59b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-float-constexpr.cc
new file mode 100644
index 00000000000..5b97d8a5a67
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-float-fixed_size.cc
new file mode 100644
index 00000000000..6c88f4e5289
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-float.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-float.cc
new file mode 100644
index 00000000000..e4b0eec5742
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-int-constexpr.cc
new file mode 100644
index 00000000000..2e40f05da5a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-int-fixed_size.cc
new file mode 100644
index 00000000000..801fa146563
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-int.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-int.cc
new file mode 100644
index 00000000000..9e5a74e8d0c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long-constexpr.cc
new file mode 100644
index 00000000000..959e7689c99
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long-fixed_size.cc
new file mode 100644
index 00000000000..1d997953e63
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long.cc
new file mode 100644
index 00000000000..67f04518350
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double-constexpr.cc
new file mode 100644
index 00000000000..284c48f9399
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double-fixed_size.cc
new file mode 100644
index 00000000000..0c2973a88c8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double.cc
new file mode 100644
index 00000000000..649f5dc5d42
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long-constexpr.cc
new file mode 100644
index 00000000000..9f454a9bda5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long-fixed_size.cc
new file mode 100644
index 00000000000..d1295fbd4e4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long.cc
new file mode 100644
index 00000000000..7e8a3f91b23
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-short-constexpr.cc
new file mode 100644
index 00000000000..fcc2d52a097
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-short-fixed_size.cc
new file mode 100644
index 00000000000..92e3fb0bdeb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-short.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-short.cc
new file mode 100644
index 00000000000..e294906388c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char-constexpr.cc
new file mode 100644
index 00000000000..a02e310f606
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..51545f8960d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char.cc
new file mode 100644
index 00000000000..67bb7e9493e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..a71a1cee92e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..8c32dd2a4fc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char.cc
new file mode 100644
index 00000000000..ce4e416091b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..bbafb7a5fba
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..63b6e61e0ef
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int.cc
new file mode 100644
index 00000000000..8704ef8bc48
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..7279d391ec5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..2bbd1e3cd7e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long.cc
new file mode 100644
index 00000000000..2ec36d041cb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..579ee3cb787
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..eb216b1daf2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long.cc
new file mode 100644
index 00000000000..9d0502d3e80
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..ea8f8d9b68d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..fb8650d4ddd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short.cc
new file mode 100644
index 00000000000..e5d45f12a58
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..52b4b70fcda
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..51f485c17e4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t.cc
new file mode 100644
index 00000000000..e5df7bc35ab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char-constexpr.cc
new file mode 100644
index 00000000000..2441ead5416
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char-fixed_size.cc
new file mode 100644
index 00000000000..b6a8850f988
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char.cc
new file mode 100644
index 00000000000..a7bcdbf8c26
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t-constexpr.cc
new file mode 100644
index 00000000000..7030a405433
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..1124d75c645
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t.cc
new file mode 100644
index 00000000000..cdb827041cc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t-constexpr.cc
new file mode 100644
index 00000000000..7d585299fb5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..0a09d2a27d5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t.cc
new file mode 100644
index 00000000000..6d127fed41e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-double-constexpr.cc
new file mode 100644
index 00000000000..38acf53cd86
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-double-fixed_size.cc
new file mode 100644
index 00000000000..5a8480383c6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-double.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-double.cc
new file mode 100644
index 00000000000..8a258106dec
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-float-constexpr.cc
new file mode 100644
index 00000000000..02bd74edb45
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-float-fixed_size.cc
new file mode 100644
index 00000000000..b0326aebeba
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-float.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-float.cc
new file mode 100644
index 00000000000..210d01aeeec
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-int-constexpr.cc
new file mode 100644
index 00000000000..e810f8a379d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-int-fixed_size.cc
new file mode 100644
index 00000000000..2199cae9850
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-int.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-int.cc
new file mode 100644
index 00000000000..d3945085de0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long-constexpr.cc
new file mode 100644
index 00000000000..af48c1de8eb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long-fixed_size.cc
new file mode 100644
index 00000000000..90531aa4c85
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long.cc
new file mode 100644
index 00000000000..cb839c32c75
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double-constexpr.cc
new file mode 100644
index 00000000000..e9fdce9bb0e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double-fixed_size.cc
new file mode 100644
index 00000000000..a237b259b93
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double.cc
new file mode 100644
index 00000000000..537a3fcb4c9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long-constexpr.cc
new file mode 100644
index 00000000000..8c0ca3413a5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long-fixed_size.cc
new file mode 100644
index 00000000000..586e5380a5d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long.cc
new file mode 100644
index 00000000000..6af141ac126
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-short-constexpr.cc
new file mode 100644
index 00000000000..ab2f19dca8c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-short-fixed_size.cc
new file mode 100644
index 00000000000..1d71a7328f9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-short.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-short.cc
new file mode 100644
index 00000000000..2f3a937715c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char-constexpr.cc
new file mode 100644
index 00000000000..a7a65fc4869
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..58f30eac548
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char.cc
new file mode 100644
index 00000000000..40e707496c7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..23ffa001a7d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..1dbc5313a22
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char.cc
new file mode 100644
index 00000000000..b266c3d0900
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..114c4d28258
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..063559e6a9b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int.cc
new file mode 100644
index 00000000000..234eb95a65c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..86b2fce8258
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..148134d7baf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long.cc
new file mode 100644
index 00000000000..a30a6e76162
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..4f10a2ad029
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..e04cb2b23b3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long.cc
new file mode 100644
index 00000000000..73824beead2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..c87091d3151
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..0f886eacf2e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short.cc
new file mode 100644
index 00000000000..69d786bdca5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..94604182e00
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..45aaf4689fd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t.cc
new file mode 100644
index 00000000000..319243c448d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char-constexpr.cc
new file mode 100644
index 00000000000..378e2fb2df7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char-fixed_size.cc
new file mode 100644
index 00000000000..a098d2e3f4e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char.cc
new file mode 100644
index 00000000000..f64a0024052
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char16_t-constexpr.cc
new file mode 100644
index 00000000000..1722271157b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..bd8faab2002
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char16_t.cc
new file mode 100644
index 00000000000..bd1268c3612
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char32_t-constexpr.cc
new file mode 100644
index 00000000000..938dfbb75d8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..85951c4320f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char32_t.cc
new file mode 100644
index 00000000000..8ef1e7ce7e5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-double-constexpr.cc
new file mode 100644
index 00000000000..0cf0b9e5c79
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-double-fixed_size.cc
new file mode 100644
index 00000000000..8b7f0c9aed0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-double.cc b/libstdc++-v3/testsuite/experimental/simd/casts-double.cc
new file mode 100644
index 00000000000..41be646beeb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-float-constexpr.cc
new file mode 100644
index 00000000000..f7a4eff264e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-float-fixed_size.cc
new file mode 100644
index 00000000000..b854f481ab6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-float.cc b/libstdc++-v3/testsuite/experimental/simd/casts-float.cc
new file mode 100644
index 00000000000..f766426a834
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-int-constexpr.cc
new file mode 100644
index 00000000000..7851a637593
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-int-fixed_size.cc
new file mode 100644
index 00000000000..2cdfc6e91b1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-int.cc b/libstdc++-v3/testsuite/experimental/simd/casts-int.cc
new file mode 100644
index 00000000000..97d288508b8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long-constexpr.cc
new file mode 100644
index 00000000000..0cd85e095d9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long-fixed_size.cc
new file mode 100644
index 00000000000..9d43269a0f2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long.cc
new file mode 100644
index 00000000000..88cb7299fb7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long_double-constexpr.cc
new file mode 100644
index 00000000000..d2b2ed1fa29
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long_double-fixed_size.cc
new file mode 100644
index 00000000000..1705e0b6fbc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long_double.cc
new file mode 100644
index 00000000000..3ca613c7eab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long_long-constexpr.cc
new file mode 100644
index 00000000000..cc135964ab0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long_long-fixed_size.cc
new file mode 100644
index 00000000000..cdc5a31cee6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long_long.cc
new file mode 100644
index 00000000000..3a671406607
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-short-constexpr.cc
new file mode 100644
index 00000000000..13b56956836
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-short-fixed_size.cc
new file mode 100644
index 00000000000..fe52ac7db75
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-short.cc b/libstdc++-v3/testsuite/experimental/simd/casts-short.cc
new file mode 100644
index 00000000000..6e5dbe83784
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-signed_char-constexpr.cc
new file mode 100644
index 00000000000..a1598ca85bc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..27b4d595db1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/casts-signed_char.cc
new file mode 100644
index 00000000000..2c5a99941ab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..661dce00177
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..0bafc8a31fb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char.cc
new file mode 100644
index 00000000000..611642504cd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..5289792e403
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..00f1eb0d41b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int.cc
new file mode 100644
index 00000000000..f7b202c27ae
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..fabb1d94dd4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..36070941276
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long.cc
new file mode 100644
index 00000000000..cf44cbc2c11
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..fee0dc4b452
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..bc0d32f9fd8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long.cc
new file mode 100644
index 00000000000..5779b69f410
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..86462e6ebb5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..fc7c4a81b15
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short.cc
new file mode 100644
index 00000000000..fbef0ec25f0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..ea61dc67a2c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..a2f7f8eb820
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t.cc
new file mode 100644
index 00000000000..492bd52db7c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-double-constexpr.cc
new file mode 100644
index 00000000000..0c5b3955c4c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/fpclassify.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-double-fixed_size.cc
new file mode 100644
index 00000000000..69fc7e9e28f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/fpclassify.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-double.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-double.cc
new file mode 100644
index 00000000000..25693b2082c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/fpclassify.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-float-constexpr.cc
new file mode 100644
index 00000000000..cb2ce60a75c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/fpclassify.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-float-fixed_size.cc
new file mode 100644
index 00000000000..80ca4a6043f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/fpclassify.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-float.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-float.cc
new file mode 100644
index 00000000000..886a4d0c83b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/fpclassify.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double-constexpr.cc
new file mode 100644
index 00000000000..3d3578f974e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/fpclassify.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double-fixed_size.cc
new file mode 100644
index 00000000000..6848e5db58c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/fpclassify.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double.cc
new file mode 100644
index 00000000000..66eb5aaa7be
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/fpclassify.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-double-constexpr.cc
new file mode 100644
index 00000000000..e86618b52fb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/frexp.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-double-fixed_size.cc
new file mode 100644
index 00000000000..0cf0d6f6de6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/frexp.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-double.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-double.cc
new file mode 100644
index 00000000000..ebfc0f1b738
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/frexp.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-float-constexpr.cc
new file mode 100644
index 00000000000..7c5a9838a41
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/frexp.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-float-fixed_size.cc
new file mode 100644
index 00000000000..b2d82af9f16
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/frexp.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-float.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-float.cc
new file mode 100644
index 00000000000..584ea43afe3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/frexp.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-long_double-constexpr.cc
new file mode 100644
index 00000000000..00e2a6c9dbc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/frexp.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-long_double-fixed_size.cc
new file mode 100644
index 00000000000..abdcd9d5126
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/frexp.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-long_double.cc
new file mode 100644
index 00000000000..00a3b61b33d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/frexp.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generate_testcases.sh b/libstdc++-v3/testsuite/experimental/simd/generate_testcases.sh
new file mode 100755
index 00000000000..7acf17c7eed
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generate_testcases.sh
@@ -0,0 +1,81 @@
+#!/bin/bash
+
+floattypes=(
+"long double"
+"double"
+"float"
+)
+alltypes=(
+"${floattypes[@]}"
+"long long"
+"unsigned long long"
+"unsigned long"
+"long"
+"int"
+"unsigned int"
+"short"
+"unsigned short"
+"char"
+"signed char"
+"unsigned char"
+"char32_t"
+"char16_t"
+"wchar_t"
+)
+
+cd "$(dirname "$0")" || exit 1
+for testcase in tests/*.h; do
+  if grep -q "test only floattypes" "$testcase"; then
+    typelist=("${floattypes[@]}")
+  else
+    typelist=("${alltypes[@]}")
+  fi
+  testcase=${testcase%.h}
+  testcase=${testcase##*/}
+  for type in "${typelist[@]}"; do
+    if [[ $testcase == sincos ]]; then
+      # The sincos test requires reference data to run
+      extra='// { dg-do compile }'
+    else
+      extra=''
+    fi
+    filename="${testcase}-${type// /_}"
+
+    cat > "${filename}.cc" <<EOF
+// { dg-options "-std=c++17" }
+${extra}
+#include "tests/${testcase}.h"
+
+int main()
+{
+  iterate_abis<${type}>();
+  return 0;
+}
+EOF
+    cat > "${filename}-constexpr.cc" <<EOF
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+${extra}
+#include "tests/${testcase}.h"
+
+int main()
+{
+  iterate_abis<${type}>();
+  return 0;
+}
+EOF
+    cat > "${filename}-fixed_size.cc" <<EOF
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+${extra}
+#define TESTFIXEDSIZE 1
+#include "tests/${testcase}.h"
+
+int main()
+{
+  iterate_abis<${type}>();
+  return 0;
+}
+EOF
+  done
+done
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char-constexpr.cc
new file mode 100644
index 00000000000..dceb9ab29b3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char-fixed_size.cc
new file mode 100644
index 00000000000..f20cc441d74
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char.cc
new file mode 100644
index 00000000000..790e4e3636a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char16_t-constexpr.cc
new file mode 100644
index 00000000000..59ea27c0802
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..19fa325ed51
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char16_t.cc
new file mode 100644
index 00000000000..897ee1c7a88
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char32_t-constexpr.cc
new file mode 100644
index 00000000000..4db121300fb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..62b5cd6c29f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char32_t.cc
new file mode 100644
index 00000000000..2b04c8bda75
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-double-constexpr.cc
new file mode 100644
index 00000000000..de491f79875
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-double-fixed_size.cc
new file mode 100644
index 00000000000..e7af2ed7082
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-double.cc b/libstdc++-v3/testsuite/experimental/simd/generator-double.cc
new file mode 100644
index 00000000000..09ac4bdc33d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-float-constexpr.cc
new file mode 100644
index 00000000000..edabab7d3e8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-float-fixed_size.cc
new file mode 100644
index 00000000000..75d18751c02
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-float.cc b/libstdc++-v3/testsuite/experimental/simd/generator-float.cc
new file mode 100644
index 00000000000..40f44fae4d7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-int-constexpr.cc
new file mode 100644
index 00000000000..643a071d7c2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-int-fixed_size.cc
new file mode 100644
index 00000000000..acd38d02921
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-int.cc b/libstdc++-v3/testsuite/experimental/simd/generator-int.cc
new file mode 100644
index 00000000000..2166ba8d480
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long-constexpr.cc
new file mode 100644
index 00000000000..25b994c26a0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long-fixed_size.cc
new file mode 100644
index 00000000000..a2d5ecfce3c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long.cc
new file mode 100644
index 00000000000..9529bcc37ab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long_double-constexpr.cc
new file mode 100644
index 00000000000..f96beaa690a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long_double-fixed_size.cc
new file mode 100644
index 00000000000..e60f903b48e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long_double.cc
new file mode 100644
index 00000000000..dbb5cac8e6b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long_long-constexpr.cc
new file mode 100644
index 00000000000..e6b9f93fea7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long_long-fixed_size.cc
new file mode 100644
index 00000000000..cb23b21fcc4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long_long.cc
new file mode 100644
index 00000000000..b1d1de2a2f1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-short-constexpr.cc
new file mode 100644
index 00000000000..84d3314be24
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-short-fixed_size.cc
new file mode 100644
index 00000000000..44a6764f7e3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-short.cc b/libstdc++-v3/testsuite/experimental/simd/generator-short.cc
new file mode 100644
index 00000000000..5343657320f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-signed_char-constexpr.cc
new file mode 100644
index 00000000000..fd35555d54a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..bdca8349c33
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/generator-signed_char.cc
new file mode 100644
index 00000000000..0c1f5bb6118
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..6802c31a3f8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..d990de8de5b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char.cc
new file mode 100644
index 00000000000..2c4a0c57404
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..daba85f07ef
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..6bdbebcdd24
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int.cc
new file mode 100644
index 00000000000..fed7b58d6ab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..da209e2b894
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..ab20c3f87ac
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long.cc
new file mode 100644
index 00000000000..66b330f2d5f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..047ff571237
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..6c96a68f2b3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long.cc
new file mode 100644
index 00000000000..609e23f5df3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..b24d0d9a60a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..456ece81cdc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short.cc
new file mode 100644
index 00000000000..cc7f8c3d287
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..5cf9521b7c3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..4f77cfe7a91
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t.cc
new file mode 100644
index 00000000000..6c775fdd0e9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double-constexpr.cc
new file mode 100644
index 00000000000..bd6936cb40f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double-fixed_size.cc
new file mode 100644
index 00000000000..eba5aa120ae
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double.cc
new file mode 100644
index 00000000000..442cec265eb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float-constexpr.cc
new file mode 100644
index 00000000000..43fab5b1b8e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float-fixed_size.cc
new file mode 100644
index 00000000000..e933dc8aea4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float.cc
new file mode 100644
index 00000000000..24132704a26
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double-constexpr.cc
new file mode 100644
index 00000000000..658c8a2fb6d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double-fixed_size.cc
new file mode 100644
index 00000000000..afed35e475f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double.cc
new file mode 100644
index 00000000000..78cd653f795
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char-constexpr.cc
new file mode 100644
index 00000000000..c3c0bd70f9d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char-fixed_size.cc
new file mode 100644
index 00000000000..c934dac6e65
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char.cc
new file mode 100644
index 00000000000..02c2324be0e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t-constexpr.cc
new file mode 100644
index 00000000000..16cd5b3e477
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..56914e866f7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t.cc
new file mode 100644
index 00000000000..708c36f3dd0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t-constexpr.cc
new file mode 100644
index 00000000000..fbabea41c66
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..50676c64bdf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t.cc
new file mode 100644
index 00000000000..64d23ab4b8d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-double-constexpr.cc
new file mode 100644
index 00000000000..c80490ddfa9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-double-fixed_size.cc
new file mode 100644
index 00000000000..65717f6a449
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-double.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-double.cc
new file mode 100644
index 00000000000..9caf5aadf4f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-float-constexpr.cc
new file mode 100644
index 00000000000..4d57562fef6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-float-fixed_size.cc
new file mode 100644
index 00000000000..3b7dd998e24
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-float.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-float.cc
new file mode 100644
index 00000000000..9a5219fd89e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-int-constexpr.cc
new file mode 100644
index 00000000000..d829d8bc842
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-int-fixed_size.cc
new file mode 100644
index 00000000000..72e1647d920
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-int.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-int.cc
new file mode 100644
index 00000000000..61b1970c831
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long-constexpr.cc
new file mode 100644
index 00000000000..fb74cbec9a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long-fixed_size.cc
new file mode 100644
index 00000000000..6f1892f30c5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long.cc
new file mode 100644
index 00000000000..d2ae50b8800
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double-constexpr.cc
new file mode 100644
index 00000000000..42884f0f483
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double-fixed_size.cc
new file mode 100644
index 00000000000..a617c0a0a8c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double.cc
new file mode 100644
index 00000000000..67ed81b7001
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long-constexpr.cc
new file mode 100644
index 00000000000..521a4ef7a0e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long-fixed_size.cc
new file mode 100644
index 00000000000..232f0942d2c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long.cc
new file mode 100644
index 00000000000..4768c4fadba
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-short-constexpr.cc
new file mode 100644
index 00000000000..aab5b19a523
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-short-fixed_size.cc
new file mode 100644
index 00000000000..ec7ed1c31c0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-short.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-short.cc
new file mode 100644
index 00000000000..ba0d08ef1f4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char-constexpr.cc
new file mode 100644
index 00000000000..4cd49cc02f8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..9f2da6b998d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char.cc
new file mode 100644
index 00000000000..76491b7ae17
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..33781182de0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..3a896e64381
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char.cc
new file mode 100644
index 00000000000..d2d8a877ea9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..40720b069a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..64d9ea41c8d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int.cc
new file mode 100644
index 00000000000..b0e397ddb73
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..e78204203ba
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..e39d053246b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long.cc
new file mode 100644
index 00000000000..1776a81b8da
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..dc83a1403a9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..620cc2f9f71
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long.cc
new file mode 100644
index 00000000000..ab18b1fadb9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..70c79f359d7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..74c6cad8d64
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short.cc
new file mode 100644
index 00000000000..a1cc484382a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..abaf5fe0184
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..a1457acb5d7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t.cc
new file mode 100644
index 00000000000..cd8fe35af87
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double-constexpr.cc
new file mode 100644
index 00000000000..f30cc134fc7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double-fixed_size.cc
new file mode 100644
index 00000000000..026689bb872
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double.cc
new file mode 100644
index 00000000000..04e5c8dcf16
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float-constexpr.cc
new file mode 100644
index 00000000000..be858a25c76
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float-fixed_size.cc
new file mode 100644
index 00000000000..5eb7970cd25
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float.cc
new file mode 100644
index 00000000000..5d4b0905dee
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double-constexpr.cc
new file mode 100644
index 00000000000..ad6ab7a5f32
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double-fixed_size.cc
new file mode 100644
index 00000000000..dae783e3054
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double.cc
new file mode 100644
index 00000000000..292a093e014
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char-constexpr.cc
new file mode 100644
index 00000000000..8f8c86fc723
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char-fixed_size.cc
new file mode 100644
index 00000000000..558bcf5a1ad
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char.cc
new file mode 100644
index 00000000000..d54a5484b38
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t-constexpr.cc
new file mode 100644
index 00000000000..89734584aaf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..09afc92c291
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t.cc
new file mode 100644
index 00000000000..13e490d5298
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t-constexpr.cc
new file mode 100644
index 00000000000..d4bb463c8b1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..8f6c0dd2633
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t.cc
new file mode 100644
index 00000000000..8287f749609
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-double-constexpr.cc
new file mode 100644
index 00000000000..2d2cc758dd2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-double-fixed_size.cc
new file mode 100644
index 00000000000..48a884f810b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-double.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-double.cc
new file mode 100644
index 00000000000..ff23a245932
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-float-constexpr.cc
new file mode 100644
index 00000000000..6e9aaaf804f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-float-fixed_size.cc
new file mode 100644
index 00000000000..3f78361528e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-float.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-float.cc
new file mode 100644
index 00000000000..ad07cd19450
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-int-constexpr.cc
new file mode 100644
index 00000000000..7f02b0491d6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-int-fixed_size.cc
new file mode 100644
index 00000000000..b64a1dacdbe
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-int.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-int.cc
new file mode 100644
index 00000000000..c18e83e1a6a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long-constexpr.cc
new file mode 100644
index 00000000000..8e1614d6f9d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long-fixed_size.cc
new file mode 100644
index 00000000000..c5b44d2f4f6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long.cc
new file mode 100644
index 00000000000..bfc96cec5aa
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double-constexpr.cc
new file mode 100644
index 00000000000..b56af2f577d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double-fixed_size.cc
new file mode 100644
index 00000000000..312bf635926
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double.cc
new file mode 100644
index 00000000000..21a40601f0e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long-constexpr.cc
new file mode 100644
index 00000000000..0c894b52df5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long-fixed_size.cc
new file mode 100644
index 00000000000..3dd727183f7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long.cc
new file mode 100644
index 00000000000..5ce328f75ed
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-short-constexpr.cc
new file mode 100644
index 00000000000..d54bb34bbf5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-short-fixed_size.cc
new file mode 100644
index 00000000000..4bbc320dd39
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-short.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-short.cc
new file mode 100644
index 00000000000..7e478bd4089
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char-constexpr.cc
new file mode 100644
index 00000000000..c964496e454
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..bb40925621d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char.cc
new file mode 100644
index 00000000000..5a58e97a07f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..0b84e78b2d9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..38d08864098
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char.cc
new file mode 100644
index 00000000000..5c3e91efa2f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..a1f7bde48a9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..fcfb4f6fe78
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int.cc
new file mode 100644
index 00000000000..8326899f1f4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..c56a5c92e00
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..d13dd683603
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long.cc
new file mode 100644
index 00000000000..9415472ea27
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..3cf44d29ebb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..ce108525c2c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long.cc
new file mode 100644
index 00000000000..3d811ad94ae
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..800689b49f4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..503f674e3f3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short.cc
new file mode 100644
index 00000000000..8a33738354b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..521839044eb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..4b7188655b6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t.cc
new file mode 100644
index 00000000000..ebfeef00910
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-double-constexpr.cc
new file mode 100644
index 00000000000..fc6c1d68f24
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/logarithm.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-double-fixed_size.cc
new file mode 100644
index 00000000000..fcc3e0c688a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/logarithm.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-double.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-double.cc
new file mode 100644
index 00000000000..5806393f619
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/logarithm.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-float-constexpr.cc
new file mode 100644
index 00000000000..5429cd72deb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/logarithm.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-float-fixed_size.cc
new file mode 100644
index 00000000000..dd8ae7a43d0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/logarithm.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-float.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-float.cc
new file mode 100644
index 00000000000..abb45db813a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/logarithm.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double-constexpr.cc
new file mode 100644
index 00000000000..f59216fe260
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/logarithm.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double-fixed_size.cc
new file mode 100644
index 00000000000..143d020ea39
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/logarithm.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double.cc
new file mode 100644
index 00000000000..00fc1a42d7b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/logarithm.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char-constexpr.cc
new file mode 100644
index 00000000000..f11e3d34e64
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char-fixed_size.cc
new file mode 100644
index 00000000000..6c50d321c12
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char.cc
new file mode 100644
index 00000000000..67da72a4f8e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t-constexpr.cc
new file mode 100644
index 00000000000..edaa5da5819
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..967f63be80c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t.cc
new file mode 100644
index 00000000000..fd96e76b041
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t-constexpr.cc
new file mode 100644
index 00000000000..cbc4bbe77ce
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..c7c5754a279
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t.cc
new file mode 100644
index 00000000000..7fb0082e9b7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double-constexpr.cc
new file mode 100644
index 00000000000..d1972a66933
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double-fixed_size.cc
new file mode 100644
index 00000000000..57ec0817e81
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double.cc
new file mode 100644
index 00000000000..fb14dc2a93e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float-constexpr.cc
new file mode 100644
index 00000000000..e2e57cfdd36
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float-fixed_size.cc
new file mode 100644
index 00000000000..d1fb0844d32
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float.cc
new file mode 100644
index 00000000000..815a421b6e3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int-constexpr.cc
new file mode 100644
index 00000000000..1f49e480fed
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int-fixed_size.cc
new file mode 100644
index 00000000000..ed73ab27b39
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int.cc
new file mode 100644
index 00000000000..4bbc4a8357b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long-constexpr.cc
new file mode 100644
index 00000000000..c8993db6266
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long-fixed_size.cc
new file mode 100644
index 00000000000..8f7237f7565
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long.cc
new file mode 100644
index 00000000000..dad171eb5e5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double-constexpr.cc
new file mode 100644
index 00000000000..c6976640608
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double-fixed_size.cc
new file mode 100644
index 00000000000..7e3b49eeaf3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double.cc
new file mode 100644
index 00000000000..87083517140
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long-constexpr.cc
new file mode 100644
index 00000000000..9786519ecdd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long-fixed_size.cc
new file mode 100644
index 00000000000..69f68155b2c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long.cc
new file mode 100644
index 00000000000..42a3e5fdd2a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short-constexpr.cc
new file mode 100644
index 00000000000..b3c457e7207
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short-fixed_size.cc
new file mode 100644
index 00000000000..75410a7e1bd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short.cc
new file mode 100644
index 00000000000..4dedc6f8394
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char-constexpr.cc
new file mode 100644
index 00000000000..66cdd5458fc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..9f6a5e66da6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char.cc
new file mode 100644
index 00000000000..231236be4dd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..297ee8aa460
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..8cb3566e533
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char.cc
new file mode 100644
index 00000000000..85fbd99081a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..8da8192a063
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..68fba792469
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int.cc
new file mode 100644
index 00000000000..de905c0fb5f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..9d3b1bc299d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..c1f89bc4831
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long.cc
new file mode 100644
index 00000000000..824f254f89a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..403deff73a0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..94e079c9032
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long.cc
new file mode 100644
index 00000000000..a0e8b14c11e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..f2187c85f5f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..e7c6695034f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short.cc
new file mode 100644
index 00000000000..97fb4819951
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..fde03ab1777
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..3076fe9967a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t.cc
new file mode 100644
index 00000000000..33cfb2796df
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char-constexpr.cc
new file mode 100644
index 00000000000..565a723bd10
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char-fixed_size.cc
new file mode 100644
index 00000000000..8b18da853f6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char.cc
new file mode 100644
index 00000000000..506e9daf930
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t-constexpr.cc
new file mode 100644
index 00000000000..e08ff6d8759
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..32b6c88b409
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t.cc
new file mode 100644
index 00000000000..1879792e37c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t-constexpr.cc
new file mode 100644
index 00000000000..63cb5c1efae
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..1ab9b0047d5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t.cc
new file mode 100644
index 00000000000..4bc48f38d1c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double-constexpr.cc
new file mode 100644
index 00000000000..a4b515ae6e5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double-fixed_size.cc
new file mode 100644
index 00000000000..b16b3618d79
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double.cc
new file mode 100644
index 00000000000..ad5f7d97d00
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float-constexpr.cc
new file mode 100644
index 00000000000..372eba52a7a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float-fixed_size.cc
new file mode 100644
index 00000000000..316bc781f94
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float.cc
new file mode 100644
index 00000000000..054b7de4cc1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int-constexpr.cc
new file mode 100644
index 00000000000..39dc1f063bd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int-fixed_size.cc
new file mode 100644
index 00000000000..ec4dbfddde4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int.cc
new file mode 100644
index 00000000000..337d73f2222
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long-constexpr.cc
new file mode 100644
index 00000000000..bdff42ea69c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long-fixed_size.cc
new file mode 100644
index 00000000000..cfd18904811
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long.cc
new file mode 100644
index 00000000000..29380dee2d4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double-constexpr.cc
new file mode 100644
index 00000000000..6438d2e128d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double-fixed_size.cc
new file mode 100644
index 00000000000..83538a716dd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double.cc
new file mode 100644
index 00000000000..8a5385eb018
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long-constexpr.cc
new file mode 100644
index 00000000000..cfe1a5863e9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long-fixed_size.cc
new file mode 100644
index 00000000000..c0d6f5d6e77
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long.cc
new file mode 100644
index 00000000000..a1a6ab41b10
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short-constexpr.cc
new file mode 100644
index 00000000000..16f59d2c10c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short-fixed_size.cc
new file mode 100644
index 00000000000..bc4c4c02d86
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short.cc
new file mode 100644
index 00000000000..eb5a05907dd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char-constexpr.cc
new file mode 100644
index 00000000000..52e468ad65d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..3dcf542c3e1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char.cc
new file mode 100644
index 00000000000..07f26470e31
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..c49e1400916
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..e3b9896aebf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char.cc
new file mode 100644
index 00000000000..93e302cfff7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..a811aebfe48
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..d80e9ec27a1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int.cc
new file mode 100644
index 00000000000..b54ead172a4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..64435a00af8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..e180ab25175
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long.cc
new file mode 100644
index 00000000000..913a8e39d93
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..6c0d786355f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..ea85ae5b532
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long.cc
new file mode 100644
index 00000000000..e2d96ad0775
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..ebf6efe8192
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..864241bf103
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short.cc
new file mode 100644
index 00000000000..b5f77babd16
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..d285f886712
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..c94ed844589
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t.cc
new file mode 100644
index 00000000000..58f6454f5eb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char-constexpr.cc
new file mode 100644
index 00000000000..c765ccee423
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char-fixed_size.cc
new file mode 100644
index 00000000000..19e870c48ae
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char.cc
new file mode 100644
index 00000000000..70b4150fdf1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t-constexpr.cc
new file mode 100644
index 00000000000..d9da969ca73
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..57bb37af5f6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t.cc
new file mode 100644
index 00000000000..ad5159bbabd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t-constexpr.cc
new file mode 100644
index 00000000000..c9ee39d0114
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..a30a9d4395f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t.cc
new file mode 100644
index 00000000000..42757de6ea8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double-constexpr.cc
new file mode 100644
index 00000000000..321441421ac
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double-fixed_size.cc
new file mode 100644
index 00000000000..bc5bfd9e141
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double.cc
new file mode 100644
index 00000000000..f1c85076865
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float-constexpr.cc
new file mode 100644
index 00000000000..0e547c1b56d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float-fixed_size.cc
new file mode 100644
index 00000000000..1465aa38b9a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float.cc
new file mode 100644
index 00000000000..7ab3b192531
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int-constexpr.cc
new file mode 100644
index 00000000000..54f158bb721
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int-fixed_size.cc
new file mode 100644
index 00000000000..174b7f9b1a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int.cc
new file mode 100644
index 00000000000..d0c2d723c8a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long-constexpr.cc
new file mode 100644
index 00000000000..a74018f4e32
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long-fixed_size.cc
new file mode 100644
index 00000000000..4353cdcca72
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long.cc
new file mode 100644
index 00000000000..ab3c4247929
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double-constexpr.cc
new file mode 100644
index 00000000000..2459c58a9d2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double-fixed_size.cc
new file mode 100644
index 00000000000..c29a31539a1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double.cc
new file mode 100644
index 00000000000..c0f1954e7b1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long-constexpr.cc
new file mode 100644
index 00000000000..033c316c5e7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long-fixed_size.cc
new file mode 100644
index 00000000000..12adce10f3b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long.cc
new file mode 100644
index 00000000000..508309fca1c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short-constexpr.cc
new file mode 100644
index 00000000000..91cdf1bfa2e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short-fixed_size.cc
new file mode 100644
index 00000000000..c520be67867
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short.cc
new file mode 100644
index 00000000000..35f230c112d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char-constexpr.cc
new file mode 100644
index 00000000000..94a5c86f13b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..2408ee12bc2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char.cc
new file mode 100644
index 00000000000..1de188a59bb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..6a502930aa7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..d24f52132c0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char.cc
new file mode 100644
index 00000000000..a6c215f9be6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..07a59faad93
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..827f8f20d3d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int.cc
new file mode 100644
index 00000000000..f55e5b31510
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..917328fff97
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..bc4e2c1ee1a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long.cc
new file mode 100644
index 00000000000..53c5a43538c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..9bb7d41b20f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..c837776083e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long.cc
new file mode 100644
index 00000000000..4b224cf2255
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..04d35b01f47
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..e2f0fd409b7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short.cc
new file mode 100644
index 00000000000..deda07a1170
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..e2dc688f3b3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..ceb192b6f0a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t.cc
new file mode 100644
index 00000000000..6c709aba793
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char-constexpr.cc
new file mode 100644
index 00000000000..a432ad4e2dc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char-fixed_size.cc
new file mode 100644
index 00000000000..ca36d61a0c8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char.cc
new file mode 100644
index 00000000000..25aec9fd2ba
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t-constexpr.cc
new file mode 100644
index 00000000000..9207f203e4a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..0d2d5bbf8cf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t.cc
new file mode 100644
index 00000000000..aa9e5cbe614
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t-constexpr.cc
new file mode 100644
index 00000000000..2f8767ac51f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..5a191a1d1f6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t.cc
new file mode 100644
index 00000000000..eb2e790b34f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double-constexpr.cc
new file mode 100644
index 00000000000..1f655c3b5b2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double-fixed_size.cc
new file mode 100644
index 00000000000..24f9a5c2cca
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double.cc
new file mode 100644
index 00000000000..86940d1bc79
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float-constexpr.cc
new file mode 100644
index 00000000000..b9cd9c2c2e8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float-fixed_size.cc
new file mode 100644
index 00000000000..e07bd30af26
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float.cc
new file mode 100644
index 00000000000..bf811a459f9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int-constexpr.cc
new file mode 100644
index 00000000000..006e84eef6e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int-fixed_size.cc
new file mode 100644
index 00000000000..5bc6f3940ce
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int.cc
new file mode 100644
index 00000000000..69f0e1f8ec4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long-constexpr.cc
new file mode 100644
index 00000000000..473eba397bf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long-fixed_size.cc
new file mode 100644
index 00000000000..53cfce6231f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long.cc
new file mode 100644
index 00000000000..493b74dba1b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double-constexpr.cc
new file mode 100644
index 00000000000..40c32ce983b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double-fixed_size.cc
new file mode 100644
index 00000000000..41edd207ef9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double.cc
new file mode 100644
index 00000000000..4315608c710
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long-constexpr.cc
new file mode 100644
index 00000000000..626b31c29ba
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long-fixed_size.cc
new file mode 100644
index 00000000000..b69d9998c3b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long.cc
new file mode 100644
index 00000000000..2d5f7a7bdab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short-constexpr.cc
new file mode 100644
index 00000000000..5d5b00b1af9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short-fixed_size.cc
new file mode 100644
index 00000000000..d8fa0da4832
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short.cc
new file mode 100644
index 00000000000..32223bd3ca5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char-constexpr.cc
new file mode 100644
index 00000000000..b93b884fe3f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..a68917e8ead
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char.cc
new file mode 100644
index 00000000000..103e0536af9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..8c4f129f868
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..bd4f76b00e9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char.cc
new file mode 100644
index 00000000000..0d0795d7628
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..a511307e11e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..87ad711269d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int.cc
new file mode 100644
index 00000000000..750e63914be
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..abb103eb633
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..d5c8db6669d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long.cc
new file mode 100644
index 00000000000..10d2aa2acd8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..3663b729bc5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..e34f45e3dfd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long.cc
new file mode 100644
index 00000000000..bc5419eb290
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..1b40c7d531a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..d43ee3743fe
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short.cc
new file mode 100644
index 00000000000..eb7dd8ae4cf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..7e29b05f001
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..73379fc9336
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t.cc
new file mode 100644
index 00000000000..ad80e3c76e4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char-constexpr.cc
new file mode 100644
index 00000000000..da2323c3890
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char-fixed_size.cc
new file mode 100644
index 00000000000..2b7bf71329a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char.cc
new file mode 100644
index 00000000000..90f59f80b14
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t-constexpr.cc
new file mode 100644
index 00000000000..239e4b74d0c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..919175a1304
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t.cc
new file mode 100644
index 00000000000..67546406ccc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t-constexpr.cc
new file mode 100644
index 00000000000..00930432f3c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..3ccfc8205c4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t.cc
new file mode 100644
index 00000000000..d048692b595
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double-constexpr.cc
new file mode 100644
index 00000000000..1a381b73078
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double-fixed_size.cc
new file mode 100644
index 00000000000..cc9637e8d02
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double.cc
new file mode 100644
index 00000000000..1833537abab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float-constexpr.cc
new file mode 100644
index 00000000000..853be149e87
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float-fixed_size.cc
new file mode 100644
index 00000000000..ab935f81066
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float.cc
new file mode 100644
index 00000000000..44a076b3b1b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int-constexpr.cc
new file mode 100644
index 00000000000..6401f73f090
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int-fixed_size.cc
new file mode 100644
index 00000000000..5e31458026e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int.cc
new file mode 100644
index 00000000000..5fd1b352b39
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long-constexpr.cc
new file mode 100644
index 00000000000..5b218f5fb46
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long-fixed_size.cc
new file mode 100644
index 00000000000..52feda74541
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long.cc
new file mode 100644
index 00000000000..540be86d5cb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double-constexpr.cc
new file mode 100644
index 00000000000..44c4b69e914
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double-fixed_size.cc
new file mode 100644
index 00000000000..2bd1c8dc38b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double.cc
new file mode 100644
index 00000000000..d156bdbcfd4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long-constexpr.cc
new file mode 100644
index 00000000000..bf37f7697ef
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long-fixed_size.cc
new file mode 100644
index 00000000000..1249170c3d6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long.cc
new file mode 100644
index 00000000000..364a0758efd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short-constexpr.cc
new file mode 100644
index 00000000000..b4757e68257
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short-fixed_size.cc
new file mode 100644
index 00000000000..9b4dea095c0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short.cc
new file mode 100644
index 00000000000..1e7c6056e88
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char-constexpr.cc
new file mode 100644
index 00000000000..db74da66b44
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..cc2224e0de4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char.cc
new file mode 100644
index 00000000000..6c35351e603
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..f9f0e7d00cc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..fef0cb50bb7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char.cc
new file mode 100644
index 00000000000..4a170d1c072
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..d7e6f995534
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..cf98ed3e2bb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int.cc
new file mode 100644
index 00000000000..be0a0826857
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..0a7473296fd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..36a8047c0f7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long.cc
new file mode 100644
index 00000000000..9b711f0ec2f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..d6a23bf065e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..7a85877582d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long.cc
new file mode 100644
index 00000000000..9ce0d5563da
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..7d38a5afa60
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..83fb8d5c754
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short.cc
new file mode 100644
index 00000000000..174586782fd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..3ded73de66b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..5c3c03bf98f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t.cc
new file mode 100644
index 00000000000..84d81226ebc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char-constexpr.cc
new file mode 100644
index 00000000000..e8a8e548f73
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char-fixed_size.cc
new file mode 100644
index 00000000000..f05128952f7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char.cc
new file mode 100644
index 00000000000..f7122cd211b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t-constexpr.cc
new file mode 100644
index 00000000000..4c58ba4f49e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..1020726c2c2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t.cc
new file mode 100644
index 00000000000..5a3818ec668
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t-constexpr.cc
new file mode 100644
index 00000000000..d369c84ac4c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..9ccf64efc07
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t.cc
new file mode 100644
index 00000000000..9b8fcc1b10a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-double-constexpr.cc
new file mode 100644
index 00000000000..db98fe95d72
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-double-fixed_size.cc
new file mode 100644
index 00000000000..2add295d16c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-double.cc
new file mode 100644
index 00000000000..5f2e35edbb1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-float-constexpr.cc
new file mode 100644
index 00000000000..b1327432bbd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-float-fixed_size.cc
new file mode 100644
index 00000000000..ff4df352830
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-float.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-float.cc
new file mode 100644
index 00000000000..db0db36fc31
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-int-constexpr.cc
new file mode 100644
index 00000000000..dbda11bb3bb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-int-fixed_size.cc
new file mode 100644
index 00000000000..85b5fd792e2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-int.cc
new file mode 100644
index 00000000000..f18d44c3479
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long-constexpr.cc
new file mode 100644
index 00000000000..75bee8fad61
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long-fixed_size.cc
new file mode 100644
index 00000000000..38501b26ae1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long.cc
new file mode 100644
index 00000000000..5702dfe3b17
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double-constexpr.cc
new file mode 100644
index 00000000000..e40e2ffe1b7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double-fixed_size.cc
new file mode 100644
index 00000000000..8883052e3a5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double.cc
new file mode 100644
index 00000000000..95456f74539
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long-constexpr.cc
new file mode 100644
index 00000000000..20c7cb3a19d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long-fixed_size.cc
new file mode 100644
index 00000000000..b2ca775a178
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long.cc
new file mode 100644
index 00000000000..930e52678c0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-short-constexpr.cc
new file mode 100644
index 00000000000..4bed4ff1e04
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-short-fixed_size.cc
new file mode 100644
index 00000000000..6509df3c534
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-short.cc
new file mode 100644
index 00000000000..2e398c863d5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char-constexpr.cc
new file mode 100644
index 00000000000..1a60835a33e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..d96abe5803b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char.cc
new file mode 100644
index 00000000000..bbc24e801a0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..caeb8ccf67e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..6fedbf1c29d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char.cc
new file mode 100644
index 00000000000..26f47443239
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..de08a525490
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..be930358fdb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int.cc
new file mode 100644
index 00000000000..57ff6ae61af
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..52a0357f05b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..acb3ecb39eb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long.cc
new file mode 100644
index 00000000000..fa003a2ec80
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..622070bb537
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..235c5b2d1ce
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long.cc
new file mode 100644
index 00000000000..88f2761ff5d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..3cb868c01d2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..e95ac57ece6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short.cc
new file mode 100644
index 00000000000..9b6fe582036
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..abbd4528f8f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..d30d9d8883a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t.cc
new file mode 100644
index 00000000000..a1466c6507e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char-constexpr.cc
new file mode 100644
index 00000000000..a88da8d1de9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char-fixed_size.cc
new file mode 100644
index 00000000000..5dded0b55aa
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char.cc
new file mode 100644
index 00000000000..1a364735198
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t-constexpr.cc
new file mode 100644
index 00000000000..aebd547bc3a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..6afbf82c5dd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t.cc
new file mode 100644
index 00000000000..cbc8f919d46
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t-constexpr.cc
new file mode 100644
index 00000000000..c867a8de9fe
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..0084d8d7078
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t.cc
new file mode 100644
index 00000000000..62797f670d3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double-constexpr.cc
new file mode 100644
index 00000000000..f40e6b04ab4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double-fixed_size.cc
new file mode 100644
index 00000000000..a81bb65f3f1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double.cc
new file mode 100644
index 00000000000..3470a9c9159
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float-constexpr.cc
new file mode 100644
index 00000000000..409c254442b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float-fixed_size.cc
new file mode 100644
index 00000000000..e335ec76c93
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float.cc
new file mode 100644
index 00000000000..2db4dea0b3d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int-constexpr.cc
new file mode 100644
index 00000000000..0d447176e25
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int-fixed_size.cc
new file mode 100644
index 00000000000..239a7a6692d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int.cc
new file mode 100644
index 00000000000..9d82f1d1172
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long-constexpr.cc
new file mode 100644
index 00000000000..3b360555852
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long-fixed_size.cc
new file mode 100644
index 00000000000..fa00db7f4ee
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long.cc
new file mode 100644
index 00000000000..f809a67ac00
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double-constexpr.cc
new file mode 100644
index 00000000000..6792557a8a4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double-fixed_size.cc
new file mode 100644
index 00000000000..b140a33cc8e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double.cc
new file mode 100644
index 00000000000..2d00ae3c934
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long-constexpr.cc
new file mode 100644
index 00000000000..2d879f88b97
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long-fixed_size.cc
new file mode 100644
index 00000000000..4c5b4039503
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long.cc
new file mode 100644
index 00000000000..d3f63cecf00
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short-constexpr.cc
new file mode 100644
index 00000000000..c42ac91aa6c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short-fixed_size.cc
new file mode 100644
index 00000000000..3ce7bf0a493
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short.cc
new file mode 100644
index 00000000000..f700fa0a398
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char-constexpr.cc
new file mode 100644
index 00000000000..988f5e723ca
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..bbe37fade46
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char.cc
new file mode 100644
index 00000000000..d6aa66ac88b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..330634647c8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..12478a3f4e9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char.cc
new file mode 100644
index 00000000000..8c41bcecea6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..b53eb7c64e8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..564bf132849
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int.cc
new file mode 100644
index 00000000000..17b8714eceb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..5970d009e21
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..d62ffbea7d3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long.cc
new file mode 100644
index 00000000000..1f56d2d4968
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..a0ce1d47684
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..bcbbdaf78d5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long.cc
new file mode 100644
index 00000000000..96f48b836a8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..9ed5311ab37
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..e55707d5077
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short.cc
new file mode 100644
index 00000000000..5cd0ea38f44
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..cd3a2bf5273
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..8139c345a81
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t.cc
new file mode 100644
index 00000000000..e5184dd5a50
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-double-constexpr.cc
new file mode 100644
index 00000000000..89c7b9d5db7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/math_1arg.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-double-fixed_size.cc
new file mode 100644
index 00000000000..540e66bf038
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/math_1arg.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-double.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-double.cc
new file mode 100644
index 00000000000..c92cd794d76
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/math_1arg.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-float-constexpr.cc
new file mode 100644
index 00000000000..4b3a8f89b92
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/math_1arg.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-float-fixed_size.cc
new file mode 100644
index 00000000000..0caaeaac8f9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/math_1arg.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-float.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-float.cc
new file mode 100644
index 00000000000..07ee6a1e619
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/math_1arg.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double-constexpr.cc
new file mode 100644
index 00000000000..c5a24f463f3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/math_1arg.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double-fixed_size.cc
new file mode 100644
index 00000000000..bd67831e0a3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/math_1arg.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double.cc
new file mode 100644
index 00000000000..f03c6cc86e6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/math_1arg.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-double-constexpr.cc
new file mode 100644
index 00000000000..56bb3c2c6c6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/math_2arg.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-double-fixed_size.cc
new file mode 100644
index 00000000000..fb742c73c20
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/math_2arg.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-double.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-double.cc
new file mode 100644
index 00000000000..1a03db95e4c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/math_2arg.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-float-constexpr.cc
new file mode 100644
index 00000000000..348355ad4b1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/math_2arg.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-float-fixed_size.cc
new file mode 100644
index 00000000000..0b775643a78
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/math_2arg.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-float.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-float.cc
new file mode 100644
index 00000000000..0325569e8b8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/math_2arg.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double-constexpr.cc
new file mode 100644
index 00000000000..3ebc0e5eef3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/math_2arg.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double-fixed_size.cc
new file mode 100644
index 00000000000..b3970109140
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/math_2arg.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double.cc
new file mode 100644
index 00000000000..dd1660bac18
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/math_2arg.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char-constexpr.cc
new file mode 100644
index 00000000000..525d39b0e05
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char-fixed_size.cc
new file mode 100644
index 00000000000..ca07cd1eb15
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char.cc
new file mode 100644
index 00000000000..18e2d574150
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t-constexpr.cc
new file mode 100644
index 00000000000..bb8c03a5d56
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..cd62bf3a279
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t.cc
new file mode 100644
index 00000000000..8021e3965b7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t-constexpr.cc
new file mode 100644
index 00000000000..ebdb78599d7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..968fe783144
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t.cc
new file mode 100644
index 00000000000..14e565bcf33
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double-constexpr.cc
new file mode 100644
index 00000000000..be62012f2be
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double-fixed_size.cc
new file mode 100644
index 00000000000..f97188fcceb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double.cc
new file mode 100644
index 00000000000..047c01cef90
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float-constexpr.cc
new file mode 100644
index 00000000000..8b8dfc0f097
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float-fixed_size.cc
new file mode 100644
index 00000000000..982e3901c05
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float.cc
new file mode 100644
index 00000000000..c4a8a59b623
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int-constexpr.cc
new file mode 100644
index 00000000000..f9ceb8b56be
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int-fixed_size.cc
new file mode 100644
index 00000000000..f07524c129a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int.cc
new file mode 100644
index 00000000000..ab0be546457
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long-constexpr.cc
new file mode 100644
index 00000000000..6655d90326b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long-fixed_size.cc
new file mode 100644
index 00000000000..959a368cc43
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long.cc
new file mode 100644
index 00000000000..5593ff46d08
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double-constexpr.cc
new file mode 100644
index 00000000000..2073875046e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double-fixed_size.cc
new file mode 100644
index 00000000000..1b7465aaae4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double.cc
new file mode 100644
index 00000000000..e5eddab9dcf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long-constexpr.cc
new file mode 100644
index 00000000000..6ed1deca404
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long-fixed_size.cc
new file mode 100644
index 00000000000..95ec2baa936
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long.cc
new file mode 100644
index 00000000000..4de27d782fa
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short-constexpr.cc
new file mode 100644
index 00000000000..44c42ce9791
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short-fixed_size.cc
new file mode 100644
index 00000000000..dce845b714b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short.cc
new file mode 100644
index 00000000000..402e6fa555c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char-constexpr.cc
new file mode 100644
index 00000000000..bc1f652cfe6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..3270b07460f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char.cc
new file mode 100644
index 00000000000..a4a84c1438b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..10e303fcbb1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..dd5cb5f6b27
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char.cc
new file mode 100644
index 00000000000..12dbfa81102
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..0c951b1e39a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..58a81f027e7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int.cc
new file mode 100644
index 00000000000..6655cdae33f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..b919dc872e5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..d1c2f2edc75
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long.cc
new file mode 100644
index 00000000000..a9825e90832
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..9837223badd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..b466d3f1827
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long.cc
new file mode 100644
index 00000000000..eb1d51acba5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..f228b70237b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..83b33288b42
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short.cc
new file mode 100644
index 00000000000..b29605032db
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..18c168306a5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..187273c581a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t.cc
new file mode 100644
index 00000000000..6525f575d67
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char-constexpr.cc
new file mode 100644
index 00000000000..b279c4d6e2c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char-fixed_size.cc
new file mode 100644
index 00000000000..02495cb3b82
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char.cc
new file mode 100644
index 00000000000..c5044c6b99b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char16_t-constexpr.cc
new file mode 100644
index 00000000000..794872c0833
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..ee87fbfb637
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char16_t.cc
new file mode 100644
index 00000000000..b25efd91e37
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char32_t-constexpr.cc
new file mode 100644
index 00000000000..433dee6758e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..f4b44a44cb3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char32_t.cc
new file mode 100644
index 00000000000..f6479061ad7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-double-constexpr.cc
new file mode 100644
index 00000000000..2d046a21fb4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-double-fixed_size.cc
new file mode 100644
index 00000000000..919969f8bab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-double.cc b/libstdc++-v3/testsuite/experimental/simd/operators-double.cc
new file mode 100644
index 00000000000..e22e22dd8d8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-float-constexpr.cc
new file mode 100644
index 00000000000..dbd318cd74a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-float-fixed_size.cc
new file mode 100644
index 00000000000..2e401bb0263
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-float.cc b/libstdc++-v3/testsuite/experimental/simd/operators-float.cc
new file mode 100644
index 00000000000..074eba229d9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-int-constexpr.cc
new file mode 100644
index 00000000000..abd78d194c8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-int-fixed_size.cc
new file mode 100644
index 00000000000..bf06696dc5e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-int.cc b/libstdc++-v3/testsuite/experimental/simd/operators-int.cc
new file mode 100644
index 00000000000..00eb40405fc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long-constexpr.cc
new file mode 100644
index 00000000000..8746c9c550e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long-fixed_size.cc
new file mode 100644
index 00000000000..f30884ec6a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long.cc
new file mode 100644
index 00000000000..2610dd74481
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long_double-constexpr.cc
new file mode 100644
index 00000000000..efd8f2f307d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long_double-fixed_size.cc
new file mode 100644
index 00000000000..3b184d8f4a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long_double.cc
new file mode 100644
index 00000000000..8d4e204a558
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long_long-constexpr.cc
new file mode 100644
index 00000000000..dbc8d820951
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long_long-fixed_size.cc
new file mode 100644
index 00000000000..23ade5883c9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long_long.cc
new file mode 100644
index 00000000000..b318368c4a5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-short-constexpr.cc
new file mode 100644
index 00000000000..a03fe143069
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-short-fixed_size.cc
new file mode 100644
index 00000000000..b455eebdd89
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-short.cc b/libstdc++-v3/testsuite/experimental/simd/operators-short.cc
new file mode 100644
index 00000000000..a3ad21df595
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-signed_char-constexpr.cc
new file mode 100644
index 00000000000..0442070b9f9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..1974f910812
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/operators-signed_char.cc
new file mode 100644
index 00000000000..637c25a5c1b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..c12b4aba4a5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..da9b4e86b3e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char.cc
new file mode 100644
index 00000000000..25e3a0a7705
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..83d20644082
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..42cf8f63ad2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int.cc
new file mode 100644
index 00000000000..3196bdc84f2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..21b883d0642
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..2b20cf03502
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long.cc
new file mode 100644
index 00000000000..59ab58edf2f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..41131bd5f2d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..b8748fe1bf9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long.cc
new file mode 100644
index 00000000000..09b40f12e08
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..9e4d6445a8e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..47f6572c21d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short.cc
new file mode 100644
index 00000000000..aad63de33c5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..8880fc0e8e0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..b9d3ca35272
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t.cc
new file mode 100644
index 00000000000..c88dcebda30
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char-constexpr.cc
new file mode 100644
index 00000000000..c6c5b258153
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char-fixed_size.cc
new file mode 100644
index 00000000000..1fdb568b0a2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char.cc
new file mode 100644
index 00000000000..66092cdc40b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t-constexpr.cc
new file mode 100644
index 00000000000..2b5b7a89b50
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..9ff0b2c911c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t.cc
new file mode 100644
index 00000000000..277ff5cf799
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t-constexpr.cc
new file mode 100644
index 00000000000..e42d1adb9a4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..dee15db677f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t.cc
new file mode 100644
index 00000000000..8b173d1ff5e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-double-constexpr.cc
new file mode 100644
index 00000000000..6df4d82726c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-double-fixed_size.cc
new file mode 100644
index 00000000000..538936d9ec0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-double.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-double.cc
new file mode 100644
index 00000000000..1d8f787a517
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-float-constexpr.cc
new file mode 100644
index 00000000000..a535b554801
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-float-fixed_size.cc
new file mode 100644
index 00000000000..a9a923b39e4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-float.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-float.cc
new file mode 100644
index 00000000000..983ccd569df
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-int-constexpr.cc
new file mode 100644
index 00000000000..3d9aca0c2cf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-int-fixed_size.cc
new file mode 100644
index 00000000000..d5a60b21f6f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-int.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-int.cc
new file mode 100644
index 00000000000..d067bdc064d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long-constexpr.cc
new file mode 100644
index 00000000000..0c0494443fb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long-fixed_size.cc
new file mode 100644
index 00000000000..bf40cfc4769
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long.cc
new file mode 100644
index 00000000000..c39e9988c3e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long_double-constexpr.cc
new file mode 100644
index 00000000000..685c9892f0a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long_double-fixed_size.cc
new file mode 100644
index 00000000000..ff85cd56ddb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long_double.cc
new file mode 100644
index 00000000000..c2f39fe67ad
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long_long-constexpr.cc
new file mode 100644
index 00000000000..6d6d33e93c2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long_long-fixed_size.cc
new file mode 100644
index 00000000000..040d809d4f8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long_long.cc
new file mode 100644
index 00000000000..6223592a039
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-short-constexpr.cc
new file mode 100644
index 00000000000..2175dfbe172
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-short-fixed_size.cc
new file mode 100644
index 00000000000..aaef87a2a3b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-short.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-short.cc
new file mode 100644
index 00000000000..b9af0f6d195
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char-constexpr.cc
new file mode 100644
index 00000000000..98b6acad3a0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..41961d26b52
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char.cc
new file mode 100644
index 00000000000..4d9414f368c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..9e54f69605d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..53bfac1fa11
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char.cc
new file mode 100644
index 00000000000..bc57dc2b24f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..ff5fce0d845
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..422f8a704e7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int.cc
new file mode 100644
index 00000000000..d8521e699b7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..7d967629035
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..6c08d3087ff
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long.cc
new file mode 100644
index 00000000000..b605891903a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..df70fb5e234
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..0d133b07a02
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long.cc
new file mode 100644
index 00000000000..70fce75c309
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..b33657d8790
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..3e7666d25b5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short.cc
new file mode 100644
index 00000000000..731ee35e9f4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..bba4697efb5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..4dc5a4d9352
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t.cc
new file mode 100644
index 00000000000..d726e391412
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-double-constexpr.cc
new file mode 100644
index 00000000000..9a391af162b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/remqo.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-double-fixed_size.cc
new file mode 100644
index 00000000000..76bf3d0fdb4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/remqo.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-double.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-double.cc
new file mode 100644
index 00000000000..c3d22e3fe51
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/remqo.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-float-constexpr.cc
new file mode 100644
index 00000000000..b8cd91df1a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/remqo.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-float-fixed_size.cc
new file mode 100644
index 00000000000..7ec6c9ea47d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/remqo.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-float.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-float.cc
new file mode 100644
index 00000000000..1780e2dd258
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/remqo.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-long_double-constexpr.cc
new file mode 100644
index 00000000000..34f987165f4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/remqo.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-long_double-fixed_size.cc
new file mode 100644
index 00000000000..61fb5f28794
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/remqo.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-long_double.cc
new file mode 100644
index 00000000000..5d488071143
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/remqo.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char-constexpr.cc
new file mode 100644
index 00000000000..d89006bc16d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char-fixed_size.cc
new file mode 100644
index 00000000000..2885e825ecf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char.cc
new file mode 100644
index 00000000000..418c0afde04
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char16_t-constexpr.cc
new file mode 100644
index 00000000000..8e03105545d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..fad6ab5ac1e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char16_t.cc
new file mode 100644
index 00000000000..8bb564337ef
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char32_t-constexpr.cc
new file mode 100644
index 00000000000..c86bf41329a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..bb86cb1122e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char32_t.cc
new file mode 100644
index 00000000000..3bde89f1ef4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-double-constexpr.cc
new file mode 100644
index 00000000000..56df57d63e6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-double-fixed_size.cc
new file mode 100644
index 00000000000..54569e8f2d7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-double.cc b/libstdc++-v3/testsuite/experimental/simd/simd-double.cc
new file mode 100644
index 00000000000..bd9af0c8901
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-float-constexpr.cc
new file mode 100644
index 00000000000..f513909e8ef
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-float-fixed_size.cc
new file mode 100644
index 00000000000..ecfdb179758
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-float.cc b/libstdc++-v3/testsuite/experimental/simd/simd-float.cc
new file mode 100644
index 00000000000..4b2bd1c6613
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-int-constexpr.cc
new file mode 100644
index 00000000000..2d758d5eb50
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-int-fixed_size.cc
new file mode 100644
index 00000000000..d55e3a13751
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-int.cc b/libstdc++-v3/testsuite/experimental/simd/simd-int.cc
new file mode 100644
index 00000000000..14c02ac49cc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long-constexpr.cc
new file mode 100644
index 00000000000..732890cc136
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long-fixed_size.cc
new file mode 100644
index 00000000000..0898e26fd12
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long.cc
new file mode 100644
index 00000000000..882a2bd5e52
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long_double-constexpr.cc
new file mode 100644
index 00000000000..b607fe81fe0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long_double-fixed_size.cc
new file mode 100644
index 00000000000..05581dc5a0d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long_double.cc
new file mode 100644
index 00000000000..cf741d54b2b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long_long-constexpr.cc
new file mode 100644
index 00000000000..0e24adfe874
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long_long-fixed_size.cc
new file mode 100644
index 00000000000..575575286cd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long_long.cc
new file mode 100644
index 00000000000..49896a5e1c9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-short-constexpr.cc
new file mode 100644
index 00000000000..cdf2bcd0805
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-short-fixed_size.cc
new file mode 100644
index 00000000000..1eacae08dee
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-short.cc b/libstdc++-v3/testsuite/experimental/simd/simd-short.cc
new file mode 100644
index 00000000000..9afec6a6f66
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-signed_char-constexpr.cc
new file mode 100644
index 00000000000..26abe7185f3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..798fe3b90a1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/simd-signed_char.cc
new file mode 100644
index 00000000000..b1ff461462d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..2cb9489ab8e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..1ea3ab4c80e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char.cc
new file mode 100644
index 00000000000..c3d0a898ac0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..5711322b7c8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..c6ab76b7bd8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int.cc
new file mode 100644
index 00000000000..3068d4ca7aa
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..f640f2e6da2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..ce454db5cf9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long.cc
new file mode 100644
index 00000000000..433ae996eb6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..a25540981c8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..e5a2be2a6f0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long.cc
new file mode 100644
index 00000000000..9735360d999
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..8597525567e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..e08dab57a4e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short.cc
new file mode 100644
index 00000000000..c98a565773c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..a5d37a9949c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..ba02727f6a9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t.cc
new file mode 100644
index 00000000000..07c313833f1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-double-constexpr.cc
new file mode 100644
index 00000000000..e142e79c84e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+// { dg-do compile }
+#include "tests/sincos.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-double-fixed_size.cc
new file mode 100644
index 00000000000..834c2c3df68
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+// { dg-do compile }
+#define TESTFIXEDSIZE 1
+#include "tests/sincos.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-double.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-double.cc
new file mode 100644
index 00000000000..fefc41e3822
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+// { dg-do compile }
+#include "tests/sincos.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-float-constexpr.cc
new file mode 100644
index 00000000000..88376e09ee9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+// { dg-do compile }
+#include "tests/sincos.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-float-fixed_size.cc
new file mode 100644
index 00000000000..565e225997d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+// { dg-do compile }
+#define TESTFIXEDSIZE 1
+#include "tests/sincos.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-float.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-float.cc
new file mode 100644
index 00000000000..25a71653eb6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+// { dg-do compile }
+#include "tests/sincos.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-long_double-constexpr.cc
new file mode 100644
index 00000000000..096c122d151
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+// { dg-do compile }
+#include "tests/sincos.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-long_double-fixed_size.cc
new file mode 100644
index 00000000000..3dbb43ce5c7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+// { dg-do compile }
+#define TESTFIXEDSIZE 1
+#include "tests/sincos.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-long_double.cc
new file mode 100644
index 00000000000..e70d0624fe8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+// { dg-do compile }
+#include "tests/sincos.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char-constexpr.cc
new file mode 100644
index 00000000000..a4c9a607644
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char-fixed_size.cc
new file mode 100644
index 00000000000..ab6fe04efad
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char.cc
new file mode 100644
index 00000000000..3bcb2b145fc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t-constexpr.cc
new file mode 100644
index 00000000000..a3f4ec408d2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..e0136767bb6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t.cc
new file mode 100644
index 00000000000..68c00bd6483
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t-constexpr.cc
new file mode 100644
index 00000000000..5ea4ee445f6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..9e682c5249d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t.cc
new file mode 100644
index 00000000000..c90f1bb9d54
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-double-constexpr.cc
new file mode 100644
index 00000000000..c7dbd5156ea
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-double-fixed_size.cc
new file mode 100644
index 00000000000..2e4eea69f11
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-double.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-double.cc
new file mode 100644
index 00000000000..49feda56304
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-float-constexpr.cc
new file mode 100644
index 00000000000..8631b832bd9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-float-fixed_size.cc
new file mode 100644
index 00000000000..ff9d8573ac2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-float.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-float.cc
new file mode 100644
index 00000000000..54b889cf83b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-int-constexpr.cc
new file mode 100644
index 00000000000..08915acc8b7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-int-fixed_size.cc
new file mode 100644
index 00000000000..4d3cdefd393
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-int.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-int.cc
new file mode 100644
index 00000000000..e600bcbab3b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long-constexpr.cc
new file mode 100644
index 00000000000..a6c4208cb99
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long-fixed_size.cc
new file mode 100644
index 00000000000..e2ee03eb9bf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long.cc
new file mode 100644
index 00000000000..962dfacf1c3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double-constexpr.cc
new file mode 100644
index 00000000000..86acb97528f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double-fixed_size.cc
new file mode 100644
index 00000000000..13b6178425e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double.cc
new file mode 100644
index 00000000000..0ca90a052d1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long-constexpr.cc
new file mode 100644
index 00000000000..0e91b631975
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long-fixed_size.cc
new file mode 100644
index 00000000000..004d3e3f5f0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long.cc
new file mode 100644
index 00000000000..2fbfcd80747
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-short-constexpr.cc
new file mode 100644
index 00000000000..01f0e4c18b2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-short-fixed_size.cc
new file mode 100644
index 00000000000..34d90771dc2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-short.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-short.cc
new file mode 100644
index 00000000000..19375502263
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char-constexpr.cc
new file mode 100644
index 00000000000..34654adc06a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..3b9e2f8d286
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char.cc
new file mode 100644
index 00000000000..167f9cc4f4f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..4cd4990fd6a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..7db0ab8e1ea
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char.cc
new file mode 100644
index 00000000000..726683dbb7c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..3195c2f4cf9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..7418dfc973c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int.cc
new file mode 100644
index 00000000000..6527c4dd66d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..da8dbb92f28
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..d9473d3731c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long.cc
new file mode 100644
index 00000000000..aabffd9b45e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..f5d933f600e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..da250a2ae75
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long.cc
new file mode 100644
index 00000000000..fb110076926
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..4377025dea5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..19549a6c7c3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short.cc
new file mode 100644
index 00000000000..c8d79627abc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..e02e923bc50
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..cc4f3b6df19
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t.cc
new file mode 100644
index 00000000000..e8cdb8a87a0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char-constexpr.cc
new file mode 100644
index 00000000000..b34e52e7ce9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char-fixed_size.cc
new file mode 100644
index 00000000000..5027f4731fc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char.cc
new file mode 100644
index 00000000000..45963f1c3bd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char16_t-constexpr.cc
new file mode 100644
index 00000000000..7e8d9bc485b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..f8a0d0f423a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char16_t.cc
new file mode 100644
index 00000000000..4a312aafc1c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char32_t-constexpr.cc
new file mode 100644
index 00000000000..0c4dc08ede6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..f4f72e0079a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char32_t.cc
new file mode 100644
index 00000000000..771f162def2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-double-constexpr.cc
new file mode 100644
index 00000000000..2ce1fba820b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-double-fixed_size.cc
new file mode 100644
index 00000000000..4e37962fa98
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-double.cc b/libstdc++-v3/testsuite/experimental/simd/splits-double.cc
new file mode 100644
index 00000000000..a46b26ad82c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-float-constexpr.cc
new file mode 100644
index 00000000000..38bd2b87841
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-float-fixed_size.cc
new file mode 100644
index 00000000000..9bd0353a0b3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-float.cc b/libstdc++-v3/testsuite/experimental/simd/splits-float.cc
new file mode 100644
index 00000000000..9f69bb92434
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-int-constexpr.cc
new file mode 100644
index 00000000000..cf34de20fd3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-int-fixed_size.cc
new file mode 100644
index 00000000000..723ffa29b7d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-int.cc b/libstdc++-v3/testsuite/experimental/simd/splits-int.cc
new file mode 100644
index 00000000000..3c2f4599d1f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long-constexpr.cc
new file mode 100644
index 00000000000..ab892ec4882
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long-fixed_size.cc
new file mode 100644
index 00000000000..2034a6dead9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long.cc
new file mode 100644
index 00000000000..c0ea634cb78
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long_double-constexpr.cc
new file mode 100644
index 00000000000..cbcd8981926
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long_double-fixed_size.cc
new file mode 100644
index 00000000000..dcc65a34cd8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long_double.cc
new file mode 100644
index 00000000000..f07bf55b066
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long_long-constexpr.cc
new file mode 100644
index 00000000000..19ce4612cb5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long_long-fixed_size.cc
new file mode 100644
index 00000000000..d7ca3bebbe9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long_long.cc
new file mode 100644
index 00000000000..a1d1a91fa19
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-short-constexpr.cc
new file mode 100644
index 00000000000..a5e0352d9ae
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-short-fixed_size.cc
new file mode 100644
index 00000000000..de7b69b8d36
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-short.cc b/libstdc++-v3/testsuite/experimental/simd/splits-short.cc
new file mode 100644
index 00000000000..d5c3ed05d43
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-signed_char-constexpr.cc
new file mode 100644
index 00000000000..b9242b69551
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..f69adef42c2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/splits-signed_char.cc
new file mode 100644
index 00000000000..3d44ee57712
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..72d15dabf41
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..52011535c7b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char.cc
new file mode 100644
index 00000000000..49167f61bdf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..bd955b7e72e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..4840ffd74fd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int.cc
new file mode 100644
index 00000000000..a725ce1b2a0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..ac94a36f0cc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..a27a5a7cb72
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long.cc
new file mode 100644
index 00000000000..4b9d6d9d0a4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..b51f48ef4a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..c3e4386453f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long.cc
new file mode 100644
index 00000000000..c42d302d75f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..29b38f9d276
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..96b6d9403ee
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short.cc
new file mode 100644
index 00000000000..53b0ce0d9e3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..d38710ef118
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..a90234fb0aa
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t.cc
new file mode 100644
index 00000000000..0429dd527d8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/abs.h b/libstdc++-v3/testsuite/experimental/simd/tests/abs.h
new file mode 100644
index 00000000000..8769aa0ac20
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/abs.h
@@ -0,0 +1,22 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include <cmath>    // abs & sqrt
+#include <cstdlib>  // integer abs
+#include "bits/test_values.h"
+
+template <typename V> void test()
+{
+  if constexpr (std::is_signed_v<typename V::value_type>)
+    {
+      using std::abs;
+      using T = typename V::value_type;
+      using L = std::numeric_limits<T>;
+      test_values<V>({L::max(), L::lowest(), L::min(), -L::max() / 2, T(), -T(),
+		      T(-1), T(-2)},
+		     {1000}, [](V input) {
+		       const V expected(
+			 [&](auto i) { return T(std::abs(T(input[i]))); });
+		       COMPARE(abs(input), expected) << "input: " << input;
+		     });
+    }
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/algorithms.h b/libstdc++-v3/testsuite/experimental/simd/tests/algorithms.h
new file mode 100644
index 00000000000..088646838ef
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/algorithms.h
@@ -0,0 +1,13 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+template <typename V> void test()
+{
+  using T = typename V::value_type;
+  V a{[](auto i) -> T { return i & 1u; }};
+  V b{[](auto i) -> T { return (i + 1u) & 1u; }};
+  COMPARE(min(a, b), V{0});
+  COMPARE(max(a, b), V{1});
+}
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/conversions.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/conversions.h
new file mode 100644
index 00000000000..f4e7b3b6f13
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/conversions.h
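@@ -0,0 +1,150 @@
+#include <array>
+#include <cmath>
+#include <cstddef>
+#include <limits>
+#include <type_traits>
+#include <experimental/simd>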
+
+// is_conversion_undefined {{{1
+/* implementation-defined
+ * ======================
+ * [conv.integral]/3 (integral conversions)
+ *  If the destination type is signed, the value is unchanged if it can be represented in the
+ *  destination type (and bit-field width); otherwise, the value is implementation-defined.
+ *
+ * undefined
+ * =========
+ * [conv.double]/1 (floating-point conversions)
+ *  If the source value is neither exactly representable in the destination type nor between
+ *  two adjacent destination values, the behavior is undefined.
+ *
+ * [conv.fpint]/1 (floating-integral conversions)
+ *  A floating-point type can be converted to an integer type.
+ *  The behavior is undefined if the truncated value cannot be
+ *  represented in the destination type.
+ *
+ * [conv.fpint]/2
+ *  An integer type can be converted to a floating-point type.
+ *  If the value being converted is outside the range of values that can be represented, the
+ *  behavior is undefined.
+ */
+template <typename To, typename From>
+constexpr bool is_conversion_undefined_impl(From x, std::true_type)
+{
+    return x > static_cast<long double>(std::numeric_limits<To>::max()) ||
+           x < static_cast<long double>(std::numeric_limits<To>::lowest());
+}
+
+template <typename To, typename From>
+constexpr bool is_conversion_undefined_impl(From, std::false_type)
+{
+    return false;
+}
+
+template <typename To, typename From> constexpr bool is_conversion_undefined(From x)
+{
+    static_assert(std::is_arithmetic<From>::value,
+                  "this overload is only meant for builtin arithmetic types");
+    return is_conversion_undefined_impl<To, From>(
+        x, std::integral_constant<bool, (std::is_floating_point<From>::value &&
+                                         (std::is_integral<To>::value ||
+                                          (std::is_floating_point<To>::value &&
+                                           sizeof(From) > sizeof(To))))>());
+}
+
+static_assert(is_conversion_undefined<unsigned int>(float(0x100000000LL)),
+              "testing my expectations of is_conversion_undefined");
+static_assert(!is_conversion_undefined<float>(0x100000000LL),
+              "testing my expectations of is_conversion_undefined");
+
+template <typename To, typename T, typename A>
+inline std::experimental::simd_mask<T, A> is_conversion_undefined(const std::experimental::simd<T, A> &x)
+{
+    std::experimental::simd_mask<T, A> k = false;
+    for (std::size_t i = 0; i < x.size(); ++i) {
+        k[i] = is_conversion_undefined<To>(x[i]);
+    }
+    return k;
+}
+
+// operator helpers {{{1
+template <class T> constexpr T genHalfBits()
+{
+    return std::numeric_limits<T>::max() >> (std::numeric_limits<T>::digits / 2);
+}
+template <> constexpr long double genHalfBits<long double>() { return 0; }
+template <> constexpr double genHalfBits<double>() { return 0; }
+template <> constexpr float genHalfBits<float>() { return 0; }
+
+template <class U, class T, class UU> constexpr U avoid_ub(UU x)
+{
+    return is_conversion_undefined<T>(U(x)) ? U(0) : U(x);
+}
+
+template <class U, class T, class UU> constexpr U avoid_ub2(UU x)
+{
+    return is_conversion_undefined<U>(x) ? U(0) : avoid_ub<U, T>(x);
+}
+
+// conversion test input data {{{1
+template <class U, class T>
+static const std::array<U, 53> cvt_input_data = {{
+    avoid_ub<U, T>(0xc0000080U),
+    avoid_ub<U, T>(0xc0000081U),
+    avoid_ub<U, T>(0xc0000082U),
+    avoid_ub<U, T>(0xc0000084U),
+    avoid_ub<U, T>(0xc0000088U),
+    avoid_ub<U, T>(0xc0000090U),
+    avoid_ub<U, T>(0xc00000A0U),
+    avoid_ub<U, T>(0xc00000C0U),
+    avoid_ub<U, T>(0xc000017fU),
+    avoid_ub<U, T>(0xc0000180U),
+    avoid_ub<U, T>(0x100000001LL),
+    avoid_ub<U, T>(0x100000011LL),
+    avoid_ub<U, T>(0x100000111LL),
+    avoid_ub<U, T>(0x100001111LL),
+    avoid_ub<U, T>(0x100011111LL),
+    avoid_ub<U, T>(0x100111111LL),
+    avoid_ub<U, T>(0x101111111LL),
+    avoid_ub<U, T>(-0x100000001LL),
+    avoid_ub<U, T>(-0x100000011LL),
+    avoid_ub<U, T>(-0x100000111LL),
+    avoid_ub<U, T>(-0x100001111LL),
+    avoid_ub<U, T>(-0x100011111LL),
+    avoid_ub<U, T>(-0x100111111LL),
+    avoid_ub<U, T>(-0x101111111LL),
+    avoid_ub<U, T>(std::numeric_limits<U>::min()),
+    avoid_ub<U, T>(std::numeric_limits<U>::min() + 1),
+    avoid_ub<U, T>(std::numeric_limits<U>::lowest()),
+    avoid_ub<U, T>(std::numeric_limits<U>::lowest() + 1),
+    avoid_ub<U, T>(-1),
+    avoid_ub<U, T>(-10),
+    avoid_ub<U, T>(-100),
+    avoid_ub<U, T>(-1000),
+    avoid_ub<U, T>(-10000),
+    avoid_ub<U, T>(0),
+    avoid_ub<U, T>(1),
+    avoid_ub<U, T>(genHalfBits<U>() - 1),
+    avoid_ub<U, T>(genHalfBits<U>()),
+    avoid_ub<U, T>(genHalfBits<U>() + 1),
+    avoid_ub<U, T>(std::numeric_limits<U>::max() - 1),
+    avoid_ub<U, T>(std::numeric_limits<U>::max()),
+    avoid_ub<U, T>(std::numeric_limits<U>::max() - 0xff),
+    avoid_ub<U, T>(std::numeric_limits<U>::max() - 0xff),
+    avoid_ub<U, T>(std::numeric_limits<U>::max() - 0x55),
+    avoid_ub<U, T>(-(std::numeric_limits<U>::min() + 1)),
+    avoid_ub<U, T>(-std::numeric_limits<U>::max()),
+    avoid_ub<U, T>(std::numeric_limits<U>::max() / std::pow(2., sizeof(T) * 6 - 1)),
+    avoid_ub2<U, T>(-std::numeric_limits<U>::max() / std::pow(2., sizeof(T) * 6 - 1)),
+    avoid_ub<U, T>(std::numeric_limits<U>::max() / std::pow(2., sizeof(T) * 4 - 1)),
+    avoid_ub2<U, T>(-std::numeric_limits<U>::max() / std::pow(2., sizeof(T) * 4 - 1)),
+    avoid_ub<U, T>(std::numeric_limits<U>::max() / std::pow(2., sizeof(T) * 2 - 1)),
+    avoid_ub2<U, T>(-std::numeric_limits<U>::max() / std::pow(2., sizeof(T) * 2 - 1)),
+    avoid_ub<U, T>(std::numeric_limits<T>::max() - 1),
+    avoid_ub<U, T>(std::numeric_limits<T>::max() * 0.75),
+}};
+
+template <class T, class U> struct cvt_inputs {
+    static constexpr size_t size() { return cvt_input_data<U, T>.size(); }
+    U operator[](size_t i) const { return cvt_input_data<U, T>[i]; }
+};
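Review note on tests/bits/conversions.h: the floating-integral rule quoted in the header comment can also be checked exactly. The helper in the patch compares against numeric_limits<To>::max(), so it conservatively flags values such as 2147483647.5 for a 32-bit int even though the truncated value is representable. That is harmless for generating test inputs. The following is only a sketch and not part of the patch: `fp_to_int_is_undefined` is a hypothetical name, and the code assumes a binary floating-point type whose exponent range covers 2^digits of the destination type.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper (not in the patch): true if converting the
// floating-point value x to the integral type To is undefined, i.e. if the
// truncated value of x cannot be represented in To.
template <typename To, typename From>
constexpr bool fp_to_int_is_undefined(From x)
{
  static_assert(std::is_floating_point<From>::value, "From must be floating-point");
  static_assert(std::is_integral<To>::value, "To must be integral");

  // upper == 2^digits == max() + 1; a power of two, hence exact in From.
  From upper = 1;
  for (int i = 0; i < std::numeric_limits<To>::digits; ++i)
    upper *= 2;

  // Smallest representable value of To, again exact in From.
  const From lower = std::is_signed<To>::value ? -upper : From(0);

  // Defined iff lower - 1 < x < upper. The lower bound is tested as
  // x - lower > -1 because lower - 1 may round in From, whereas x - lower is
  // exact when x is close to lower. NaN fails both comparisons and is
  // therefore reported as undefined.
  return !(x < upper && x - lower > From(-1));
}

static_assert(fp_to_int_is_undefined<std::uint32_t>(4294967296.0f),
              "2^32 is not representable in 32 bits");
static_assert(!fp_to_int_is_undefined<std::int32_t>(2147483647.5),
              "truncates to INT32_MAX");
static_assert(!fp_to_int_is_undefined<std::int32_t>(-2147483648.0),
              "INT32_MIN itself is fine");
static_assert(fp_to_int_is_undefined<std::int32_t>(-2147483649.0),
              "one below INT32_MIN");
```

Whether the extra precision is worth it for the test data is a judgment call; the conservative check only drops a few boundary inputs.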
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/make_vec.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/make_vec.h
new file mode 100644
index 00000000000..931b36edb61
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/make_vec.h
@@ -0,0 +1,62 @@
+/*  This file is part of the Vc library. {{{
+Copyright © 2017 Matthias Kretz <kretz@kde.org>
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+    * Redistributions of source code must retain the above copyright
+      notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+    * Neither the names of contributing organizations nor the
+      names of its contributors may be used to endorse or promote products
+      derived from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR ANY
+DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+}}}*/
+
+#include <experimental/simd>
+
+template <class M> inline M make_mask(const std::initializer_list<bool> &init)
+{
+    std::size_t i = 0;
+    M r = {};
+    for (;;) {
+        for (bool x : init) {
+            r[i] = x;
+            if (++i == M::size()) {
+                return r;
+            }
+        }
+    }
+}
+
+template <class V>
+inline V make_vec(const std::initializer_list<typename V::value_type> &init,
+                  typename V::value_type inc = 0)
+{
+    std::size_t i = 0;
+    V r = {};
+    typename V::value_type base = 0;
+    for (;;) {
+        for (auto x : init) {
+            r[i] = base + x;
+            if (++i == V::size()) {
+                return r;
+            }
+        }
+        base += inc;
+    }
+}
+
+// vim: foldmethod=marker
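
Review note on make_vec.h: make_vec cycles through the initializer list until every lane is filled and adds inc to a running base after each full pass. The sketch below restates that logic under stated assumptions. make_array_cyclic is a hypothetical name, std::array stands in for the simd type so it builds without <experimental/simd>, and the empty-list guard is an addition that the header does not have (an empty list would loop forever there).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical stand-in for make_vec<V>: std::array<T, N> replaces the simd
// type V so the sketch builds without <experimental/simd>.
template <class T, std::size_t N>
std::array<T, N> make_array_cyclic(std::initializer_list<T> init, T inc = 0)
{
    std::array<T, N> r = {};
    // Guard added in this sketch only: an empty list would spin forever.
    if (init.size() == 0 || N == 0)
        return r;
    std::size_t i = 0;
    T base = 0;
    for (;;) {
        for (T x : init) {
            r[i] = base + x;  // element = current offset + list entry
            if (++i == N)
                return r;     // stop as soon as every lane is filled
        }
        base += inc;          // the next pass over the list is shifted by inc
    }
}

// Usage: 8 lanes from a 3-entry list with inc = 3 yield 0, 1, ..., 7.
inline void make_array_cyclic_example()
{
    const auto a = make_array_cyclic<int, 8>({0, 1, 2}, 3);
    for (int k = 0; k < 8; ++k)
        assert(a[k] == k);
}
```

With inc equal to the list length, as in the usage function, the result is an iota sequence, which is presumably how the tests build index vectors of arbitrary width.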
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/mathreference.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/mathreference.h
new file mode 100644
index 00000000000..ebf58cd0d32
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/mathreference.h
@@ -0,0 +1,115 @@
+#include <tuple>
+#include <utility>
+#include <cstdio>
+#include <limits>
+#include <string>
+#include <type_traits>
+
+template <typename T> struct SincosReference //{{{1
+{
+    T x, s, c;
+
+    std::tuple<const T &, const T &, const T &> as_tuple() const
+    {
+        return std::tie(x, s, c);
+    }
+};
+
+template <typename T> struct Reference {
+    T x, ref;
+
+    std::tuple<const T &, const T &> as_tuple() const { return std::tie(x, ref); }
+};
+
+template <typename T> struct Array
+{
+    std::size_t size_;
+    const T *data_;
+    Array() : size_(0), data_(nullptr) {}
+    Array(size_t s, const T *p) : size_(s), data_(p) {}
+    const T *begin() const { return data_; }
+    const T *end() const { return data_ + size_; }
+    std::size_t size() const { return size_; }
+};
+
+namespace function {
+struct sincos{ static constexpr const char *const str = "sincos"; };
+struct atan  { static constexpr const char *const str = "atan"; };
+struct asin  { static constexpr const char *const str = "asin"; };
+struct acos  { static constexpr const char *const str = "acos"; };
+struct log   { static constexpr const char *const str = "ln"; };
+struct log2  { static constexpr const char *const str = "log2"; };
+struct log10 { static constexpr const char *const str = "log10"; };
+}
+
+template <class F> struct testdatatype_for_function {
+    template <class T> using type = Reference<T>;
+};
+template <> struct testdatatype_for_function<function::sincos> {
+    template <class T> using type = SincosReference<T>;
+};
+template <class F, class T>
+using testdatatype_for_function_t =
+    typename testdatatype_for_function<F>::template type<T>;
+
+template<typename T> struct StaticDeleter
+{
+    const T *ptr;
+    StaticDeleter(const T *p) : ptr(p) {}
+    ~StaticDeleter() { delete[] ptr; }
+};
+
+template <class F, class T> inline std::string filename()
+{
+    static_assert(std::is_floating_point<T>::value, "");
+    using Lim = std::numeric_limits<T>;
+    static const auto cache =
+      std::string("reference-") + F::str +
+      (sizeof(T) == 4 && Lim::digits == 24 && Lim::max_exponent == 128
+	 ? "-sp"
+	 : (sizeof(T) == 8 && Lim::digits == 53 && Lim::max_exponent == 1024
+	      ? "-dp"
+	      : (sizeof(T) == 16 && Lim::digits == 64 &&
+		     Lim::max_exponent == 16384
+		   ? "-dep"
+		   : (sizeof(T) == 16 && Lim::digits == 113 &&
+			  Lim::max_exponent == 16384
+			? "-qp"
+			: "-unknown")))) +
+      ".dat";
+    return cache;
+}
+
+template <class Fun, class T, class Ref = testdatatype_for_function_t<Fun, T>>
+Array<Ref> referenceData()
+{
+  static Array<Ref> data;
+  if (data.data_ == nullptr)
+    {
+      FILE* file = std::fopen(filename<Fun, T>().c_str(), "rb");
+      if (file)
+	{
+	  std::fseek(file, 0, SEEK_END);
+	  const size_t size = std::ftell(file) / sizeof(Ref);
+	  std::rewind(file);
+	  auto                      mem = new Ref[size];
+	  static StaticDeleter<Ref> _cleanup(mem); // owns the new[] buffer
+	  data.size_ = std::fread(mem, sizeof(Ref), size, file);
+	  data.data_ = mem;
+	  std::fclose(file);
+	}
+      else
+	{
+	  __builtin_fprintf(
+	    stderr,
+	    "%s:%d: the reference data %s does not exist in the current "
+	    "working directory.\n",
+	    __FILE__, __LINE__, filename<Fun, T>().c_str());
+	  __builtin_abort();
+	}
+    }
+  return data;
+}
+
+//}}}1
+// vim: foldmethod=marker
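
Review note on mathreference.h: referenceData reads a binary file of fixed-size records by seeking to the end, dividing the byte count by sizeof(Ref), and reading everything in one fread. A minimal sketch of the same idea follows. load_records is a hypothetical helper, not part of the patch, and it stores the records in std::vector so that no separate deleter object is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: load a file of raw Ref records into a vector.
// Returns an empty vector if the file cannot be opened.
template <class Ref>
std::vector<Ref> load_records(const std::string& path)
{
    static_assert(std::is_trivially_copyable<Ref>::value,
                  "records are read byte-wise, so Ref must be trivially copyable");
    std::vector<Ref> out;
    std::FILE* file = std::fopen(path.c_str(), "rb");
    if (!file)
        return out;
    std::fseek(file, 0, SEEK_END);
    const long bytes = std::ftell(file);  // file size in bytes, -1 on error
    std::rewind(file);
    if (bytes > 0) {
        out.resize(static_cast<std::size_t>(bytes) / sizeof(Ref));
        const std::size_t n = std::fread(out.data(), sizeof(Ref), out.size(), file);
        assert(n <= out.size());
        out.resize(n);                    // keep only the records actually read
    }
    std::fclose(file);
    return out;
}
```

The vector owns the storage, so the buffer is released when the vector goes out of scope. The header instead keeps a raw pointer in a function-local static, which needs explicit cleanup at exit.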
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/metahelpers.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/metahelpers.h
new file mode 100644
index 00000000000..1eb1b0d1681
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/metahelpers.h
@@ -0,0 +1,170 @@
+#ifndef VC_TESTS_METAHELPERS_H_
+#define VC_TESTS_METAHELPERS_H_
+
+#include <functional>
+#include <limits>
+#include <type_traits>
+#include <utility>
+
+namespace vir
+{
+namespace test
+{
+// operator_is_substitution_failure {{{1
+template <class A, class B, class Op>
+constexpr bool operator_is_substitution_failure_impl(float)
+{
+  return true;
+}
+
+template <class A, class B, class Op>
+constexpr
+    typename std::conditional<true, bool,
+                              decltype(Op()(std::declval<A>(), std::declval<B>()))>::type
+    operator_is_substitution_failure_impl(int)
+{
+  return false;
+}
+
+template <class... Ts> constexpr bool operator_is_substitution_failure()
+{
+  return operator_is_substitution_failure_impl<Ts...>(int());
+}
+
+// sfinae_is_callable{{{1
+#ifdef Vc_CLANG
+#pragma clang diagnostic push
+#pragma clang diagnostic ignored "-Wundefined-inline"
+#endif
+template <class... Args, class F>
+constexpr auto sfinae_is_callable_impl(int, F &&f) -> typename std::conditional<
+    true, std::true_type, decltype(std::forward<F>(f)(std::declval<Args>()...))>::type;
+template <class... Args, class F> constexpr std::false_type sfinae_is_callable_impl(float, const F &);
+template <class... Args, class F> constexpr bool sfinae_is_callable(F &&)
+{
+  return decltype(sfinae_is_callable_impl<Args...>(int(), std::declval<F>()))::value;
+}
+template <class... Args, class F>
+constexpr auto sfinae_is_callable_t(F &&f)
+    -> decltype(sfinae_is_callable_impl<Args...>(int(), std::declval<F>()));
+
+#ifdef Vc_CLANG
+#pragma clang diagnostic pop
+#endif
+
+// traits {{{1
+template <class A, class B> constexpr bool has_less_bits()
+{
+  return std::numeric_limits<A>::digits < std::numeric_limits<B>::digits;
+}
+
+//}}}1
+}  // namespace test
+}  // namespace vir
+
+// more operator objects {{{1
+struct assignment {
+    template <class A, class B>
+    constexpr decltype(std::declval<A>() = std::declval<B>()) operator()(A &&a,
+                                                                         B &&b) const
+        noexcept(noexcept(std::forward<A>(a) = std::forward<B>(b)))
+    {
+        return std::forward<A>(a) = std::forward<B>(b);
+    }
+};
+
+struct bit_shift_left {
+    template <class A, class B>
+    constexpr decltype(std::declval<A>() << std::declval<B>()) operator()(A &&a,
+                                                                          B &&b) const
+        noexcept(noexcept(std::forward<A>(a) << std::forward<B>(b)))
+    {
+        return std::forward<A>(a) << std::forward<B>(b);
+    }
+};
+
+struct bit_shift_right {
+    template <class A, class B>
+    constexpr decltype(std::declval<A>() >> std::declval<B>()) operator()(A &&a,
+                                                                          B &&b) const
+        noexcept(noexcept(std::forward<A>(a) >> std::forward<B>(b)))
+    {
+        return std::forward<A>(a) >> std::forward<B>(b);
+    }
+};
+
+struct assign_modulus {
+    template <class A, class B>
+    constexpr decltype(std::declval<A>() %= std::declval<B>()) operator()(A &&a,
+                                                                          B &&b) const
+        noexcept(noexcept(std::forward<A>(a) %= std::forward<B>(b)))
+    {
+        return std::forward<A>(a) %= std::forward<B>(b);
+    }
+};
+
+struct assign_bit_and {
+    template <class A, class B>
+    constexpr decltype(std::declval<A>() &= std::declval<B>()) operator()(A &&a,
+                                                                          B &&b) const
+        noexcept(noexcept(std::forward<A>(a) &= std::forward<B>(b)))
+    {
+        return std::forward<A>(a) &= std::forward<B>(b);
+    }
+};
+
+struct assign_bit_or {
+    template <class A, class B>
+    constexpr decltype(std::declval<A>() |= std::declval<B>()) operator()(A &&a,
+                                                                          B &&b) const
+        noexcept(noexcept(std::forward<A>(a) |= std::forward<B>(b)))
+    {
+        return std::forward<A>(a) |= std::forward<B>(b);
+    }
+};
+
+struct assign_bit_xor {
+    template <class A, class B>
+    constexpr decltype(std::declval<A>() ^= std::declval<B>()) operator()(A &&a,
+                                                                          B &&b) const
+        noexcept(noexcept(std::forward<A>(a) ^= std::forward<B>(b)))
+    {
+        return std::forward<A>(a) ^= std::forward<B>(b);
+    }
+};
+
+struct assign_bit_shift_left {
+    template <class A, class B>
+    constexpr decltype(std::declval<A>() <<= std::declval<B>()) operator()(A &&a,
+                                                                          B &&b) const
+        noexcept(noexcept(std::forward<A>(a) <<= std::forward<B>(b)))
+    {
+        return std::forward<A>(a) <<= std::forward<B>(b);
+    }
+};
+
+struct assign_bit_shift_right {
+    template <class A, class B>
+    constexpr decltype(std::declval<A>() >>= std::declval<B>()) operator()(A &&a,
+                                                                          B &&b) const
+        noexcept(noexcept(std::forward<A>(a) >>= std::forward<B>(b)))
+    {
+        return std::forward<A>(a) >>= std::forward<B>(b);
+    }
+};
+
+// operator_is_substitution_failure {{{1
+template <class A, class B, class Op = std::plus<>>
+constexpr bool is_substitution_failure =
+    vir::test::operator_is_substitution_failure<A, B, Op>();
+
+// sfinae_is_callable{{{1
+using vir::test::sfinae_is_callable;
+
+// traits {{{1
+using vir::test::has_less_bits;
+
+//}}}1
+
+#endif  // VC_TESTS_METAHELPERS_H_
+// vim: foldmethod=marker
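
Review note on metahelpers.h: operator_is_substitution_failure uses a pair of overloads ranked by the int/float argument. The int overload is only viable when Op()(A, B) is well-formed, and the float overload is the fallback. The sketch below reproduces the idiom in isolation. The names op_fails_impl, op_fails and NoOps are hypothetical and exist only for this illustration.

```cpp
#include <functional>
#include <type_traits>
#include <utility>

// Fallback: selected when the overload below is removed by SFINAE.
template <class A, class B, class Op>
constexpr bool op_fails_impl(float)
{
    return true;
}

// Preferred (int is an exact match for the argument 0), but only viable if
// Op()(A, B) is a valid expression. std::conditional<true, bool, X> is always
// bool; X is mentioned only so that an ill-formed X causes substitution failure.
template <class A, class B, class Op>
constexpr typename std::conditional<
    true, bool, decltype(Op()(std::declval<A>(), std::declval<B>()))>::type
op_fails_impl(int)
{
    return false;
}

template <class A, class B, class Op = std::plus<>>
constexpr bool op_fails = op_fails_impl<A, B, Op>(0);

struct NoOps {};  // a type without any arithmetic operators

static_assert(!op_fails<int, double>, "int + double is well-formed");
static_assert(op_fails<NoOps, int>, "NoOps + int must be a substitution failure");
static_assert(!op_fails<int, int, std::multiplies<>>, "int * int is well-formed");
```

The idiom relies on Op being SFINAE-friendly. The transparent functors such as std::plus<> declare their return type with decltype, and the operator structs in this header follow the same pattern.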
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/simd_view.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/simd_view.h
new file mode 100644
index 00000000000..1b611c56b1d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/simd_view.h
@@ -0,0 +1,112 @@
+/*  This file is part of the Vc library. {{{
+Copyright © 2018 Matthias Kretz <kretz@kde.org>
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+    * Redistributions of source code must retain the above copyright
+      notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+    * Neither the names of contributing organizations nor the
+      names of its contributors may be used to endorse or promote products
+      derived from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR ANY
+DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+}}}*/
+
+#ifndef VC_TESTS_SIMD_VIEW_H_
+#define VC_TESTS_SIMD_VIEW_H_
+
+#include <experimental/simd>
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+namespace experimental
+{
+namespace imported_begin_end
+{
+    using std::begin;
+    using std::end;
+    template <class T> using begin_type = decltype(begin(std::declval<T>()));
+    template <class T> using end_type = decltype(end(std::declval<T>()));
+}  // namespace imported_begin_end
+
+template <class V, class It, class End> class viewer
+{
+    It it;
+    const End end;
+
+    template <class F> void for_each_impl(F &&fun, std::index_sequence<0, 1, 2>)
+    {
+        for (; it + V::size() <= end; it += V::size()) {
+            fun(V([&](auto i) { return std::get<0>(it[i].as_tuple()); }),
+                V([&](auto i) { return std::get<1>(it[i].as_tuple()); }),
+                V([&](auto i) { return std::get<2>(it[i].as_tuple()); }));
+        }
+        if (it != end) {
+            fun(V([&](auto i) {
+                    auto ii = it + i < end ? i + 0 : 0;
+                    return std::get<0>(it[ii].as_tuple());
+                }),
+                V([&](auto i) {
+                    auto ii = it + i < end ? i + 0 : 0;
+                    return std::get<1>(it[ii].as_tuple());
+                }),
+                V([&](auto i) {
+                    auto ii = it + i < end ? i + 0 : 0;
+                    return std::get<2>(it[ii].as_tuple());
+                }));
+        }
+    }
+
+    template <class F> void for_each_impl(F &&fun, std::index_sequence<0, 1>)
+    {
+        for (; it + V::size() <= end; it += V::size()) {
+            fun(V([&](auto i) { return std::get<0>(it[i].as_tuple()); }),
+                V([&](auto i) { return std::get<1>(it[i].as_tuple()); }));
+        }
+        if (it != end) {
+            fun(V([&](auto i) {
+                    auto ii = it + i < end ? i + 0 : 0;
+                    return std::get<0>(it[ii].as_tuple());
+                }),
+                V([&](auto i) {
+                    auto ii = it + i < end ? i + 0 : 0;
+                    return std::get<1>(it[ii].as_tuple());
+                }));
+        }
+    }
+
+public:
+    viewer(It _it, End _end) : it(_it), end(_end) {}
+
+    template <class F> void for_each(F &&fun) {
+        constexpr size_t N =
+            std::tuple_size<std::decay_t<decltype(it->as_tuple())>>::value;
+        for_each_impl(std::forward<F>(fun), std::make_index_sequence<N>());
+    }
+};
+
+template <class V, class Cont>
+viewer<V, imported_begin_end::begin_type<const Cont &>,
+       imported_begin_end::end_type<const Cont &>>
+simd_view(const Cont &data)
+{
+    using std::begin;
+    using std::end;
+    return {begin(data), end(data)};
+}
+}  // namespace experimental
+_GLIBCXX_SIMD_END_NAMESPACE
+
+#endif  // VC_TESTS_SIMD_VIEW_H_
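
Review note on simd_view.h: viewer::for_each walks the range in steps of V::size() and, for a partial final chunk, redirects the out-of-range lanes to index 0 of that chunk, so the callback always receives a full vector of valid data. A rough scalar analogue is sketched below. for_each_chunk is a hypothetical name and std::array replaces the simd type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical analogue of viewer::for_each: hand fixed-size chunks to fun.
template <std::size_t N, class T, class F>
void for_each_chunk(const std::vector<T>& data, F&& fun)
{
    std::size_t pos = 0;
    // Full chunks.
    for (; pos + N <= data.size(); pos += N) {
        std::array<T, N> chunk;
        for (std::size_t i = 0; i < N; ++i)
            chunk[i] = data[pos + i];
        fun(chunk);
    }
    // Partial tail: lanes past the end reuse the first element of the tail.
    if (pos != data.size()) {
        std::array<T, N> chunk;
        for (std::size_t i = 0; i < N; ++i) {
            const std::size_t ii = pos + i < data.size() ? i : 0;
            chunk[i] = data[pos + ii];
        }
        fun(chunk);
    }
}

// Usage: 10 values in chunks of 4; the tail {8, 9} is padded with 8.
inline void for_each_chunk_example()
{
    const std::vector<int> data = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    std::vector<std::array<int, 4>> seen;
    for_each_chunk<4>(data, [&](const std::array<int, 4>& c) { seen.push_back(c); });
    assert(seen.size() == 3);
    assert(seen[2][0] == 8 && seen[2][1] == 9);
    assert(seen[2][2] == 8 && seen[2][3] == 8);
}
```

Padding with a real element rather than zero means the function under test never sees an input outside the reference data, at the cost of checking one element several times.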
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/test_values.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/test_values.h
new file mode 100644
index 00000000000..1327b814290
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/test_values.h
@@ -0,0 +1,227 @@
+#include <experimental/simd>
+#include <initializer_list>
+#include <random>
+#include <cfenv>
+
+template <class T, class A>
+std::experimental::simd<T, A> iif(std::experimental::simd_mask<T, A> k,
+                   const typename std::experimental::simd_mask<T, A>::simd_type &t,
+                   const std::experimental::simd<T, A> &f)
+{
+    auto r = f;
+    where(k, r) = t;
+    return r;
+}
+
+template <class V>
+V epilogue_load(const typename V::value_type *mem, const std::size_t size)
+{
+    const int rem = size % V::size();
+    return where(V([](int i) { return i; }) < rem, V(0))
+        .copy_from(mem + size / V::size() * V::size(), std::experimental::element_aligned);
+}
+
+template <class V, class... F>
+void test_values(const std::initializer_list<typename V::value_type> &inputs,
+                 F &&... fun_pack)
+{
+    for (auto it = inputs.begin(); it + V::size() <= inputs.end(); it += V::size()) {
+        [](auto...) {}((fun_pack(V(&it[0], std::experimental::element_aligned)), 0)...);
+    }
+    [](auto...) {}((fun_pack(epilogue_load<V>(inputs.begin(), inputs.size())), 0)...);
+}
+
+template <class V> struct RandomValues {
+  using T = typename V::value_type;
+  using L = std::numeric_limits<T>;
+  static constexpr bool isfp = std::is_floating_point_v<T>;
+  const std::size_t count;
+  std::conditional_t<std::is_floating_point_v<T>,
+		     std::uniform_real_distribution<T>,
+		     std::uniform_int_distribution<T>>
+    dist;
+  const bool uniform;
+
+  RandomValues(std::size_t count_, T min, T max)
+    : count(count_), dist(min, max), uniform(true)
+  {
+    if constexpr (std::is_floating_point_v<T>)
+      VERIFY(max - min <= L::max());
+  }
+
+  RandomValues(std::size_t count_)
+    : count(count_), dist(isfp ? 1 : L::lowest(), isfp ? 2 : L::max()),
+      uniform(!isfp)
+  {
+  }
+
+  template <typename URBG> V operator()(URBG& gen)
+  {
+    if (uniform)
+      return V([&](int) { return dist(gen); });
+    else
+      {
+	auto exp_dist
+	  = std::normal_distribution<float>(0.f, L::max_exponent * .5f);
+	return V([&](int) {
+	  const T mant = dist(gen);
+	  T fp = 0;
+	  do
+	    {
+	      const int exp = exp_dist(gen);
+	      fp = std::ldexp(mant, exp);
+	    }
+	  while (fp >= L::max() || fp <= L::denorm_min());
+	  fp = gen() & 0x4 ? fp : -fp;
+	  return fp;
+	});
+      }
+  }
+};
+
+static std::mt19937 g_mt_gen{0};
+
+template <class V, class... F>
+void
+test_values(const std::initializer_list<typename V::value_type>& inputs,
+	    RandomValues<V> random, F&&... fun_pack)
+{
+  test_values<V>(inputs, fun_pack...);
+  for (size_t i = 0; i < (random.count + V::size() - 1) / V::size(); ++i)
+    {
+      [](auto...) {}((fun_pack(random(g_mt_gen)), 0)...);
+    }
+}
+
+template <class V, class... F>
+void test_values_2arg(const std::initializer_list<typename V::value_type> &inputs,
+                      F &&... fun_pack)
+{
+    for (auto scalar_it = inputs.begin(); scalar_it != inputs.end(); ++scalar_it) {
+        for (auto it = inputs.begin(); it + V::size() <= inputs.end(); it += V::size()) {
+            [](auto...) {
+            }((fun_pack(V(&it[0], std::experimental::element_aligned), V(*scalar_it)), 0)...);
+        }
+        [](auto...) {
+        }((fun_pack(epilogue_load<V>(inputs.begin(), inputs.size()), V(*scalar_it)),
+           0)...);
+    }
+}
+
+template <class V, class... F>
+void
+test_values_2arg(const std::initializer_list<typename V::value_type>& inputs,
+		 RandomValues<V> random, F&&... fun_pack)
+{
+  test_values_2arg<V>(inputs, fun_pack...);
+  for (size_t i = 0; i < (random.count + V::size() - 1) / V::size(); ++i)
+    {
+      [](auto...) {}((fun_pack(random(g_mt_gen), random(g_mt_gen)), 0)...);
+    }
+}
+
+template <class V, class... F>
+void test_values_3arg(const std::initializer_list<typename V::value_type> &inputs,
+                      F &&... fun_pack)
+{
+    for (auto scalar_it1 = inputs.begin(); scalar_it1 != inputs.end(); ++scalar_it1) {
+        for (auto scalar_it2 = inputs.begin(); scalar_it2 != inputs.end(); ++scalar_it2) {
+            for (auto it = inputs.begin(); it + V::size() <= inputs.end();
+                 it += V::size()) {
+                [](auto...) {}((fun_pack(V(&it[0], std::experimental::element_aligned), V(*scalar_it1),
+                                         V(*scalar_it2)),
+                                0)...);
+            }
+            [](auto...) {}((fun_pack(epilogue_load<V>(inputs.begin(), inputs.size()),
+                                     V(*scalar_it1), V(*scalar_it2)),
+                            0)...);
+        }
+    }
+}
+
+template <class V, class... F>
+void
+test_values_3arg(const std::initializer_list<typename V::value_type>& inputs,
+		 RandomValues<V> random, F&&... fun_pack)
+{
+  test_values_3arg<V>(inputs, fun_pack...);
+  for (size_t i = 0; i < (random.count + V::size() - 1) / V::size(); ++i)
+    {
+      [](auto...) {
+      }((fun_pack(random(g_mt_gen), random(g_mt_gen), random(g_mt_gen)), 0)...);
+    }
+}
+
+#define MAKE_TESTER_2(name_, reference_)                                       \
+  [&](const auto... inputs) {                                                  \
+    const auto totest = name_(inputs...);                                      \
+    using R = std::remove_const_t<decltype(totest)>;                           \
+    auto&& expected = [&](const auto&... vs) -> const R {                      \
+      R tmp = {};                                                              \
+      for (std::size_t i = 0; i < R::size(); ++i)                              \
+	{                                                                      \
+	  tmp[i] = reference_(vs[i]...);                                       \
+	}                                                                      \
+      return tmp;                                                              \
+    };                                                                         \
+    const R expect1 = expected(inputs...);                                     \
+    if constexpr (std::is_floating_point_v<typename R::value_type>)            \
+      {                                                                        \
+	((COMPARE(isnan(totest), isnan(expect1)) << #name_ "(")                \
+	 << ... << inputs)                                                     \
+	  << ") = " << totest << " != " << expect1;                            \
+	const R expect2 = expected(iif(isnan(expect1), 0, inputs)...);         \
+	((FUZZY_COMPARE(name_(iif(isnan(expect1), 0, inputs)...), expect2)     \
+	  << "\nclean = ")                                                     \
+	 << ... << iif(isnan(expect1), 0, inputs));                            \
+      }                                                                        \
+    else                                                                       \
+      {                                                                        \
+	((COMPARE(name_(inputs...), expect1) << "\n" #name_ "(")               \
+	 << ... << inputs)                                                     \
+	  << ")";                                                              \
+      }                                                                        \
+  }
+
+#define MAKE_TESTER(name_) MAKE_TESTER_2(name_, std::name_)
+
+#define MAKE_TESTER_NOFPEXCEPT(name_)                                          \
+  [&](const auto... inputs) {                                                  \
+    std::feclearexcept(FE_ALL_EXCEPT);                                         \
+    auto totest = name_(inputs...);                                            \
+    ((COMPARE(std::fetestexcept(FE_ALL_EXCEPT), 0) << "\n" #name_ "(")         \
+     << ... << inputs)                                                         \
+      << ")";                                                                  \
+    using R = std::remove_const_t<decltype(totest)>;                           \
+    auto&& expected = [&](const auto&... vs) -> const R {                      \
+      R tmp = {};                                                              \
+      for (std::size_t i = 0; i < R::size(); ++i)                              \
+	{                                                                      \
+	  tmp[i] = std::name_(vs[i]...);                                       \
+	}                                                                      \
+      return tmp;                                                              \
+    };                                                                         \
+    const R expect1 = expected(inputs...);                                     \
+    if constexpr (std::is_floating_point_v<typename R::value_type>)            \
+      {                                                                        \
+	((COMPARE(isnan(totest), isnan(expect1)) << #name_ "(")                \
+	 << ... << inputs)                                                     \
+	  << ") = " << totest << " != " << expect1;                            \
+	const R expect2 = expected(iif(isnan(expect1), 0, inputs)...);         \
+	std::feclearexcept(FE_ALL_EXCEPT);                                     \
+	asm volatile("");                                                      \
+	totest = name_(iif(isnan(expect1), 0, inputs)...);                     \
+	asm volatile("");                                                      \
+	((COMPARE(std::fetestexcept(FE_ALL_EXCEPT), 0) << "\n" #name_ "(")     \
+	 << ... << inputs)                                                     \
+	  << ")";                                                              \
+	FUZZY_COMPARE(totest, expect2);                                        \
+      }                                                                        \
+    else                                                                       \
+      {                                                                        \
+	((COMPARE(totest, expect1) << "\n" #name_ "(") << ... << inputs)       \
+	  << ")";                                                              \
+      }                                                                        \
+  }
+
+// vim: foldmethod=marker ts=8 sw=2 noet sts=2
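
Review note on test_values.h: the test_values overloads invoke every callable in fun_pack by expanding the pack into the argument list of an empty generic lambda, `[](auto...) {}((fun_pack(x), 0)...)`. The order in which function arguments are evaluated is unspecified, so the callables may run in any order. The sketch below shows the idiom next to a C++17 fold over the comma operator, which evaluates left to right. call_all and call_all_ordered are hypothetical names used only here.

```cpp
#include <cassert>

// Idiom from the header: each (funs(value), 0) becomes one argument of the
// no-op lambda. All callables run, but in an unspecified order.
template <class T, class... F>
void call_all(const T& value, F&&... funs)
{
    [](auto...) {}((funs(value), 0)...);
}

// Alternative sketch: a fold over the comma operator runs left to right.
template <class T, class... F>
void call_all_ordered(const T& value, F&&... funs)
{
    (static_cast<void>(funs(value)), ...);
}

inline void call_all_example()
{
    int sum = 0;
    call_all(3, [&](int v) { sum += v; }, [&](int v) { sum += 10 * v; });
    assert(sum == 33);  // order does not matter for a sum

    int order = 0;
    call_all_ordered(0, [&](int) { order = order * 10 + 1; },
                     [&](int) { order = order * 10 + 2; });
    assert(order == 12);  // first callable ran before the second
}
```

For the testers generated by MAKE_TESTER the order is irrelevant because each one checks its own function, so this only matters if callables with shared state are ever passed.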
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/ulp.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/ulp.h
new file mode 100644
index 00000000000..0c61255d381
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/ulp.h
@@ -0,0 +1,113 @@
+/*{{{
+Copyright © 2011-2018 Matthias Kretz <kretz@kde.org>
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+    * Redistributions of source code must retain the above copyright
+      notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+    * Neither the names of contributing organizations nor the
+      names of its contributors may be used to endorse or promote products
+      derived from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR ANY
+DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+}}}*/
+
+#ifndef ULP_H
+#define ULP_H
+
+#include <cmath>
+#include <experimental/simd>
+#include <limits>
+#include <type_traits>
+#include <cfenv>
+
+namespace vir {
+namespace test {
+template <typename T, typename R = typename T::value_type>
+R
+value_type_impl(int);
+
+template <typename T>
+T
+value_type_impl(float);
+
+template <typename T> using value_type_t = decltype(value_type_impl<T>(int()));
+
+template <typename T>
+inline T
+ulp_distance(const T& val_, const T& ref_)
+{
+  if constexpr (std::is_floating_point_v<value_type_t<T>>)
+    {
+      const int fp_exceptions = std::fetestexcept(FE_ALL_EXCEPT);
+      T val = val_;
+      T ref = ref_;
+
+      T diff = T();
+
+      using std::abs;
+      using std::fpclassify;
+      using std::frexp;
+      using std::isnan;
+      using std::isinf;
+      using std::ldexp;
+      using std::max;
+      using std::experimental::where;
+      using limits = std::numeric_limits<value_type_t<T>>;
+
+      where(ref == 0, val) = abs(val);
+      where(ref == 0, diff) = 1;
+      where(ref == 0, ref) = limits::min();
+      where(isinf(ref) && ref == val, ref)
+        = 0; // where(val_ == ref_) = 0 below will fix it up
+
+      where(val == 0, ref) = abs(ref);
+      where(val == 0, diff) += 1;
+      where(val == 0, val) = limits::min();
+
+      using I = decltype(fpclassify(std::declval<T>()));
+      I exp = {};
+      frexp(ref, &exp);
+      // lower bound for exp must be min_exponent to scale the resulting
+      // difference from a denormal correctly
+      exp = max(exp, I(limits::min_exponent));
+      diff += ldexp(abs(ref - val), limits::digits - exp);
+      where(val_ == ref_ || (isnan(val_) && isnan(ref_)), diff) = T();
+      std::feclearexcept(FE_ALL_EXCEPT ^ fp_exceptions);
+      return diff;
+    }
+  else
+    {
+      if (val_ > ref_)
+	return val_ - ref_;
+      else
+	return ref_ - val_;
+    }
+}
+
+template <typename T>
+inline T
+ulp_distance_signed(const T& _val, const T& _ref)
+{
+  using std::copysign;
+  return copysign(ulp_distance(_val, _ref), _val - _ref);
+}
+} // namespace test
+} // namespace vir
+
+#endif // ULP_H
+
+// vim: sw=2 et sts=2 foldmethod=marker
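
Review note on ulp.h: ulp_distance scales |ref - val| by 2^(digits - exp), where exp is the binary exponent of ref as returned by frexp, clamped from below to min_exponent so that denormal references are scaled correctly. The result is the error in units of the last place of ref. A scalar sketch of the floating-point branch follows. ulp_distance_scalar is a hypothetical name, and the early return replaces the final where() fix-up of the vector version.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical scalar counterpart of vir::test::ulp_distance for one value.
template <class T>
T ulp_distance_scalar(T val, T ref)
{
    using limits = std::numeric_limits<T>;
    // Equal values (including equal infinities) and NaN pairs count as 0 ULP.
    if (val == ref || (std::isnan(val) && std::isnan(ref)))
        return T(0);
    T diff = 0;
    if (ref == 0) {  // measure against the smallest normal number instead of 0
        val = std::abs(val);
        diff = 1;
        ref = limits::min();
    }
    if (val == 0) {
        ref = std::abs(ref);
        diff += 1;
        val = limits::min();
    }
    int exp = 0;
    std::frexp(ref, &exp);  // ref = m * 2^exp with 0.5 <= |m| < 1
    exp = std::max(exp, int(limits::min_exponent));
    return diff + std::ldexp(std::abs(ref - val), limits::digits - exp);
}

inline void ulp_distance_scalar_example()
{
    // The next representable double above 1.0 is exactly one ULP away.
    assert(ulp_distance_scalar(std::nextafter(1.0, 2.0), 1.0) == 1.0);
    assert(ulp_distance_scalar(1.0, 1.0) == 0.0);
}
```

For ref = 1.0 frexp yields exp = 1, so the difference 2^-52 is scaled by 2^(53 - 1) and the distance comes out as exactly 1.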
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h
new file mode 100644
index 00000000000..eca79417e09
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h
@@ -0,0 +1,250 @@
+#ifndef TESTS_BITS_VERIFY_H_
+#define TESTS_BITS_VERIFY_H_
+
+#include <experimental/simd>
+#include <sstream>
+#include <iomanip>
+#include "ulp.h"
+
+#ifdef _GLIBCXX_SIMD_HAVE_NEON
+// work around PR89357:
+#define alignas(...) __attribute__((aligned(__VA_ARGS__)))
+#endif
+
+using schar = signed char;
+using uchar = unsigned char;
+using ushort = unsigned short;
+using uint = unsigned int;
+using ulong = unsigned long;
+using llong = long long;
+using ullong = unsigned long long;
+using ldouble = long double;
+using wchar = wchar_t;
+using char16 = char16_t;
+using char32 = char32_t;
+
+template <class T>
+T
+make_value_unknown(const T& x)
+{
+  if constexpr (std::is_constructible_v<T, const volatile T&>)
+    {
+      const volatile T& y = x;
+      return y;
+    }
+  else
+    {
+      T y = x;
+      asm("" : "+m"(y));
+      return y;
+    }
+}
+
+class verify
+{
+  const bool m_failed = false;
+
+  template <typename T,
+	    typename = decltype(std::declval<std::stringstream&>()
+				<< std::declval<const T&>())>
+  void print(const T& x, int) const
+  {
+    std::stringstream ss;
+    ss << x;
+    __builtin_fprintf(stderr, "%s", ss.str().c_str());
+  }
+
+  template <typename T>
+  void print(const T& x, ...) const
+  {
+    if constexpr (std::experimental::is_simd_v<T>)
+      {
+	std::stringstream ss;
+	if constexpr (std::is_floating_point_v<typename T::value_type>)
+	  {
+	    ss << "\n(" << x[0] << " == " << std::hexfloat << x[0]
+	       << std::defaultfloat << ')';
+	    for (unsigned i = 1; i < x.size(); ++i)
+	      {
+		ss << (i % 4 == 0 ? ",\n(" : ", (") << x[i]
+		   << " == " << std::hexfloat << x[i] << std::defaultfloat
+		   << ')';
+	      }
+	  }
+	else
+	  {
+	    ss << +x[0];
+	    for (unsigned i = 1; i < x.size(); ++i)
+	      {
+		ss << ", " << +x[i];
+	      }
+	  }
+	__builtin_fprintf(stderr, "%s", ss.str().c_str());
+      }
+    else if constexpr (std::experimental::is_simd_mask_v<T>)
+      {
+	__builtin_fprintf(stderr, (x[0] ? "[1" : "[0"));
+	for (unsigned i = 1; i < x.size(); ++i)
+	  {
+	    __builtin_fprintf(stderr, (x[i] ? "1" : "0"));
+	  }
+	__builtin_fprintf(stderr, "]");
+      }
+    else
+      {
+	print_hex(&x, sizeof(T));
+      }
+  }
+
+  void print_hex(const void* x, std::size_t n) const
+  {
+    __builtin_fprintf(stderr, "0x");
+    const auto* bytes = static_cast<const unsigned char*>(x);
+    for (std::size_t i = 0; i < n; ++i)
+      {
+	__builtin_fprintf(stderr, (i && i % 4 == 0) ? "'%02x" : "%02x",
+			  bytes[i]);
+      }
+  }
+
+public:
+  template <typename... Ts>
+  verify(bool        ok,
+	 const char* file,
+	 const int   line,
+	 const char* func,
+	 const char* cond,
+	 const Ts&... extra_info)
+  : m_failed(!ok)
+  {
+    if (m_failed)
+      {
+	__builtin_fprintf(stderr, "%s:%d: (%s): Assertion '%s' failed.\n", file,
+			  line, func, cond);
+	auto &&unused [[maybe_unused]] = {0, (print(extra_info, int()), 0)...};
+      }
+  }
+
+  ~verify()
+  {
+    if (m_failed)
+      {
+	__builtin_fprintf(stderr, "\n");
+	__builtin_abort();
+      }
+  }
+
+  template <typename T>
+  const verify& operator<<(const T& x) const
+  {
+    if (m_failed)
+      {
+	print(x, int());
+      }
+    return *this;
+  }
+};
+
+#define COMPARE(_a, _b)                                                        \
+  [&](auto&& _aa, auto&& _bb) {                                                \
+    return verify(std::experimental::all_of(_aa == _bb), __FILE__, __LINE__,   \
+		  __PRETTY_FUNCTION__, "all_of(" #_a " == " #_b ")",           \
+		  #_a " = ", _aa, "\n" #_b " = ", _bb);                        \
+  }((_a), (_b))
+
+#define VERIFY(_test)                                                          \
+  verify(_test, __FILE__, __LINE__, __PRETTY_FUNCTION__, #_test)
+
+// ulp_distance_signed can raise FP exceptions and thus must only be evaluated
+// when the comparison has failed
+#define ULP_COMPARE(_a, _b, _allowed_distance)                                 \
+  [&](auto&& _aa, auto&& _bb) {                                                \
+    const bool success = std::experimental::all_of(                            \
+      vir::test::ulp_distance(_aa, _bb) <= (_allowed_distance));               \
+    return verify(success, __FILE__, __LINE__, __PRETTY_FUNCTION__,            \
+		  "all_of(" #_a " ~~ " #_b ")", #_a " = ", _aa,                \
+		  "\n" #_b " = ", _bb, "\ndistance = ",                        \
+		  success ? 0 : vir::test::ulp_distance_signed(_aa, _bb));     \
+  }((_a), (_b))
+
+namespace vir
+{
+namespace test
+{
+  template <typename T>
+  inline T _S_fuzzyness = 0;
+  template <typename T>
+  void setFuzzyness(T x)
+  {
+    _S_fuzzyness<T> = x;
+  }
+} // namespace test
+} // namespace vir
+
+#define FUZZY_COMPARE(_a, _b)                                                  \
+  ULP_COMPARE(                                                                 \
+    _a, _b,                                                                    \
+    vir::test::_S_fuzzyness<vir::test::value_type_t<decltype((_a) + (_b))>>)
+
+template <typename V>
+void test();
+template <typename V>
+void invoke_test(...)
+{
+}
+template <typename V, typename = decltype(V())>
+void invoke_test(int)
+{
+  test<V>();
+  __builtin_fprintf(stderr, "PASS: %s\n", __PRETTY_FUNCTION__);
+}
+
+template <class T> void iterate_abis()/*{{{*/
+{
+  using namespace std::experimental::parallelism_v2;
+#ifndef TESTFIXEDSIZE
+  invoke_test<simd<T, simd_abi::scalar>>(int());
+  invoke_test<simd<T, simd_abi::_VecBuiltin<16>>>(int());
+  invoke_test<simd<T, simd_abi::_VecBltnBtmsk<64>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<3>>>(int());
+#ifdef STRESSTEST
+  invoke_test<simd<T, simd_abi::_VecBuiltin<12>>>(int());
+  invoke_test<simd<T, simd_abi::_VecBuiltin<32>>>(int());
+  invoke_test<simd<T, simd_abi::_VecBltnBtmsk<56>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<4>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<12>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<24>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<28>>>(int());
+#endif
+#else
+  invoke_test<simd<T, simd_abi::fixed_size<1>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<2>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<5>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<6>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<7>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<8>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<9>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<10>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<11>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<13>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<14>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<15>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<16>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<17>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<18>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<19>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<20>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<21>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<22>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<23>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<25>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<26>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<27>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<29>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<30>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<31>>>(int());
+  invoke_test<simd<T, simd_abi::fixed_size<32>>>(int());
+#endif
+}/*}}}*/
+
+#endif  // TESTS_BITS_VERIFY_H_
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/broadcast.h b/libstdc++-v3/testsuite/experimental/simd/tests/broadcast.h
new file mode 100644
index 00000000000..76ac8143fe1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/broadcast.h
@@ -0,0 +1,87 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+enum unscoped_enum
+{
+  foo
+};
+enum class scoped_enum
+{
+  bar
+};
+struct convertible
+{
+  operator int();
+  operator float();
+};
+
+template <typename V> void test()
+{
+  using T = typename V::value_type;
+  VERIFY(std::experimental::is_simd_v<V>);
+  VERIFY(std::experimental::is_abi_tag_v<typename V::abi_type>);
+
+  {
+    V x;     // not initialized
+    x = V{}; // default broadcasts 0
+    COMPARE(x, V(0));
+    COMPARE(x, V());
+    COMPARE(x, V{});
+    x = V(); // default broadcasts 0
+    COMPARE(x, V(0));
+    COMPARE(x, V());
+    COMPARE(x, V{});
+    x = 0;
+    COMPARE(x, V(0));
+    COMPARE(x, V());
+    COMPARE(x, V{});
+
+    for (std::size_t i = 0; i < V::size(); ++i)
+      {
+	COMPARE(T(x[i]), T(0)) << "i = " << i;
+	COMPARE(x[i], T(0)) << "i = " << i;
+      }
+  }
+
+  V x = 3;
+  V y = T(0);
+  for (std::size_t i = 0; i < V::size(); ++i)
+    {
+      COMPARE(x[i], T(3)) << "i = " << i;
+      COMPARE(y[i], T(0)) << "i = " << i;
+    }
+  y = 3;
+  COMPARE(x, y);
+
+  VERIFY(!(is_substitution_failure<V&, unscoped_enum, assignment>) );
+  VERIFY((is_substitution_failure<V&, scoped_enum, assignment>) );
+  COMPARE((is_substitution_failure<V&, convertible, assignment>),
+	  (!std::is_convertible<convertible, T>::value));
+  COMPARE((is_substitution_failure<V&, long double, assignment>),
+	  (sizeof(long double) > sizeof(T) || std::is_integral<T>::value));
+  COMPARE((is_substitution_failure<V&, double, assignment>),
+	  (sizeof(double) > sizeof(T) || std::is_integral<T>::value));
+  COMPARE((is_substitution_failure<V&, float, assignment>),
+	  (sizeof(float) > sizeof(T) || std::is_integral<T>::value));
+  COMPARE((is_substitution_failure<V&, long long, assignment>),
+	  (has_less_bits<T, long long>() || std::is_unsigned<T>::value));
+  COMPARE((is_substitution_failure<V&, unsigned long long, assignment>),
+	  (has_less_bits<T, unsigned long long>()));
+  COMPARE((is_substitution_failure<V&, long, assignment>),
+	  (has_less_bits<T, long>() || std::is_unsigned<T>::value));
+  COMPARE((is_substitution_failure<V&, unsigned long, assignment>),
+	  (has_less_bits<T, unsigned long>()));
+  // int broadcast *always* works:
+  VERIFY(!(is_substitution_failure<V&, int, assignment>) );
+  // uint broadcast works for any unsigned T:
+  COMPARE((is_substitution_failure<V&, unsigned int, assignment>),
+	  (!std::is_unsigned<T>::value && has_less_bits<T, unsigned int>()));
+  COMPARE((is_substitution_failure<V&, short, assignment>),
+	  (has_less_bits<T, short>() || std::is_unsigned<T>::value));
+  COMPARE((is_substitution_failure<V&, unsigned short, assignment>),
+	  (has_less_bits<T, unsigned short>()));
+  COMPARE((is_substitution_failure<V&, signed char, assignment>),
+	  (has_less_bits<T, signed char>() || std::is_unsigned<T>::value));
+  COMPARE((is_substitution_failure<V&, unsigned char, assignment>),
+	  (has_less_bits<T, unsigned char>()));
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/casts.h b/libstdc++-v3/testsuite/experimental/simd/tests/casts.h
new file mode 100644
index 00000000000..ecf757a4fc7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/casts.h
@@ -0,0 +1,132 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/conversions.h"
+
+using std::experimental::simd_cast;
+using std::experimental::static_simd_cast;
+
+template <class T, size_t N> struct gen_cast
+{
+  std::array<T, N> data;
+  template <class V> gen_cast(const V& v)
+  {
+    for (size_t i = 0; i < V::size(); ++i)
+      {
+	data[i] = static_cast<T>(v[i]);
+      }
+  }
+  template <class I> constexpr T operator()(I) { return data[I::value]; }
+};
+
+template <class V, class To> struct gen_seq_t
+{
+  using From = typename V::value_type;
+  const size_t N = cvt_input_data<From, To>.size();
+  size_t offset = 0;
+  constexpr void operator++() { offset += V::size(); }
+  explicit constexpr operator bool() const { return offset < N; }
+  template <class I> constexpr From operator()(I) const
+  {
+    size_t i = I::value + offset;
+    return i < N ? cvt_input_data<From, To>[i] : From(i);
+  }
+};
+
+template <class To> struct foo
+{
+  template <class T> auto operator()(const T& v) -> decltype(simd_cast<To>(v));
+};
+
+template <typename V, typename To>
+void
+casts()
+{
+  using From = typename V::value_type;
+  constexpr auto N = V::size();
+  if constexpr (N <= std::experimental::simd_abi::max_fixed_size<To>)
+    {
+      using W = std::experimental::fixed_size_simd<To, N>;
+
+      if constexpr (std::is_integral_v<From>)
+	{
+	  using A = typename V::abi_type;
+	  using TU = std::make_unsigned_t<From>;
+	  using TS = std::make_signed_t<From>;
+	  COMPARE(typeid(static_simd_cast<TU>(V())),
+		  typeid(std::experimental::simd<TU, A>));
+	  COMPARE(typeid(static_simd_cast<TS>(V())),
+		  typeid(std::experimental::simd<TS, A>));
+	}
+
+      using is_simd_cast_allowed
+	= decltype(vir::test::sfinae_is_callable_t<const V&>(foo<To>()));
+
+      COMPARE(
+	is_simd_cast_allowed::value,
+	std::numeric_limits<From>::digits <= std::numeric_limits<To>::digits
+	  && std::numeric_limits<From>::max() <= std::numeric_limits<To>::max()
+	  && !(std::is_signed<From>::value && std::is_unsigned<To>::value));
+
+      if constexpr (is_simd_cast_allowed::value)
+	{
+	  for (gen_seq_t<V, To> gen_seq; gen_seq; ++gen_seq)
+	    {
+	      const V seq(gen_seq);
+	      COMPARE(simd_cast<V>(seq), seq);
+	      COMPARE(simd_cast<W>(seq), W(gen_cast<To, N>(seq)))
+		<< "seq = " << seq;
+	      auto test = simd_cast<To>(seq);
+	      // decltype(test) is not W if
+	      // a) V::abi_type is not fixed_size and
+	      // b.1) V::value_type and To are integral and of equal rank or
+	      // b.2) V::value_type and To are equal
+	      COMPARE(test, decltype(test)(gen_cast<To, N>(seq)));
+	      if (std::is_same<To, From>::value)
+		{
+		  COMPARE(typeid(decltype(test)), typeid(V));
+		}
+	    }
+	}
+
+      for (gen_seq_t<V, To> gen_seq; gen_seq; ++gen_seq)
+	{
+	  const V seq(gen_seq);
+	  COMPARE(static_simd_cast<V>(seq), seq);
+	  COMPARE(static_simd_cast<W>(seq), W(gen_cast<To, N>(seq))) << '\n'
+								     << seq;
+	  auto test = static_simd_cast<To>(seq);
+	  // decltype(test) is not W if
+	  // a) V::abi_type is not fixed_size and
+	  // b.1) V::value_type and To are integral and of equal rank or
+	  // b.2) V::value_type and To are equal
+	  COMPARE(test, decltype(test)(gen_cast<To, N>(seq)));
+	  if (std::is_same<To, From>::value)
+	    {
+	      COMPARE(typeid(decltype(test)), typeid(V));
+	    }
+	}
+    }
+}
+
+template <typename V>
+void
+test()
+{
+  casts<V, long double>();
+  casts<V, double>();
+  casts<V, float>();
+  casts<V, long long>();
+  casts<V, unsigned long long>();
+  casts<V, unsigned long>();
+  casts<V, long>();
+  casts<V, int>();
+  casts<V, unsigned int>();
+  casts<V, short>();
+  casts<V, unsigned short>();
+  casts<V, char>();
+  casts<V, signed char>();
+  casts<V, unsigned char>();
+  casts<V, char32_t>();
+  casts<V, char16_t>();
+  casts<V, wchar_t>();
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/fpclassify.h b/libstdc++-v3/testsuite/experimental/simd/tests/fpclassify.h
new file mode 100644
index 00000000000..bd9aff386ad
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/fpclassify.h
@@ -0,0 +1,64 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/test_values.h"
+#include <cfenv>
+
+template <typename F>
+auto
+verify_no_fp_exceptions(F&& fun)
+{
+  std::feclearexcept(FE_ALL_EXCEPT);
+  auto r = fun();
+  COMPARE(std::fetestexcept(FE_ALL_EXCEPT), 0);
+  return r;
+}
+
+#define NOFPEXCEPT(...) verify_no_fp_exceptions([&]() { return __VA_ARGS__; })
+
+template <typename V>
+void
+test()
+{
+  using limits = std::numeric_limits<typename V::value_type>;
+  test_values<V>(
+    {
+      0., 1., -1.,
+#if __GCC_IEC_559 >= 2
+	-0., limits::infinity(), -limits::infinity(), limits::denorm_min(),
+	-limits::denorm_min(), limits::quiet_NaN(),
+#ifdef __SUPPORT_SNAN__
+	limits::signaling_NaN(),
+#endif
+#endif
+	limits::max(), -limits::max(), limits::min(), limits::min() * 0.9,
+	-limits::min(), -limits::min() * 0.9
+    },
+    [](const V input) {
+      using intv = std::experimental::fixed_size_simd<int, V::size()>;
+      COMPARE(NOFPEXCEPT(isfinite(input)),
+	      !V([&](auto i) { return std::isfinite(input[i]) ? 0 : 1; }))
+	<< input;
+      COMPARE(NOFPEXCEPT(isinf(input)),
+	      !V([&](auto i) { return std::isinf(input[i]) ? 0 : 1; }))
+	<< input;
+      COMPARE(NOFPEXCEPT(isnan(input)),
+	      !V([&](auto i) { return std::isnan(input[i]) ? 0 : 1; }))
+	<< input;
+      COMPARE(NOFPEXCEPT(isnormal(input)),
+	      !V([&](auto i) { return std::isnormal(input[i]) ? 0 : 1; }))
+	<< input;
+      COMPARE(NOFPEXCEPT(signbit(input)),
+	      !V([&](auto i) { return std::signbit(input[i]) ? 0 : 1; }))
+	<< input;
+      COMPARE(NOFPEXCEPT(isunordered(input, V())),
+	      !V([&](auto i) { return std::isunordered(input[i], 0) ? 0 : 1; }))
+	<< input;
+      COMPARE(NOFPEXCEPT(isunordered(V(), input)),
+	      !V([&](auto i) { return std::isunordered(0, input[i]) ? 0 : 1; }))
+	<< input;
+      COMPARE(NOFPEXCEPT(fpclassify(input)),
+	      intv([&](auto i) { return std::fpclassify(input[i]); }))
+	<< input;
+    });
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/frexp.h b/libstdc++-v3/testsuite/experimental/simd/tests/frexp.h
new file mode 100644
index 00000000000..deafa2f1296
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/frexp.h
@@ -0,0 +1,88 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+  using int_v = std::experimental::fixed_size_simd<int, V::size()>;
+  using limits = std::numeric_limits<typename V::value_type>;
+  test_values<V>(
+    {
+      0, 0.25, 0.5, 1, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
+	20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 32, 31, -0., -0.25, -0.5, -1,
+	-3, -4, -6, -7, -8, -9, -10, -11, -12, -13, -14, -15, -16, -17, -18,
+	-19, -20, -21, -22, -23, -24, -25, -26, -27, -28, -29, -32, -31,
+#if __GCC_IEC_559 >= 2
+	limits::denorm_min(), -limits::denorm_min(), limits::min() / 2,
+	-limits::min() / 2,
+#endif
+	limits::max(), -limits::max(), limits::max() * 0.123f,
+	-limits::max() * 0.123f
+    },
+    [](const V input) {
+      V expectedFraction;
+      const int_v expectedExponent([&](auto i) {
+	int exp;
+	expectedFraction[i] = std::frexp(input[i], &exp);
+	return exp;
+      });
+      int_v exponent = {};
+      const V fraction = frexp(input, &exponent);
+      COMPARE(fraction, expectedFraction)
+	<< ", input = " << input << ", delta: " << fraction - expectedFraction;
+      COMPARE(exponent, expectedExponent)
+	<< "\ninput: " << input << ", fraction: " << fraction;
+    });
+#ifdef __STDC_IEC_559__
+  test_values<V>(
+    // If x is a NaN, a NaN is returned, and the value of *exp is unspecified.
+    //
+    // If x is positive infinity (negative infinity), positive infinity
+    // (negative infinity) is returned, and the value of *exp is unspecified.
+    // This behavior is only guaranteed with C's Annex F when __STDC_IEC_559__
+    // is defined.
+    {limits::quiet_NaN(),
+     limits::infinity(),
+     -limits::infinity(),
+     limits::quiet_NaN(),
+     limits::infinity(),
+     -limits::infinity(),
+     limits::quiet_NaN(),
+     limits::infinity(),
+     -limits::infinity(),
+     limits::quiet_NaN(),
+     limits::infinity(),
+     -limits::infinity(),
+     limits::quiet_NaN(),
+     limits::infinity(),
+     -limits::infinity(),
+     limits::denorm_min(),
+     limits::denorm_min() * 1.72,
+     -limits::denorm_min(),
+     -limits::denorm_min() * 1.72,
+     0.,
+     -0.,
+     1,
+     -1},
+    [](const V input) {
+      const V expectedFraction([&](auto i) {
+	int exp;
+	return std::frexp(input[i], &exp);
+      });
+      int_v exponent = {};
+      const V fraction = frexp(input, &exponent);
+      COMPARE(isnan(fraction), isnan(expectedFraction))
+	<< fraction << ", input = " << input
+	<< ", delta: " << fraction - expectedFraction;
+      COMPARE(isinf(fraction), isinf(expectedFraction))
+	<< fraction << ", input = " << input
+	<< ", delta: " << fraction - expectedFraction;
+      COMPARE(signbit(fraction), signbit(expectedFraction))
+	<< fraction << ", input = " << input
+	<< ", delta: " << fraction - expectedFraction;
+    });
+#endif
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/generator.h b/libstdc++-v3/testsuite/experimental/simd/tests/generator.h
new file mode 100644
index 00000000000..5b824772962
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/generator.h
@@ -0,0 +1,39 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+template <class V> struct call_generator
+{
+  template <class F> auto operator()(const F& f) -> decltype(V(f));
+};
+
+using schar = signed char;
+using uchar = unsigned char;
+using ullong = unsigned long long;
+
+template <typename V>
+void
+test()
+{
+  using T = typename V::value_type;
+  V x([](int) { return T(1); });
+  COMPARE(x, V(1));
+  x = V(
+    [](int) { return 1; }); // unconditionally returns int from generator lambda
+  COMPARE(x, V(1));
+  x = V([](auto i) { return T(i); });
+  COMPARE(x, V([](T i) { return i; }));
+
+  VERIFY((
+    sfinae_is_callable<int (&)(int)>(call_generator<V>()))); // int always works
+  COMPARE(sfinae_is_callable<schar (&)(int)>(call_generator<V>()),
+	  std::is_signed<T>::value);
+  COMPARE(sfinae_is_callable<uchar (&)(int)>(call_generator<V>()),
+	  !(std::is_signed_v<T> && sizeof(T) <= sizeof(uchar)));
+  COMPARE(sfinae_is_callable<float (&)(int)>(call_generator<V>()),
+	  (std::is_floating_point<T>::value));
+
+  COMPARE(sfinae_is_callable<ullong (&)(int)>(call_generator<V>()),
+	  std::numeric_limits<T>::max() >= std::numeric_limits<ullong>::max()
+	    && std::numeric_limits<T>::digits
+		 >= std::numeric_limits<ullong>::digits);
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.h b/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.h
new file mode 100644
index 00000000000..b06abe31b46
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.h
@@ -0,0 +1,131 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+  vir::test::setFuzzyness<float>(1);
+  vir::test::setFuzzyness<double>(1);
+  vir::test::setFuzzyness<long double>(2); // because of the bad reference
+
+  using T = typename V::value_type;
+  using limits = std::numeric_limits<T>;
+  // 3-arg std::hypot needs to be fixed; this is a better reference:
+  auto&& hypot3 = [](T x, T y, T z) -> T {
+    x = std::abs(x);
+    y = std::abs(y);
+    z = std::abs(z);
+    if (std::isinf(x) || std::isinf(y) || std::isinf(z))
+      {
+	return limits::infinity();
+      }
+    else if (std::isnan(x) || std::isnan(y) || std::isnan(z))
+      {
+	return limits::quiet_NaN();
+      }
+    else if (x == y && y == z)
+      {
+	return x * std::sqrt(T(3));
+      }
+    else if (z == 0 && y == 0)
+      return x;
+    else if (x == 0 && z == 0)
+      return y;
+    else if (x == 0 && y == 0)
+      return z;
+    else if (x == 0)
+      return std::hypot(y, z);
+    else if (y == 0)
+      return std::hypot(x, z);
+    else if (z == 0)
+      return std::hypot(x, y);
+    else
+      {
+	long double hi = std::max(std::max(x, y), z);
+	long double lo0 = std::min(std::max(x, y), z);
+	long double lo1 = std::min(x, y);
+	if (std::isinf(x * x + y * y + z * z) || 0 == (lo0 * lo0 + lo1 * lo1))
+	  {
+	    lo0 /= hi;
+	    lo1 /= hi;
+	    return std::abs(hi) * std::sqrt(1 + (lo0 * lo0 + lo1 * lo1));
+	  }
+	else
+	  {
+	    return std::sqrt(hi * hi + (lo0 * lo0 + lo1 * lo1));
+	  }
+      }
+  };
+  test_values_3arg<V>(
+    {
+#ifdef __STDC_IEC_559__
+      limits::quiet_NaN(), limits::infinity(), -limits::infinity(),
+      limits::min() / 3, -0., limits::denorm_min(),
+#endif
+      0., 1., -1., limits::min(), limits::max(), -limits::max()},
+    {100000}, MAKE_TESTER_2(hypot, hypot3));
+  COMPARE(hypot(V(limits::max()), V(limits::max()), V()),
+	  V(limits::infinity()));
+  COMPARE(hypot(V(limits::max()), V(), V(limits::max())),
+	  V(limits::infinity()));
+  COMPARE(hypot(V(), V(limits::max()), V(limits::max())),
+	  V(limits::infinity()));
+  COMPARE(hypot(V(limits::min()), V(limits::min()), V(limits::min())),
+	  V(limits::min() * std::sqrt(T(3))));
+  VERIFY((sfinae_is_callable<V, V, V>(
+    [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<T, T, V>(
+    [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<V, T, T>(
+    [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<T, V, T>(
+    [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<T, V, V>(
+    [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<V, T, V>(
+    [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<V, V, T>(
+    [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<int, int, V>(
+    [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<int, V, int>(
+    [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<V, T, int>(
+    [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+
+  vir::test::setFuzzyness<float>(0);
+  vir::test::setFuzzyness<double>(0);
+  test_values_3arg<V>(
+    {
+#ifdef __STDC_IEC_559__
+      limits::quiet_NaN(), limits::infinity(), -limits::infinity(), -0.,
+      limits::min() / 3, limits::denorm_min(),
+#endif
+      0., limits::min(), limits::max()},
+    {10000, -limits::max() / 2, limits::max() / 2}, MAKE_TESTER(fma));
+  VERIFY((sfinae_is_callable<V, V, V>(
+    [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<T, T, V>(
+    [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<V, T, T>(
+    [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<T, V, T>(
+    [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<T, V, V>(
+    [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<V, T, V>(
+    [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<V, V, T>(
+    [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<int, int, V>(
+    [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<int, V, int>(
+    [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+  VERIFY((sfinae_is_callable<V, T, int>(
+    [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+}
+
+// vim: ts=8 noet sw=2 sts=2
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/integer_operators.h b/libstdc++-v3/testsuite/experimental/simd/tests/integer_operators.h
new file mode 100644
index 00000000000..384bfa4d897
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/integer_operators.h
@@ -0,0 +1,220 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include "bits/verify.h"
+#include "bits/make_vec.h"
+#include "bits/metahelpers.h"
+
+// for_constexpr {{{1
+template <typename T, T Begin, T End, T Stride = 1, typename F>
+void
+for_constexpr(F&& fun)
+{
+  if constexpr (Begin <= End)
+    {
+      fun(std::integral_constant<T, Begin>());
+      if constexpr (Begin < End)
+	{
+	  for_constexpr<T, Begin + Stride, End, Stride>(static_cast<F&&>(fun));
+	}
+    }
+}
+
+template <typename V>
+void
+test() //{{{1
+{
+  using T = typename V::value_type;
+  if constexpr (std::is_integral_v<T>)
+    {
+      constexpr int nbits(sizeof(T) * CHAR_BIT);
+      constexpr int n_promo_bits = std::max(nbits, int(sizeof(int) * CHAR_BIT));
+
+      // complement{{{2
+      COMPARE(~V(), V(~T()));
+      COMPARE(~V(~T()), V());
+
+      { // modulus{{{2
+	V x = make_vec<V>({3, 4}, 2);
+	COMPARE(x % x, V(0));
+	V y = x - 1;
+	COMPARE(x % y, V(1));
+	y = x + 1;
+	COMPARE(x % y, x);
+	if (std::is_signed<T>::value)
+	  {
+	    x = -x;
+	    COMPARE(x % y, x);
+	    x = -y;
+	    COMPARE(x % y, V(0));
+	    x = x - 1;
+	    COMPARE(x % y, V(-1));
+	    x %= y;
+	    COMPARE(x, V(-1));
+	  }
+      }
+
+      { // bit_and{{{2
+	V x = make_vec<V>({3, 4, 5}, 8);
+	COMPARE(x & x, x);
+	COMPARE(x & ~x, V());
+	COMPARE(x & V(), V());
+	COMPARE(V() & x, V());
+	V y = make_vec<V>({1, 5, 3}, 8);
+	COMPARE(x & y, make_vec<V>({1, 4, 1}, 8));
+	x &= y;
+	COMPARE(x, make_vec<V>({1, 4, 1}, 8));
+      }
+
+      { // bit_or{{{2
+	V x = make_vec<V>({3, 4, 5}, 8);
+	COMPARE(x | x, x);
+	COMPARE(x | ~x, ~V());
+	COMPARE(x | V(), x);
+	COMPARE(V() | x, x);
+	V y = make_vec<V>({1, 5, 3}, 8);
+	COMPARE(x | y, make_vec<V>({3, 5, 7}, 8));
+	x |= y;
+	COMPARE(x, make_vec<V>({3, 5, 7}, 8));
+      }
+
+      { // bit_xor{{{2
+	V x = make_vec<V>({3, 4, 5}, 8);
+	COMPARE(x ^ x, V());
+	COMPARE(x ^ ~x, ~V());
+	COMPARE(x ^ V(), x);
+	COMPARE(V() ^ x, x);
+	V y = make_vec<V>({1, 5, 3}, 8);
+	COMPARE(x ^ y, make_vec<V>({2, 1, 6}, 0));
+	x ^= y;
+	COMPARE(x, make_vec<V>({2, 1, 6}, 0));
+      }
+
+      { // bit_shift_left{{{2
+	// Note:
+	// - negative RHS or RHS >= max(#bits(T), #bits(int)) is UB
+	// - negative LHS is UB
+	// - shifting into (or over) the sign bit is UB
+	// - unsigned LHS overflow is modulo arithmetic
+	COMPARE(V() << 1, V());
+	for (int i = 0; i < nbits - 1; ++i)
+	  {
+	    COMPARE(V(1) << i, V(T(1) << i)) << "i: " << i;
+	  }
+	for_constexpr<int, 0, n_promo_bits - 1>([](auto shift_ic) {
+	  constexpr int shift = shift_ic;
+	  const V seq = make_value_unknown(V([&](T i) {
+	    if constexpr (std::is_signed_v<T>)
+	      {
+		const T max = std::numeric_limits<T>::max() >> shift;
+		return max == 0 ? 1 : (std::abs(max - i) % max) + 1;
+	      }
+	    else
+	      {
+		return ~T() - i;
+	      }
+	  }));
+	  const V ref([&](T i) { return T(seq[i] << shift); });
+	  COMPARE(seq << shift, ref) << "seq: " << seq << ", shift: " << shift;
+	  COMPARE(seq << make_value_unknown(shift), ref)
+	    << "seq: " << seq << ", shift: " << shift;
+	});
+	{
+	  V seq = make_vec<V>({0, 1}, nbits - 2);
+	  seq %= nbits - 1;
+	  COMPARE(make_vec<V>({0, 1}, 0) << seq,
+		  V([&](auto i) { return T(T(i & 1) << seq[i]); }))
+	    << "seq = " << seq;
+	  COMPARE(make_vec<V>({1, 0}, 0) << seq,
+		  V([&](auto i) { return T(T(~i & 1) << seq[i]); }));
+	  COMPARE(V(1) << seq, V([&](auto i) { return T(T(1) << seq[i]); }));
+	}
+	if (std::is_unsigned<T>::value)
+	  {
+	    constexpr int shift_count = nbits - 1;
+	    COMPARE(V(1) << shift_count, V(T(1) << shift_count));
+	    constexpr T max = // avoid overflow warning in the last COMPARE
+	      std::is_unsigned<T>::value ? std::numeric_limits<T>::max() : T(1);
+	    COMPARE(V(max) << shift_count, V(max << shift_count))
+	      << "shift_count: " << shift_count;
+	  }
+      }
+
+      { // bit_shift_right{{{2
+	// Note:
+	// - negative LHS is implementation defined
+	// - negative RHS or RHS >= #bits is UB
+	// - no other UB
+	COMPARE(V(~T()) >> V(0), V(~T()));
+	COMPARE(V(~T()) >> V(make_value_unknown(0)), V(~T()));
+	for (int s = 1; s < nbits; ++s)
+	  {
+	    COMPARE(V(~T()) >> V(s), V(T(~T()) >> s)) << "s: " << s;
+	  }
+	for (int s = 1; s < nbits; ++s)
+	  {
+	    COMPARE(V(~T(1)) >> V(s), V(T(~T(1)) >> s)) << "s: " << s;
+	  }
+	COMPARE(V(0) >> V(1), V(0));
+	COMPARE(V(1) >> V(1), V(0));
+	COMPARE(V(2) >> V(1), V(1));
+	COMPARE(V(3) >> V(1), V(1));
+	COMPARE(V(7) >> V(2), V(1));
+	for (int j = 0; j < 100; ++j)
+	  {
+	    const V seq([&](auto i) -> T { return (j + i) % n_promo_bits; });
+	    COMPARE(V(1) >> seq, V([&](auto i) { return T(T(1) >> seq[i]); }))
+	      << "seq = " << seq;
+	    COMPARE(make_value_unknown(V(1)) >> make_value_unknown(seq),
+		    V([&](auto i) { return T(T(1) >> seq[i]); }))
+	      << "seq = " << seq;
+	  }
+	for_constexpr<int, 0, n_promo_bits - 1>([](auto shift_ic) {
+	  constexpr int shift = shift_ic;
+	  const V seq = make_value_unknown(V([&](int i) {
+	    using U = std::make_unsigned_t<T>;
+	    return T(~U() >> (i % 32));
+	  }));
+	  const V ref([&](T i) { return T(seq[i] >> shift); });
+	  COMPARE(seq >> shift, ref) << "seq: " << seq << ", shift: " << shift;
+	  COMPARE(seq >> make_value_unknown(shift), ref)
+	    << "seq: " << seq << ", shift: " << shift;
+	});
+      }
+
+      //}}}2
+    }
+  else
+    {
+      VERIFY((is_substitution_failure<V, V, std::modulus<>>) );
+      VERIFY((is_substitution_failure<V, V, std::bit_and<>>) );
+      VERIFY((is_substitution_failure<V, V, std::bit_or<>>) );
+      VERIFY((is_substitution_failure<V, V, std::bit_xor<>>) );
+      VERIFY((is_substitution_failure<V, V, bit_shift_left>) );
+      VERIFY((is_substitution_failure<V, V, bit_shift_right>) );
+
+      VERIFY((is_substitution_failure<V&, V, assign_modulus>) );
+      VERIFY((is_substitution_failure<V&, V, assign_bit_and>) );
+      VERIFY((is_substitution_failure<V&, V, assign_bit_or>) );
+      VERIFY((is_substitution_failure<V&, V, assign_bit_xor>) );
+      VERIFY((is_substitution_failure<V&, V, assign_bit_shift_left>) );
+      VERIFY((is_substitution_failure<V&, V, assign_bit_shift_right>) );
+    }
+}
+// }}}1
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.h b/libstdc++-v3/testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.h
new file mode 100644
index 00000000000..8947d483a09
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.h
@@ -0,0 +1,135 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+  vir::test::setFuzzyness<float>(0);
+  vir::test::setFuzzyness<double>(0);
+
+  using limits = std::numeric_limits<typename V::value_type>;
+  test_values<V>(
+    {
+#ifdef __STDC_IEC_559__
+      limits::quiet_NaN(),
+      limits::infinity(),
+      -limits::infinity(),
+      -0.,
+      limits::denorm_min(),
+      limits::min() / 3,
+      -limits::denorm_min(),
+      -limits::min() / 3,
+#endif
+      +0.,
+      +1.3,
+      -1.3,
+      2.1,
+      -2.1,
+      0.99,
+      0.9,
+      -0.9,
+      -0.99,
+      limits::min(),
+      limits::max(),
+      -limits::min(),
+      -limits::max()},
+    {10000, -limits::max() / 2, limits::max() / 2},
+    [](const V input) {
+      for (int exp : {-10000, -100, -10, -1, 0, 1, 10, 100, 10000})
+	{
+	  const auto totest = ldexp(input, exp);
+	  using R = std::remove_const_t<decltype(totest)>;
+	  auto&& expected = [&](const auto& v) -> const R {
+	    R tmp = {};
+	    using std::ldexp;
+	    for (std::size_t i = 0; i < R::size(); ++i)
+	      {
+		tmp[i] = ldexp(v[i], exp);
+	      }
+	    return tmp;
+	  };
+	  const R expect1 = expected(input);
+	  COMPARE(isnan(totest), isnan(expect1))
+	    << "ldexp(" << input << ", " << exp << ") = " << totest
+	    << " != " << expect1;
+	  FUZZY_COMPARE(ldexp(iif(isnan(expect1), 0, input), exp),
+			expected(iif(isnan(expect1), 0, input)))
+	    << "\nclean = " << iif(isnan(expect1), 0, input);
+	}
+    },
+    [](const V input) {
+      for (int exp : {-10000, -100, -10, -1, 0, 1, 10, 100, 10000})
+	{
+	  const auto totest = scalbn(input, exp);
+	  using R = std::remove_const_t<decltype(totest)>;
+	  auto&& expected = [&](const auto& v) -> const R {
+	    R tmp = {};
+	    using std::scalbn;
+	    for (std::size_t i = 0; i < R::size(); ++i)
+	      {
+		tmp[i] = scalbn(v[i], exp);
+	      }
+	    return tmp;
+	  };
+	  const R expect1 = expected(input);
+	  COMPARE(isnan(totest), isnan(expect1))
+	    << "scalbn(" << input << ", " << exp << ") = " << totest
+	    << " != " << expect1;
+	  FUZZY_COMPARE(scalbn(iif(isnan(expect1), 0, input), exp),
+			expected(iif(isnan(expect1), 0, input)))
+	    << "\nclean = " << iif(isnan(expect1), 0, input);
+	}
+    },
+    [](const V input) {
+      for (long exp : {-10000, -100, -10, -1, 0, 1, 10, 100, 10000})
+	{
+	  const auto totest = scalbln(input, exp);
+	  using R = std::remove_const_t<decltype(totest)>;
+	  auto&& expected = [&](const auto& v) -> const R {
+	    R tmp = {};
+	    using std::scalbln;
+	    for (std::size_t i = 0; i < R::size(); ++i)
+	      {
+		tmp[i] = scalbln(v[i], exp);
+	      }
+	    return tmp;
+	  };
+	  const R expect1 = expected(input);
+	  COMPARE(isnan(totest), isnan(expect1))
+	    << "scalbln(" << input << ", " << exp << ") = " << totest
+	    << " != " << expect1;
+	  FUZZY_COMPARE(scalbln(iif(isnan(expect1), 0, input), exp),
+			expected(iif(isnan(expect1), 0, input)))
+	    << "\nclean = " << iif(isnan(expect1), 0, input);
+	}
+    },
+    [](const V input) {
+      V integral = {};
+      const V totest = modf(input, &integral);
+      auto&& expected = [&](const auto& v) -> std::pair<const V, const V> {
+	std::pair<V, V> tmp = {};
+	using std::modf;
+	for (std::size_t i = 0; i < V::size(); ++i)
+	  {
+	    typename V::value_type tmp2;
+	    tmp.first[i] = modf(v[i], &tmp2);
+	    tmp.second[i] = tmp2;
+	  }
+	return tmp;
+      };
+      const auto expect1 = expected(input);
+      COMPARE(isnan(totest), isnan(expect1.first))
+	<< "modf(" << input << ", iptr) = " << totest << " != " << expect1;
+      COMPARE(isnan(integral), isnan(expect1.second))
+	<< "modf(" << input << ", iptr) = " << totest << " != " << expect1;
+      COMPARE(isnan(totest), isnan(integral))
+	<< "modf(" << input << ", iptr) = " << totest << " != " << expect1;
+      const V clean = iif(isnan(totest), 0, input);
+      const auto expect2 = expected(clean);
+      COMPARE(modf(clean, &integral), expect2.first) << "\nclean = " << clean;
+      COMPARE(integral, expect2.second);
+    });
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.h b/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.h
new file mode 100644
index 00000000000..74945532734
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.h
@@ -0,0 +1,209 @@
+#include "bits/verify.h"
+#include "bits/make_vec.h"
+#include "bits/conversions.h"
+
+template <typename V, typename U>
+void
+load_store()
+{
+  // types, tags, and constants {{{2
+  using T = typename V::value_type;
+  auto&& gen = make_vec<V>;
+  using std::experimental::element_aligned;
+  using std::experimental::vector_aligned;
+
+  // stride_alignment: consider V::size() == 6. The only reliable alignment is
+  // 2 * sizeof(U). I.e. if the first address is aligned to 8 * sizeof(U), then
+  // the next address is 6 * sizeof(U) larger, thus only aligned to 2 *
+  // sizeof(U).
+  // => the lowest set bit of V::size() determines the stride alignment
+  constexpr size_t stride_alignment = size_t(1) << __builtin_ctz(V::size());
+  using stride_aligned_t = std::conditional_t<
+    V::size() == stride_alignment, decltype(vector_aligned),
+    std::experimental::overaligned_tag<stride_alignment * sizeof(U)>>;
+  constexpr stride_aligned_t stride_aligned = {};
+  constexpr size_t alignment = 2 * std::experimental::memory_alignment_v<V, U>;
+  constexpr auto overaligned = std::experimental::overaligned<alignment>;
+  const V indexes_from_0([](auto i) { return i; });
+  for (std::size_t i = 0; i < V::size(); ++i)
+    {
+      COMPARE(indexes_from_0[i], T(i));
+    }
+
+  // loads {{{2
+  cvt_inputs<T, U> test_values;
+
+  constexpr auto mem_size
+    = test_values.size() > 3 * V::size() ? test_values.size() : 3 * V::size();
+  alignas(std::experimental::memory_alignment_v<V, U> * 2) U mem[mem_size] = {};
+  alignas(std::experimental::memory_alignment_v<V, T> * 2) T reference[mem_size]
+    = {};
+  for (std::size_t i = 0; i < test_values.size(); ++i)
+    {
+      const U value = test_values[i];
+      mem[i] = value;
+      reference[i] = static_cast<T>(value);
+    }
+  for (std::size_t i = test_values.size(); i < mem_size; ++i)
+    {
+      mem[i] = U(i);
+      reference[i] = mem[i];
+    }
+
+  V x(&mem[V::size()], stride_aligned);
+  auto&& compare = [&](const std::size_t offset) {
+    static int n = 0;
+    const V ref(&reference[offset], element_aligned);
+    for (auto i = 0ul; i < V::size(); ++i)
+      {
+	if (is_conversion_undefined<T>(mem[i + offset]))
+	  {
+	    continue;
+	  }
+	COMPARE(x[i], reference[i + offset])
+	  << "\nbefore conversion: " << mem[i + offset]
+	  << "\n   offset = " << offset << "\n        x = " << x
+	  << "\nreference = " << ref << "\nx == ref  = " << (x == ref)
+	  << "\ncall no. " << n;
+      }
+    ++n;
+  };
+  compare(V::size());
+  x = V{mem, overaligned};
+  compare(0);
+  x = {&mem[1], element_aligned};
+  compare(1);
+
+  x.copy_from(&mem[V::size()], stride_aligned);
+  compare(V::size());
+  x.copy_from(&mem[1], element_aligned);
+  compare(1);
+  x.copy_from(mem, vector_aligned);
+  compare(0);
+
+  for (std::size_t i = 0; i < mem_size - V::size(); ++i)
+    {
+      x.copy_from(&mem[i], element_aligned);
+      compare(i);
+    }
+
+  for (std::size_t i = 0; i < test_values.size(); ++i)
+    {
+      mem[i] = U(i);
+    }
+  x = indexes_from_0;
+  using M = typename V::mask_type;
+  const M alternating_mask = make_mask<M>({0, 1});
+  where(alternating_mask, x).copy_from(&mem[V::size()], stride_aligned);
+
+  const V indexes_from_size = gen({T(V::size())}, 1);
+  COMPARE(x == indexes_from_size, alternating_mask)
+    << "x: " << x << "\nindexes_from_size: " << indexes_from_size;
+  COMPARE(x == indexes_from_0, !alternating_mask);
+  where(alternating_mask, x).copy_from(&mem[1], element_aligned);
+
+  const V indexes_from_1 = gen({1, 2, 3, 4}, 4);
+  COMPARE(x == indexes_from_1, alternating_mask);
+  COMPARE(x == indexes_from_0, !alternating_mask);
+  where(!alternating_mask, x).copy_from(mem, overaligned);
+  COMPARE(x == indexes_from_0, !alternating_mask);
+  COMPARE(x == indexes_from_1, alternating_mask);
+
+  x = where(alternating_mask, V()).copy_from(&mem[V::size()], stride_aligned);
+  COMPARE(x == indexes_from_size, alternating_mask);
+  COMPARE(x == 0, !alternating_mask);
+
+  x = where(!alternating_mask, V()).copy_from(&mem[1], element_aligned);
+  COMPARE(x == indexes_from_1, !alternating_mask);
+  COMPARE(x == 0, alternating_mask);
+
+  // stores {{{2
+  auto&& init_mem = [&mem](U init) {
+    for (auto i = mem_size; i; --i)
+      {
+	mem[i - 1] = init;
+      }
+  };
+  init_mem(-1);
+  x = indexes_from_1;
+  x.copy_to(&mem[V::size()], stride_aligned);
+  std::size_t i = 0;
+  for (; i < V::size(); ++i)
+    {
+      COMPARE(mem[i], U(-1)) << "i: " << i;
+    }
+  for (; i < 2 * V::size(); ++i)
+    {
+      COMPARE(mem[i], U(i - V::size() + 1)) << "i: " << i;
+    }
+  for (; i < 3 * V::size(); ++i)
+    {
+      COMPARE(mem[i], U(-1)) << "i: " << i;
+    }
+
+  init_mem(-1);
+  x.copy_to(&mem[1], element_aligned);
+  COMPARE(mem[0], U(-1));
+  for (i = 1; i <= V::size(); ++i)
+    {
+      COMPARE(mem[i], U(i));
+    }
+  for (; i < 3 * V::size(); ++i)
+    {
+      COMPARE(mem[i], U(-1));
+    }
+
+  init_mem(-1);
+  x.copy_to(mem, vector_aligned);
+  for (i = 0; i < V::size(); ++i)
+    {
+      COMPARE(mem[i], U(i + 1));
+    }
+  for (; i < 3 * V::size(); ++i)
+    {
+      COMPARE(mem[i], U(-1));
+    }
+
+  init_mem(-1);
+  where(alternating_mask, indexes_from_0)
+    .copy_to(&mem[V::size()], stride_aligned);
+  for (i = 0; i < V::size() + 1; ++i)
+    {
+      COMPARE(mem[i], U(-1));
+    }
+  for (; i < 2 * V::size(); i += 2)
+    {
+      COMPARE(mem[i], U(i - V::size()));
+    }
+  for (i = V::size() + 2; i < 2 * V::size(); i += 2)
+    {
+      COMPARE(mem[i], U(-1));
+    }
+  for (; i < 3 * V::size(); ++i)
+    {
+      COMPARE(mem[i], U(-1));
+    }
+}
+
+template <typename V>
+void
+test()
+{
+  load_store<V, long double>();
+  load_store<V, double>();
+  load_store<V, float>();
+  load_store<V, long long>();
+  load_store<V, unsigned long long>();
+  load_store<V, unsigned long>();
+  load_store<V, long>();
+  load_store<V, int>();
+  load_store<V, unsigned int>();
+  load_store<V, short>();
+  load_store<V, unsigned short>();
+  load_store<V, char>();
+  load_store<V, signed char>();
+  load_store<V, unsigned char>();
+  load_store<V, char32_t>();
+  load_store<V, char16_t>();
+  load_store<V, wchar_t>();
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/logarithm.h b/libstdc++-v3/testsuite/experimental/simd/tests/logarithm.h
new file mode 100644
index 00000000000..159edbf34e3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/logarithm.h
@@ -0,0 +1,54 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/mathreference.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+  vir::test::setFuzzyness<float>(1);
+  vir::test::setFuzzyness<double>(1);
+
+  using limits = std::numeric_limits<typename V::value_type>;
+  test_values<V>({1,
+		  2,
+		  4,
+		  8,
+		  16,
+		  32,
+		  64,
+		  128,
+		  256,
+		  512,
+		  1024,
+		  2048,
+		  3,
+		  5,
+		  7,
+		  15,
+		  17,
+		  31,
+		  33,
+		  63,
+		  65,
+#ifdef __STDC_IEC_559__
+		  limits::quiet_NaN(),
+		  limits::infinity(),
+		  -limits::infinity(),
+		  limits::denorm_min(),
+		  -limits::denorm_min(),
+		  limits::min() / 3,
+		  -limits::min() / 3,
+		  -0.,
+#endif
+		  +0.,
+		  limits::min(),
+		  limits::max(),
+		  -limits::min(),
+		  -limits::max()},
+		 {10000, -limits::max() / 2, limits::max() / 2},
+		 MAKE_TESTER(log), MAKE_TESTER(log10), MAKE_TESTER(log1p),
+		 MAKE_TESTER(log2), MAKE_TESTER(logb));
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/mask_broadcast.h b/libstdc++-v3/testsuite/experimental/simd/tests/mask_broadcast.h
new file mode 100644
index 00000000000..dc9af0a1ac4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/mask_broadcast.h
@@ -0,0 +1,50 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+template <typename V>
+void
+test()
+{
+  using M = typename V::mask_type;
+  static_assert(std::is_convertible<typename M::reference, bool>::value,
+		"A smart_reference<simd_mask> must be convertible to bool.");
+  static_assert(
+    std::is_same<bool, decltype(std::declval<const typename M::reference&>()
+				== true)>::value,
+    "A smart_reference<simd_mask> must be comparable against bool.");
+  static_assert(
+    vir::test::sfinae_is_callable<typename M::reference&&, bool>(
+      [](auto&& a, auto&& b) -> decltype(std::declval<decltype(a)>()
+					 == std::declval<decltype(b)>()) {
+	return {};
+      }),
+    "A smart_reference<simd_mask> must be comparable against bool.");
+  VERIFY(std::experimental::is_simd_mask_v<M>);
+
+  {
+    M x;     // uninitialized
+    x = M{}; // default broadcasts 0
+    COMPARE(x, M(false));
+    COMPARE(x, M());
+    COMPARE(x, M{});
+    x = M(); // default broadcasts 0
+    COMPARE(x, M(false));
+    COMPARE(x, M());
+    COMPARE(x, M{});
+    x = x;
+    for (std::size_t i = 0; i < M::size(); ++i)
+      {
+	COMPARE(x[i], false);
+      }
+  }
+
+  M x(true);
+  M y(false);
+  for (std::size_t i = 0; i < M::size(); ++i)
+    {
+      COMPARE(x[i], true);
+      COMPARE(y[i], false);
+    }
+  y = M(true);
+  COMPARE(x, y);
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/mask_conversions.h b/libstdc++-v3/testsuite/experimental/simd/tests/mask_conversions.h
new file mode 100644
index 00000000000..cc2cdcac7af
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/mask_conversions.h
@@ -0,0 +1,94 @@
+#include "bits/verify.h"
+
+namespace stdx = std::experimental;
+
+template <typename From, typename To>
+void
+conversions()
+{
+  using ToV = typename To::simd_type;
+
+  using stdx::simd_cast;
+  using stdx::static_simd_cast;
+  using stdx::__proposed::resizing_simd_cast;
+
+  auto x = resizing_simd_cast<To>(From());
+  COMPARE(typeid(x), typeid(To));
+  COMPARE(x, To());
+
+  x = resizing_simd_cast<To>(From(true));
+  const To ref = ToV([](auto i) { return i; }) < int(From::size());
+  COMPARE(x, ref) << "converted from: " << From(true);
+
+  const ullong all_bits = ~ullong() >> (64 - From::size());
+  for (ullong bit_pos = 1; bit_pos /*until overflow*/; bit_pos *= 2)
+    {
+      for (ullong bits : {bit_pos & all_bits, ~bit_pos & all_bits})
+	{
+	  const auto from = From::__from_bitset(bits);
+	  const auto to = resizing_simd_cast<To>(from);
+	  COMPARE(to, To::__from_bitset(bits))
+	    << "\nfrom: " << from << "\nbits: " << std::hex << bits << std::dec;
+	  for (std::size_t i = 0; i < To::size(); ++i)
+	    {
+	      COMPARE(to[i], (bits >> i) & 1)
+		<< "\nfrom: " << from << "\nto: " << to
+		<< "\nbits: " << std::hex << bits << std::dec << "\ni: " << i;
+	    }
+	}
+    }
+}
+
+template <typename T, typename V, typename = void> struct rebind_or_max_fixed
+{
+  using type = stdx::rebind_simd_t<
+    T, stdx::resize_simd_t<stdx::simd_abi::max_fixed_size<T>, V>>;
+};
+template <typename T, typename V>
+struct rebind_or_max_fixed<T, V, std::void_t<stdx::rebind_simd_t<T, V>>>
+{
+  using type = stdx::rebind_simd_t<T, V>;
+};
+
+template <typename From, typename To>
+void
+apply_abis()
+{
+  using M0 = typename rebind_or_max_fixed<To, From>::type;
+  using M1 = stdx::native_simd_mask<To>;
+  using M2 = stdx::simd_mask<To>;
+  using M3 = stdx::simd_mask<To, stdx::simd_abi::scalar>;
+
+  using std::is_same_v;
+  conversions<From, M0>();
+  if constexpr (!is_same_v<M1, M0>)
+    conversions<From, M1>();
+  if constexpr (!is_same_v<M2, M0> && !is_same_v<M2, M1>)
+    conversions<From, M2>();
+  if constexpr (!is_same_v<M3, M0> && !is_same_v<M3, M1> && !is_same_v<M3, M2>)
+    conversions<From, M3>();
+}
+
+template <typename V>
+void
+test()
+{
+  using M = typename V::mask_type;
+  apply_abis<M, ldouble>();
+  apply_abis<M, double>();
+  apply_abis<M, float>();
+  apply_abis<M, ullong>();
+  apply_abis<M, llong>();
+  apply_abis<M, ulong>();
+  apply_abis<M, long>();
+  apply_abis<M, uint>();
+  apply_abis<M, int>();
+  apply_abis<M, ushort>();
+  apply_abis<M, short>();
+  apply_abis<M, uchar>();
+  apply_abis<M, schar>();
+  apply_abis<M, char>();
+  apply_abis<M, wchar_t>();
+  apply_abis<M, char16_t>();
+  apply_abis<M, char32_t>();
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/mask_implicit_cvt.h b/libstdc++-v3/testsuite/experimental/simd/tests/mask_implicit_cvt.h
new file mode 100644
index 00000000000..56149ba343e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/mask_implicit_cvt.h
@@ -0,0 +1,84 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+template <class M, class M2>
+constexpr bool assign_should_work
+  = std::is_same<M, M2>::value
+    || (std::is_same<typename M::abi_type,
+		     std::experimental::simd_abi::fixed_size<M::size()>>::value
+	&& std::is_same<typename M::abi_type, typename M2::abi_type>::value);
+template <class M, class M2>
+constexpr bool assign_should_not_work = !assign_should_work<M, M2>;
+
+template <class L, class R>
+std::enable_if_t<assign_should_work<L, R>>
+implicit_conversions_test()
+{
+  L x = R(true);
+  COMPARE(x, L(true));
+  x = R(false);
+  COMPARE(x, L(false));
+  R y(false);
+  y[0] = true;
+  x = y;
+  L ref(false);
+  ref[0] = true;
+  COMPARE(x, ref);
+}
+
+template <class L, class R>
+std::enable_if_t<assign_should_not_work<L, R>>
+implicit_conversions_test()
+{
+  VERIFY((is_substitution_failure<L&, R, assignment>) );
+}
+
+template <typename V>
+void
+test()
+{
+  using M = typename V::mask_type;
+  using std::experimental::fixed_size_simd_mask;
+  using std::experimental::native_simd_mask;
+  using std::experimental::simd_mask;
+
+  implicit_conversions_test<M, simd_mask<ldouble>>();
+  implicit_conversions_test<M, simd_mask<double>>();
+  implicit_conversions_test<M, simd_mask<float>>();
+  implicit_conversions_test<M, simd_mask<ullong>>();
+  implicit_conversions_test<M, simd_mask<llong>>();
+  implicit_conversions_test<M, simd_mask<ulong>>();
+  implicit_conversions_test<M, simd_mask<long>>();
+  implicit_conversions_test<M, simd_mask<uint>>();
+  implicit_conversions_test<M, simd_mask<int>>();
+  implicit_conversions_test<M, simd_mask<ushort>>();
+  implicit_conversions_test<M, simd_mask<short>>();
+  implicit_conversions_test<M, simd_mask<uchar>>();
+  implicit_conversions_test<M, simd_mask<schar>>();
+  implicit_conversions_test<M, native_simd_mask<ldouble>>();
+  implicit_conversions_test<M, native_simd_mask<double>>();
+  implicit_conversions_test<M, native_simd_mask<float>>();
+  implicit_conversions_test<M, native_simd_mask<ullong>>();
+  implicit_conversions_test<M, native_simd_mask<llong>>();
+  implicit_conversions_test<M, native_simd_mask<ulong>>();
+  implicit_conversions_test<M, native_simd_mask<long>>();
+  implicit_conversions_test<M, native_simd_mask<uint>>();
+  implicit_conversions_test<M, native_simd_mask<int>>();
+  implicit_conversions_test<M, native_simd_mask<ushort>>();
+  implicit_conversions_test<M, native_simd_mask<short>>();
+  implicit_conversions_test<M, native_simd_mask<uchar>>();
+  implicit_conversions_test<M, native_simd_mask<schar>>();
+  implicit_conversions_test<M, fixed_size_simd_mask<ldouble, M::size()>>();
+  implicit_conversions_test<M, fixed_size_simd_mask<double, M::size()>>();
+  implicit_conversions_test<M, fixed_size_simd_mask<float, M::size()>>();
+  implicit_conversions_test<M, fixed_size_simd_mask<ullong, M::size()>>();
+  implicit_conversions_test<M, fixed_size_simd_mask<llong, M::size()>>();
+  implicit_conversions_test<M, fixed_size_simd_mask<ulong, M::size()>>();
+  implicit_conversions_test<M, fixed_size_simd_mask<long, M::size()>>();
+  implicit_conversions_test<M, fixed_size_simd_mask<uint, M::size()>>();
+  implicit_conversions_test<M, fixed_size_simd_mask<int, M::size()>>();
+  implicit_conversions_test<M, fixed_size_simd_mask<ushort, M::size()>>();
+  implicit_conversions_test<M, fixed_size_simd_mask<short, M::size()>>();
+  implicit_conversions_test<M, fixed_size_simd_mask<uchar, M::size()>>();
+  implicit_conversions_test<M, fixed_size_simd_mask<schar, M::size()>>();
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/mask_loadstore.h b/libstdc++-v3/testsuite/experimental/simd/tests/mask_loadstore.h
new file mode 100644
index 00000000000..933d30ed7a6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/mask_loadstore.h
@@ -0,0 +1,144 @@
+#include "bits/verify.h"
+
+// simd_mask generator functions {{{1
+template <class M>
+M
+make_mask(const std::initializer_list<bool>& init)
+{
+  std::size_t i = 0;
+  M r = {};
+  for (;;)
+    {
+      for (bool x : init)
+	{
+	  r[i] = x;
+	  if (++i == M::size())
+	    {
+	      return r;
+	    }
+	}
+    }
+}
+
+template <class M>
+M
+make_alternating_mask()
+{
+  return make_mask<M>({false, true});
+}
+
+template <typename V>
+void
+test()
+{
+  using M = typename V::mask_type;
+  // loads {{{2
+  constexpr size_t alignment = 2 * std::experimental::memory_alignment_v<M>;
+  alignas(alignment) bool mem[3 * M::size()];
+  std::memset(mem, 0, sizeof(mem));
+  for (std::size_t i = 1; i < sizeof(mem) / sizeof(*mem); i += 2)
+    {
+      COMPARE(mem[i - 1], false);
+      mem[i] = true;
+    }
+  using std::experimental::element_aligned;
+  using std::experimental::vector_aligned;
+  constexpr size_t stride_alignment
+    = M::size() & 1
+	? 1
+	: M::size() & 2
+	    ? 2
+	    : M::size() & 4
+		? 4
+		: M::size() & 8
+		    ? 8
+		    : M::size() & 16
+			? 16
+			: M::size() & 32
+			    ? 32
+			    : M::size() & 64
+				? 64
+				: M::size() & 128 ? 128
+						  : M::size() & 256 ? 256 : 512;
+  using stride_aligned_t = std::conditional_t<
+    M::size() == stride_alignment, decltype(vector_aligned),
+    std::experimental::overaligned_tag<stride_alignment * sizeof(bool)>>;
+  constexpr stride_aligned_t stride_aligned = {};
+  constexpr auto overaligned = std::experimental::overaligned<alignment>;
+
+  const M alternating_mask = make_alternating_mask<M>();
+
+  M x(&mem[M::size()], stride_aligned);
+  COMPARE(x, M::size() % 2 == 1 ? !alternating_mask : alternating_mask)
+    << x.__to_bitset()
+    << ", alternating_mask: " << alternating_mask.__to_bitset();
+  x = {&mem[1], element_aligned};
+  COMPARE(x, !alternating_mask);
+  x = M{mem, overaligned};
+  COMPARE(x, alternating_mask);
+
+  x.copy_from(&mem[M::size()], stride_aligned);
+  COMPARE(x, M::size() % 2 == 1 ? !alternating_mask : alternating_mask);
+  x.copy_from(&mem[1], element_aligned);
+  COMPARE(x, !alternating_mask);
+  x.copy_from(mem, vector_aligned);
+  COMPARE(x, alternating_mask);
+
+  x = !alternating_mask;
+  where(alternating_mask, x).copy_from(&mem[M::size()], stride_aligned);
+  COMPARE(x, M::size() % 2 == 1 ? !alternating_mask : M{true});
+  x = M(true);                                                    // 1111
+  where(alternating_mask, x).copy_from(&mem[1], element_aligned); // load .0.0
+  COMPARE(x, !alternating_mask);                                  // 1010
+  where(alternating_mask, x).copy_from(mem, overaligned);         // load .1.1
+  COMPARE(x, M{true});                                            // 1111
+
+  // stores {{{2
+  std::memset(mem, 0, sizeof(mem));
+  x = M(true);
+  x.copy_to(&mem[M::size()], stride_aligned);
+  std::size_t i = 0;
+  for (; i < M::size(); ++i)
+    {
+      COMPARE(mem[i], false);
+    }
+  for (; i < 2 * M::size(); ++i)
+    {
+      COMPARE(mem[i], true) << "i: " << i << ", x: " << x;
+    }
+  for (; i < 3 * M::size(); ++i)
+    {
+      COMPARE(mem[i], false);
+    }
+  std::memset(mem, 0, sizeof(mem));
+  x.copy_to(&mem[1], element_aligned);
+  COMPARE(mem[0], false);
+  for (i = 1; i <= M::size(); ++i)
+    {
+      COMPARE(mem[i], true);
+    }
+  for (; i < 3 * M::size(); ++i)
+    {
+      COMPARE(mem[i], false);
+    }
+  std::memset(mem, 0, sizeof(mem));
+  alternating_mask.copy_to(mem, overaligned);
+  for (i = 0; i < M::size(); ++i)
+    {
+      COMPARE(mem[i], (i & 1) == 1);
+    }
+  for (; i < 3 * M::size(); ++i)
+    {
+      COMPARE(mem[i], false);
+    }
+  x.copy_to(mem, vector_aligned);
+  where(alternating_mask, !x).copy_to(mem, overaligned);
+  for (i = 0; i < M::size(); ++i)
+    {
+      COMPARE(mem[i], i % 2 == 0);
+    }
+  for (; i < 3 * M::size(); ++i)
+    {
+      COMPARE(mem[i], false);
+    }
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/mask_operator_cvt.h b/libstdc++-v3/testsuite/experimental/simd/tests/mask_operator_cvt.h
new file mode 100644
index 00000000000..ebfce1ed569
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/mask_operator_cvt.h
@@ -0,0 +1,94 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+using schar = signed char;
+using uchar = unsigned char;
+using ushort = unsigned short;
+using uint = unsigned int;
+using ulong = unsigned long;
+using llong = long long;
+using ullong = unsigned long long;
+using ldouble = long double;
+using wchar = wchar_t;
+using char16 = char16_t;
+using char32 = char32_t;
+
+template <typename M0, typename M1>
+constexpr bool
+bit_and_is_illformed()
+{
+  return is_substitution_failure<M0, M1, std::bit_and<>>;
+}
+
+template <typename M0, typename M1>
+void
+test_binary_op_cvt()
+{
+  COMPARE((bit_and_is_illformed<M0, M1>()), !(std::is_same_v<M0, M1>) );
+}
+
+template <typename V>
+void
+test()
+{
+  using M = typename V::mask_type;
+  // binary ops without conversions work
+  COMPARE(typeid(M() & M()), typeid(M));
+
+  // nothing else works: no implicit conversion, or the call is ambiguous
+  using std::experimental::fixed_size_simd_mask;
+  using std::experimental::native_simd_mask;
+  using std::experimental::simd_mask;
+  test_binary_op_cvt<M, bool>();
+
+  test_binary_op_cvt<M, simd_mask<ldouble>>();
+  test_binary_op_cvt<M, simd_mask<double>>();
+  test_binary_op_cvt<M, simd_mask<float>>();
+  test_binary_op_cvt<M, simd_mask<ullong>>();
+  test_binary_op_cvt<M, simd_mask<llong>>();
+  test_binary_op_cvt<M, simd_mask<ulong>>();
+  test_binary_op_cvt<M, simd_mask<long>>();
+  test_binary_op_cvt<M, simd_mask<uint>>();
+  test_binary_op_cvt<M, simd_mask<int>>();
+  test_binary_op_cvt<M, simd_mask<ushort>>();
+  test_binary_op_cvt<M, simd_mask<short>>();
+  test_binary_op_cvt<M, simd_mask<uchar>>();
+  test_binary_op_cvt<M, simd_mask<schar>>();
+  test_binary_op_cvt<M, simd_mask<wchar>>();
+  test_binary_op_cvt<M, simd_mask<char16>>();
+  test_binary_op_cvt<M, simd_mask<char32>>();
+
+  test_binary_op_cvt<M, native_simd_mask<ldouble>>();
+  test_binary_op_cvt<M, native_simd_mask<double>>();
+  test_binary_op_cvt<M, native_simd_mask<float>>();
+  test_binary_op_cvt<M, native_simd_mask<ullong>>();
+  test_binary_op_cvt<M, native_simd_mask<llong>>();
+  test_binary_op_cvt<M, native_simd_mask<ulong>>();
+  test_binary_op_cvt<M, native_simd_mask<long>>();
+  test_binary_op_cvt<M, native_simd_mask<uint>>();
+  test_binary_op_cvt<M, native_simd_mask<int>>();
+  test_binary_op_cvt<M, native_simd_mask<ushort>>();
+  test_binary_op_cvt<M, native_simd_mask<short>>();
+  test_binary_op_cvt<M, native_simd_mask<uchar>>();
+  test_binary_op_cvt<M, native_simd_mask<schar>>();
+  test_binary_op_cvt<M, native_simd_mask<wchar>>();
+  test_binary_op_cvt<M, native_simd_mask<char16>>();
+  test_binary_op_cvt<M, native_simd_mask<char32>>();
+
+  test_binary_op_cvt<M, fixed_size_simd_mask<ldouble, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<double, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<float, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<ullong, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<llong, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<ulong, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<long, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<uint, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<int, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<ushort, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<short, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<uchar, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<schar, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<wchar, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<char16, 2>>();
+  test_binary_op_cvt<M, fixed_size_simd_mask<char32, 2>>();
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/mask_operators.h b/libstdc++-v3/testsuite/experimental/simd/tests/mask_operators.h
new file mode 100644
index 00000000000..5156bc09021
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/mask_operators.h
@@ -0,0 +1,40 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+template <typename V>
+void
+test()
+{
+  using M = typename V::mask_type;
+  { // compares{{{2
+    M x(true), y(false);
+    VERIFY(all_of(x == x));
+    VERIFY(all_of(x != y));
+    VERIFY(all_of(y != x));
+    VERIFY(!all_of(x != x));
+    VERIFY(!all_of(x == y));
+    VERIFY(!all_of(y == x));
+  }
+  { // subscripting{{{2
+    M x(true);
+    for (std::size_t i = 0; i < M::size(); ++i)
+      {
+	COMPARE(x[i], true) << "\nx: " << x << ", i: " << i;
+	x[i] = !x[i];
+      }
+    COMPARE(x, M{false});
+    for (std::size_t i = 0; i < M::size(); ++i)
+      {
+	COMPARE(x[i], false) << "\nx: " << x << ", i: " << i;
+	x[i] = !x[i];
+      }
+    COMPARE(x, M{true});
+  }
+  { // negation{{{2
+    M x(false);
+    M y = !x;
+    COMPARE(y, M{true});
+    COMPARE(!y, x);
+  }
+}
+
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/mask_reductions.h b/libstdc++-v3/testsuite/experimental/simd/tests/mask_reductions.h
new file mode 100644
index 00000000000..33011a2bf4c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/mask_reductions.h
@@ -0,0 +1,211 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+// simd_mask generator functions {{{1
+template <class M>
+M
+make_mask(const std::initializer_list<bool>& init)
+{
+  std::size_t i = 0;
+  M r = {};
+  for (;;)
+    {
+      for (bool x : init)
+	{
+	  r[i] = x;
+	  if (++i == M::size())
+	    {
+	      return r;
+	    }
+	}
+    }
+}
+
+template <class M>
+M
+make_alternating_mask()
+{
+  return make_mask<M>({false, true});
+}
+
+template <typename V>
+void
+test()
+{
+  using M = typename V::mask_type;
+  const M alternating_mask = make_alternating_mask<M>();
+  COMPARE(alternating_mask[0], false); // the checks below rely on this
+  auto&& gen = make_mask<M>;
+
+  // all_of
+  VERIFY(all_of(M{true}));
+  VERIFY(!all_of(alternating_mask));
+  VERIFY(!all_of(M{false}));
+  using std::experimental::all_of;
+  VERIFY(all_of(true));
+  VERIFY(!all_of(false));
+  VERIFY(sfinae_is_callable<bool>(
+    [](auto x) -> decltype(std::experimental::all_of(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<int>(
+    [](auto x) -> decltype(std::experimental::all_of(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<float>(
+    [](auto x) -> decltype(std::experimental::all_of(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<char>(
+    [](auto x) -> decltype(std::experimental::all_of(x)) { return {}; }));
+
+  // any_of
+  VERIFY(any_of(M{true}));
+  COMPARE(any_of(alternating_mask), M::size() > 1);
+  VERIFY(!any_of(M{false}));
+  using std::experimental::any_of;
+  VERIFY(any_of(true));
+  VERIFY(!any_of(false));
+  VERIFY(sfinae_is_callable<bool>(
+    [](auto x) -> decltype(std::experimental::any_of(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<int>(
+    [](auto x) -> decltype(std::experimental::any_of(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<float>(
+    [](auto x) -> decltype(std::experimental::any_of(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<char>(
+    [](auto x) -> decltype(std::experimental::any_of(x)) { return {}; }));
+
+  // none_of
+  VERIFY(!none_of(M{true}));
+  COMPARE(none_of(alternating_mask), M::size() == 1);
+  VERIFY(none_of(M{false}));
+  using std::experimental::none_of;
+  VERIFY(!none_of(true));
+  VERIFY(none_of(false));
+  VERIFY(sfinae_is_callable<bool>(
+    [](auto x) -> decltype(std::experimental::none_of(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<int>(
+    [](auto x) -> decltype(std::experimental::none_of(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<float>(
+    [](auto x) -> decltype(std::experimental::none_of(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<char>(
+    [](auto x) -> decltype(std::experimental::none_of(x)) { return {}; }));
+
+  // some_of
+  VERIFY(!some_of(M{true}));
+  VERIFY(!some_of(M{false}));
+  if (M::size() > 1)
+    {
+      VERIFY(some_of(gen({true, false})));
+      VERIFY(some_of(gen({false, true})));
+      if (M::size() > 3)
+	{
+	  VERIFY(some_of(gen({0, 0, 0, 1})));
+	}
+    }
+  using std::experimental::some_of;
+  VERIFY(!some_of(true));
+  VERIFY(!some_of(false));
+  VERIFY(sfinae_is_callable<bool>(
+    [](auto x) -> decltype(std::experimental::some_of(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<int>(
+    [](auto x) -> decltype(std::experimental::some_of(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<float>(
+    [](auto x) -> decltype(std::experimental::some_of(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<char>(
+    [](auto x) -> decltype(std::experimental::some_of(x)) { return {}; }));
+
+  // popcount
+  COMPARE(popcount(M{true}), int(M::size()));
+  COMPARE(popcount(alternating_mask), int(M::size()) / 2);
+  COMPARE(popcount(M{false}), 0);
+  COMPARE(popcount(gen({0, 0, 1})), int(M::size()) / 3);
+  COMPARE(popcount(gen({0, 0, 0, 1})), int(M::size()) / 4);
+  COMPARE(popcount(gen({0, 0, 0, 0, 1})), int(M::size()) / 5);
+  COMPARE(std::experimental::popcount(true), 1);
+  COMPARE(std::experimental::popcount(false), 0);
+  VERIFY(sfinae_is_callable<bool>(
+    [](auto x) -> decltype(std::experimental::popcount(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<int>(
+    [](auto x) -> decltype(std::experimental::popcount(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<float>(
+    [](auto x) -> decltype(std::experimental::popcount(x)) { return {}; }));
+  VERIFY(!sfinae_is_callable<char>(
+    [](auto x) -> decltype(std::experimental::popcount(x)) { return {}; }));
+
+  // find_first_set
+  {
+    M x(false);
+    for (int i = int(M::size() / 2 - 1); i >= 0; --i)
+      {
+	x[i] = true;
+	COMPARE(find_first_set(x), i) << x;
+      }
+    x = M(false);
+    for (int i = int(M::size() - 1); i >= 0; --i)
+      {
+	x[i] = true;
+	COMPARE(find_first_set(x), i) << x;
+      }
+  }
+  COMPARE(find_first_set(M{true}), 0);
+  if (M::size() > 1)
+    {
+      COMPARE(find_first_set(gen({0, 1})), 1);
+    }
+  if (M::size() > 2)
+    {
+      COMPARE(find_first_set(gen({0, 0, 1})), 2);
+    }
+  COMPARE(std::experimental::find_first_set(true), 0);
+  VERIFY(sfinae_is_callable<bool>(
+    [](auto x) -> decltype(std::experimental::find_first_set(x)) {
+      return {};
+    }));
+  VERIFY(!sfinae_is_callable<int>(
+    [](auto x) -> decltype(std::experimental::find_first_set(x)) {
+      return {};
+    }));
+  VERIFY(!sfinae_is_callable<float>(
+    [](auto x) -> decltype(std::experimental::find_first_set(x)) {
+      return {};
+    }));
+  VERIFY(!sfinae_is_callable<char>(
+    [](auto x) -> decltype(std::experimental::find_first_set(x)) {
+      return {};
+    }));
+
+  // find_last_set
+  {
+    M x(false);
+    for (int i = 0; i < int(M::size()); ++i)
+      {
+	x[i] = true;
+	COMPARE(find_last_set(x), i) << x;
+      }
+  }
+  COMPARE(find_last_set(M{true}), int(M::size()) - 1);
+  if (M::size() > 1)
+    {
+      COMPARE(find_last_set(gen({1, 0})),
+	      int(M::size()) - 2 + int(M::size() & 1));
+    }
+  if (M::size() > 3 && (M::size() & 3) == 0)
+    {
+      COMPARE(find_last_set(gen({1, 0, 0, 0})),
+	      int(M::size()) - 4); // M::size() is a multiple of 4 here
+    }
+  COMPARE(std::experimental::find_last_set(true), 0);
+  VERIFY(sfinae_is_callable<bool>(
+    [](auto x) -> decltype(std::experimental::find_last_set(x)) {
+      return {};
+    }));
+  VERIFY(!sfinae_is_callable<int>(
+    [](auto x) -> decltype(std::experimental::find_last_set(x)) {
+      return {};
+    }));
+  VERIFY(!sfinae_is_callable<float>(
+    [](auto x) -> decltype(std::experimental::find_last_set(x)) {
+      return {};
+    }));
+  VERIFY(!sfinae_is_callable<char>(
+    [](auto x) -> decltype(std::experimental::find_last_set(x)) {
+      return {};
+    }));
+}
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/math_1arg.h b/libstdc++-v3/testsuite/experimental/simd/tests/math_1arg.h
new file mode 100644
index 00000000000..b0eec615a49
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/math_1arg.h
@@ -0,0 +1,57 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+  vir::test::setFuzzyness<float>(0);
+  vir::test::setFuzzyness<double>(0);
+
+  using limits = std::numeric_limits<typename V::value_type>;
+  test_values<V>({+0.,
+		  0.5,
+		  -0.5,
+		  1.5,
+		  -1.5,
+		  2.5,
+		  -2.5,
+		  0x1.fffffffffffffp52,
+		  -0x1.fffffffffffffp52,
+		  0x1.ffffffffffffep52,
+		  -0x1.ffffffffffffep52,
+		  0x1.ffffffffffffdp52,
+		  -0x1.ffffffffffffdp52,
+		  0x1.fffffep21,
+		  -0x1.fffffep21,
+		  0x1.fffffcp21,
+		  -0x1.fffffcp21,
+		  0x1.fffffep22,
+		  -0x1.fffffep22,
+		  0x1.fffffcp22,
+		  -0x1.fffffcp22,
+		  0x1.fffffep23,
+		  -0x1.fffffep23,
+		  0x1.fffffcp23,
+		  -0x1.fffffcp23,
+		  0x1.8p23,
+		  -0x1.8p23,
+#ifdef __STDC_IEC_559__
+		  limits::infinity(),
+		  -limits::infinity(),
+		  -0.,
+		  limits::quiet_NaN(),
+		  limits::denorm_min(),
+		  limits::min() / 3,
+#endif
+		  limits::min(),
+		  limits::max()},
+		 {10000, -limits::max() / 2, limits::max() / 2},
+		 MAKE_TESTER(sqrt), MAKE_TESTER(erf), MAKE_TESTER(erfc),
+		 MAKE_TESTER(tgamma), MAKE_TESTER(lgamma), MAKE_TESTER(ceil),
+		 MAKE_TESTER(floor), MAKE_TESTER(trunc), MAKE_TESTER(round),
+		 MAKE_TESTER(lround), MAKE_TESTER(llround),
+		 MAKE_TESTER(nearbyint), MAKE_TESTER(rint), MAKE_TESTER(lrint),
+		 MAKE_TESTER(llrint), MAKE_TESTER(ilogb));
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/math_2arg.h b/libstdc++-v3/testsuite/experimental/simd/tests/math_2arg.h
new file mode 100644
index 00000000000..ae7cf257ec9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/math_2arg.h
@@ -0,0 +1,53 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+  using T = typename V::value_type;
+  using limits = std::numeric_limits<T>;
+
+  vir::test::setFuzzyness<float>(1);
+  vir::test::setFuzzyness<double>(1);
+  vir::test::setFuzzyness<long double>(1);
+  test_values_2arg<V>(
+    {
+#ifdef __STDC_IEC_559__
+      limits::quiet_NaN(), limits::infinity(), -limits::infinity(), -0.,
+      limits::denorm_min(), limits::min() / 3,
+#endif
+      +0., limits::min(), limits::max()},
+    {100000, -limits::max() / 2, limits::max() / 2}, MAKE_TESTER(hypot));
+  COMPARE(hypot(V(limits::max()), V(limits::max())), V(limits::infinity()));
+  COMPARE(hypot(V(limits::min()), V(limits::min())),
+	  V(limits::min() * std::sqrt(T(2))));
+  VERIFY((sfinae_is_callable<V, V>(
+    [](auto a, auto b) -> decltype(hypot(a, b)) { return {}; })));
+  VERIFY((sfinae_is_callable<typename V::value_type, V>(
+    [](auto a, auto b) -> decltype(hypot(a, b)) { return {}; })));
+  VERIFY((sfinae_is_callable<V, typename V::value_type>(
+    [](auto a, auto b) -> decltype(hypot(a, b)) { return {}; })));
+
+  vir::test::setFuzzyness<float>(0);
+  vir::test::setFuzzyness<double>(0);
+  vir::test::setFuzzyness<long double>(0);
+  test_values_2arg<V>(
+    {
+#ifdef __STDC_IEC_559__
+      limits::quiet_NaN(), limits::infinity(), -limits::infinity(),
+      limits::denorm_min(), limits::min() / 3, -0.,
+#endif
+      +0., limits::min(), limits::max()},
+    {10000, -limits::max() / 2, limits::max() / 2}, MAKE_TESTER(pow),
+    MAKE_TESTER(fmod), MAKE_TESTER(remainder), MAKE_TESTER_NOFPEXCEPT(copysign),
+    MAKE_TESTER(nextafter), // MAKE_TESTER(nexttoward),
+    MAKE_TESTER(fdim), MAKE_TESTER(fmax), MAKE_TESTER(fmin),
+    MAKE_TESTER_NOFPEXCEPT(isgreater), MAKE_TESTER_NOFPEXCEPT(isgreaterequal),
+    MAKE_TESTER_NOFPEXCEPT(isless), MAKE_TESTER_NOFPEXCEPT(islessequal),
+    MAKE_TESTER_NOFPEXCEPT(islessgreater), MAKE_TESTER_NOFPEXCEPT(isunordered));
+}
+
+// vim: ts=8 et sw=2 sts=2
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/operator_cvt.h b/libstdc++-v3/testsuite/experimental/simd/tests/operator_cvt.h
new file mode 100644
index 00000000000..82bb1ee5981
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/operator_cvt.h
@@ -0,0 +1,1064 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+// type with sizeof(char) but different signedness
+using xchar = std::conditional_t<std::is_unsigned_v<char>, schar, uchar>;
+
+// vT {{{
+using vschar = std::experimental::native_simd<schar>;
+using vuchar = std::experimental::native_simd<uchar>;
+using vshort = std::experimental::native_simd<short>;
+using vushort = std::experimental::native_simd<ushort>;
+using vint = std::experimental::native_simd<int>;
+using vuint = std::experimental::native_simd<uint>;
+using vlong = std::experimental::native_simd<long>;
+using vulong = std::experimental::native_simd<ulong>;
+using vllong = std::experimental::native_simd<llong>;
+using vullong = std::experimental::native_simd<ullong>;
+using vfloat = std::experimental::native_simd<float>;
+using vdouble = std::experimental::native_simd<double>;
+using vldouble = std::experimental::native_simd<long double>;
+using vchar = std::experimental::native_simd<char>;
+using vxchar = std::experimental::native_simd<xchar>;
+// }}}
+// viN/vfN {{{
+template <typename T>
+using vi8 = std::experimental::fixed_size_simd<T, vschar::size()>;
+template <typename T>
+using vi16 = std::experimental::fixed_size_simd<T, vshort::size()>;
+template <typename T>
+using vf32 = std::experimental::fixed_size_simd<T, vfloat::size()>;
+template <typename T>
+using vi32 = std::experimental::fixed_size_simd<T, vint::size()>;
+template <typename T>
+using vf64 = std::experimental::fixed_size_simd<T, vdouble::size()>;
+template <typename T>
+using vi64 = std::experimental::fixed_size_simd<T, vllong::size()>;
+template <typename T>
+using vl = typename std::conditional<sizeof(long) == sizeof(llong), vi64<T>,
+				     vi32<T>>::type;
+// }}}
+
+template <class A, class B, class Expected = A>
+void
+binary_op_return_type()
+{
+  using namespace vir::test;
+  static_assert(std::is_same<A, Expected>::value, "");
+  using AC = std::add_const_t<A>;
+  using BC = std::add_const_t<B>;
+  COMPARE(typeid(A() + B()), typeid(Expected));
+  COMPARE(typeid(B() + A()), typeid(Expected));
+  COMPARE(typeid(AC() + BC()), typeid(Expected));
+  COMPARE(typeid(BC() + AC()), typeid(Expected));
+}
+
+template <typename V>
+void
+test()
+{
+  using T = typename V::value_type;
+  namespace simd_abi = std::experimental::simd_abi;
+  binary_op_return_type<V, V, V>();
+  binary_op_return_type<V, T, V>();
+  binary_op_return_type<V, int, V>();
+
+  if constexpr (std::is_same_v<V, vfloat>)
+    { //{{{2
+      binary_op_return_type<vfloat, schar>();
+      binary_op_return_type<vfloat, uchar>();
+      binary_op_return_type<vfloat, short>();
+      binary_op_return_type<vfloat, ushort>();
+
+      binary_op_return_type<vf32<float>, schar>();
+      binary_op_return_type<vf32<float>, uchar>();
+      binary_op_return_type<vf32<float>, short>();
+      binary_op_return_type<vf32<float>, ushort>();
+      binary_op_return_type<vf32<float>, int>();
+      binary_op_return_type<vf32<float>, float>();
+
+      binary_op_return_type<vf32<float>, vf32<schar>>();
+      binary_op_return_type<vf32<float>, vf32<uchar>>();
+      binary_op_return_type<vf32<float>, vf32<short>>();
+      binary_op_return_type<vf32<float>, vf32<ushort>>();
+      binary_op_return_type<vf32<float>, vf32<float>>();
+
+      VERIFY((is_substitution_failure<vfloat, uint>) );
+      VERIFY((is_substitution_failure<vfloat, long>) );
+      VERIFY((is_substitution_failure<vfloat, ulong>) );
+      VERIFY((is_substitution_failure<vfloat, llong>) );
+      VERIFY((is_substitution_failure<vfloat, ullong>) );
+      VERIFY((is_substitution_failure<vfloat, double>) );
+      VERIFY((is_substitution_failure<vfloat, vf32<schar>>) );
+      VERIFY((is_substitution_failure<vfloat, vf32<uchar>>) );
+      VERIFY((is_substitution_failure<vfloat, vf32<short>>) );
+      VERIFY((is_substitution_failure<vfloat, vf32<ushort>>) );
+      VERIFY((is_substitution_failure<vfloat, vf32<int>>) );
+      VERIFY((is_substitution_failure<vfloat, vf32<uint>>) );
+      VERIFY((is_substitution_failure<vfloat, vf32<long>>) );
+      VERIFY((is_substitution_failure<vfloat, vf32<ulong>>) );
+      VERIFY((is_substitution_failure<vfloat, vf32<llong>>) );
+      VERIFY((is_substitution_failure<vfloat, vf32<ullong>>) );
+      VERIFY((is_substitution_failure<vfloat, vf32<float>>) );
+
+      VERIFY((is_substitution_failure<vf32<float>, vfloat>) );
+      VERIFY((is_substitution_failure<vf32<float>, uint>) );
+      VERIFY((is_substitution_failure<vf32<float>, long>) );
+      VERIFY((is_substitution_failure<vf32<float>, ulong>) );
+      VERIFY((is_substitution_failure<vf32<float>, llong>) );
+      VERIFY((is_substitution_failure<vf32<float>, ullong>) );
+      VERIFY((is_substitution_failure<vf32<float>, double>) );
+      VERIFY((is_substitution_failure<vf32<float>, vf32<int>>) );
+      VERIFY((is_substitution_failure<vf32<float>, vf32<uint>>) );
+      VERIFY((is_substitution_failure<vf32<float>, vf32<long>>) );
+      VERIFY((is_substitution_failure<vf32<float>, vf32<ulong>>) );
+      VERIFY((is_substitution_failure<vf32<float>, vf32<llong>>) );
+      VERIFY((is_substitution_failure<vf32<float>, vf32<ullong>>) );
+
+      VERIFY((is_substitution_failure<vfloat, vf32<double>>) );
+    }
+  else if constexpr (std::is_same_v<V, vdouble>)
+    { //{{{2
+      binary_op_return_type<vdouble, float, vdouble>();
+      binary_op_return_type<vdouble, schar>();
+      binary_op_return_type<vdouble, uchar>();
+      binary_op_return_type<vdouble, short>();
+      binary_op_return_type<vdouble, ushort>();
+      binary_op_return_type<vdouble, uint>();
+
+      binary_op_return_type<vf64<double>, schar>();
+      binary_op_return_type<vf64<double>, uchar>();
+      binary_op_return_type<vf64<double>, short>();
+      binary_op_return_type<vf64<double>, ushort>();
+      binary_op_return_type<vf64<double>, uint>();
+      binary_op_return_type<vf64<double>, int, vf64<double>>();
+      binary_op_return_type<vf64<double>, float, vf64<double>>();
+      binary_op_return_type<vf64<double>, double, vf64<double>>();
+      binary_op_return_type<vf64<double>, vf64<double>, vf64<double>>();
+      binary_op_return_type<vf32<double>, schar>();
+      binary_op_return_type<vf32<double>, uchar>();
+      binary_op_return_type<vf32<double>, short>();
+      binary_op_return_type<vf32<double>, ushort>();
+      binary_op_return_type<vf32<double>, uint>();
+      binary_op_return_type<vf32<double>, int, vf32<double>>();
+      binary_op_return_type<vf32<double>, float, vf32<double>>();
+      binary_op_return_type<vf32<double>, double, vf32<double>>();
+      binary_op_return_type<vf64<double>, vf64<schar>>();
+      binary_op_return_type<vf64<double>, vf64<uchar>>();
+      binary_op_return_type<vf64<double>, vf64<short>>();
+      binary_op_return_type<vf64<double>, vf64<ushort>>();
+      binary_op_return_type<vf64<double>, vf64<int>>();
+      binary_op_return_type<vf64<double>, vf64<uint>>();
+      binary_op_return_type<vf64<double>, vf64<float>>();
+
+      VERIFY((is_substitution_failure<vdouble, llong>) );
+      VERIFY((is_substitution_failure<vdouble, ullong>) );
+      VERIFY((is_substitution_failure<vdouble, vf64<schar>>) );
+      VERIFY((is_substitution_failure<vdouble, vf64<uchar>>) );
+      VERIFY((is_substitution_failure<vdouble, vf64<short>>) );
+      VERIFY((is_substitution_failure<vdouble, vf64<ushort>>) );
+      VERIFY((is_substitution_failure<vdouble, vf64<int>>) );
+      VERIFY((is_substitution_failure<vdouble, vf64<uint>>) );
+      VERIFY((is_substitution_failure<vdouble, vf64<long>>) );
+      VERIFY((is_substitution_failure<vdouble, vf64<ulong>>) );
+      VERIFY((is_substitution_failure<vdouble, vf64<llong>>) );
+      VERIFY((is_substitution_failure<vdouble, vf64<ullong>>) );
+      VERIFY((is_substitution_failure<vdouble, vf64<float>>) );
+      VERIFY((is_substitution_failure<vdouble, vf64<double>>) );
+
+      VERIFY((is_substitution_failure<vf64<double>, vdouble>) );
+      VERIFY((is_substitution_failure<vf64<double>, llong>) );
+      VERIFY((is_substitution_failure<vf64<double>, ullong>) );
+      VERIFY((is_substitution_failure<vf64<double>, vf64<llong>>) );
+      VERIFY((is_substitution_failure<vf64<double>, vf64<ullong>>) );
+
+      VERIFY((is_substitution_failure<vf32<double>, llong>) );
+      VERIFY((is_substitution_failure<vf32<double>, ullong>) );
+
+      if constexpr (sizeof(long) == sizeof(llong))
+	{
+	  VERIFY((is_substitution_failure<vdouble, long>) );
+	  VERIFY((is_substitution_failure<vdouble, ulong>) );
+	  VERIFY((is_substitution_failure<vf64<double>, long>) );
+	  VERIFY((is_substitution_failure<vf64<double>, ulong>) );
+	  VERIFY((is_substitution_failure<vf64<double>, vf64<long>>) );
+	  VERIFY((is_substitution_failure<vf64<double>, vf64<ulong>>) );
+	  VERIFY((is_substitution_failure<vf32<double>, long>) );
+	  VERIFY((is_substitution_failure<vf32<double>, ulong>) );
+	}
+      else
+	{
+	  binary_op_return_type<vdouble, long>();
+	  binary_op_return_type<vdouble, ulong>();
+	  binary_op_return_type<vf64<double>, long>();
+	  binary_op_return_type<vf64<double>, ulong>();
+	  binary_op_return_type<vf64<double>, vf64<long>>();
+	  binary_op_return_type<vf64<double>, vf64<ulong>>();
+	  binary_op_return_type<vf32<double>, long>();
+	  binary_op_return_type<vf32<double>, ulong>();
+	}
+    }
+  else if constexpr (std::is_same_v<V, vldouble>)
+    { //{{{2
+      binary_op_return_type<vldouble, schar>();
+      binary_op_return_type<vldouble, uchar>();
+      binary_op_return_type<vldouble, short>();
+      binary_op_return_type<vldouble, ushort>();
+      binary_op_return_type<vldouble, uint>();
+      binary_op_return_type<vldouble, long>();
+      binary_op_return_type<vldouble, ulong>();
+      binary_op_return_type<vldouble, float>();
+      binary_op_return_type<vldouble, double>();
+
+      binary_op_return_type<vf64<long double>, schar>();
+      binary_op_return_type<vf64<long double>, uchar>();
+      binary_op_return_type<vf64<long double>, short>();
+      binary_op_return_type<vf64<long double>, ushort>();
+      binary_op_return_type<vf64<long double>, int>();
+      binary_op_return_type<vf64<long double>, uint>();
+      binary_op_return_type<vf64<long double>, long>();
+      binary_op_return_type<vf64<long double>, ulong>();
+      binary_op_return_type<vf64<long double>, float>();
+      binary_op_return_type<vf64<long double>, double>();
+      binary_op_return_type<vf64<long double>, vf64<long double>>();
+
+      using std::experimental::simd;
+      using A = simd_abi::fixed_size<vldouble::size()>;
+      binary_op_return_type<simd<long double, A>, schar>();
+      binary_op_return_type<simd<long double, A>, uchar>();
+      binary_op_return_type<simd<long double, A>, short>();
+      binary_op_return_type<simd<long double, A>, ushort>();
+      binary_op_return_type<simd<long double, A>, int>();
+      binary_op_return_type<simd<long double, A>, uint>();
+      binary_op_return_type<simd<long double, A>, long>();
+      binary_op_return_type<simd<long double, A>, ulong>();
+      binary_op_return_type<simd<long double, A>, float>();
+      binary_op_return_type<simd<long double, A>, double>();
+
+      if constexpr (sizeof(ldouble) == sizeof(double))
+	{
+	  VERIFY((is_substitution_failure<vldouble, llong>) );
+	  VERIFY((is_substitution_failure<vldouble, ullong>) );
+	  VERIFY((is_substitution_failure<vf64<ldouble>, llong>) );
+	  VERIFY((is_substitution_failure<vf64<ldouble>, ullong>) );
+	  VERIFY((is_substitution_failure<simd<ldouble, A>, llong>) );
+	  VERIFY((is_substitution_failure<simd<ldouble, A>, ullong>) );
+	}
+      else
+	{
+	  binary_op_return_type<vldouble, llong>();
+	  binary_op_return_type<vldouble, ullong>();
+	  binary_op_return_type<vf64<long double>, llong>();
+	  binary_op_return_type<vf64<long double>, ullong>();
+	  binary_op_return_type<simd<long double, A>, llong>();
+	  binary_op_return_type<simd<long double, A>, ullong>();
+	}
+
+      VERIFY((is_substitution_failure<vf64<long double>, vldouble>) );
+      COMPARE((is_substitution_failure<simd<long double, A>, vldouble>),
+	      (!std::is_same<A, vldouble::abi_type>::value));
+    }
+  else if constexpr (std::is_same_v<V, vlong>)
+    { //{{{2
+      VERIFY((is_substitution_failure<vi32<long>, double>) );
+      VERIFY((is_substitution_failure<vi32<long>, float>) );
+      VERIFY((is_substitution_failure<vi32<long>, vi32<float>>) );
+      if constexpr (sizeof(long) == sizeof(llong))
+	{
+	  binary_op_return_type<vlong, uint>();
+	  binary_op_return_type<vlong, llong>();
+	  binary_op_return_type<vi32<long>, uint>();
+	  binary_op_return_type<vi32<long>, llong>();
+	  binary_op_return_type<vi64<long>, uint>();
+	  binary_op_return_type<vi64<long>, llong>();
+	  binary_op_return_type<vi32<long>, vi32<uint>>();
+	  binary_op_return_type<vi64<long>, vi64<uint>>();
+	  VERIFY((is_substitution_failure<vi32<long>, vi32<double>>) );
+	  VERIFY((is_substitution_failure<vi64<long>, vi64<double>>) );
+	}
+      else
+	{
+	  VERIFY((is_substitution_failure<vlong, uint>) );
+	  VERIFY((is_substitution_failure<vlong, llong>) );
+	  VERIFY((is_substitution_failure<vi32<long>, uint>) );
+	  VERIFY((is_substitution_failure<vi32<long>, llong>) );
+	  VERIFY((is_substitution_failure<vi64<long>, uint>) );
+	  VERIFY((is_substitution_failure<vi64<long>, llong>) );
+	  VERIFY((is_substitution_failure<vi32<long>, vi32<uint>>) );
+	  VERIFY((is_substitution_failure<vi64<long>, vi64<uint>>) );
+	  binary_op_return_type<vi32<double>, vi32<long>>();
+	  binary_op_return_type<vi64<double>, vi64<long>>();
+	}
+
+      binary_op_return_type<vlong, schar, vlong>();
+      binary_op_return_type<vlong, uchar, vlong>();
+      binary_op_return_type<vlong, short, vlong>();
+      binary_op_return_type<vlong, ushort, vlong>();
+
+      binary_op_return_type<vi32<long>, schar, vi32<long>>();
+      binary_op_return_type<vi32<long>, uchar, vi32<long>>();
+      binary_op_return_type<vi32<long>, short, vi32<long>>();
+      binary_op_return_type<vi32<long>, ushort, vi32<long>>();
+      binary_op_return_type<vi32<long>, int, vi32<long>>();
+      binary_op_return_type<vi32<long>, long, vi32<long>>();
+      binary_op_return_type<vi32<long>, vi32<long>, vi32<long>>();
+      binary_op_return_type<vi64<long>, schar, vi64<long>>();
+      binary_op_return_type<vi64<long>, uchar, vi64<long>>();
+      binary_op_return_type<vi64<long>, short, vi64<long>>();
+      binary_op_return_type<vi64<long>, ushort, vi64<long>>();
+      binary_op_return_type<vi64<long>, int, vi64<long>>();
+      binary_op_return_type<vi64<long>, long, vi64<long>>();
+      binary_op_return_type<vi64<long>, vi64<long>, vi64<long>>();
+
+      VERIFY((is_substitution_failure<vlong, vulong>) );
+      VERIFY((is_substitution_failure<vlong, ulong>) );
+      VERIFY((is_substitution_failure<vlong, ullong>) );
+      VERIFY((is_substitution_failure<vlong, float>) );
+      VERIFY((is_substitution_failure<vlong, double>) );
+      VERIFY((is_substitution_failure<vlong, vl<schar>>) );
+      VERIFY((is_substitution_failure<vlong, vl<uchar>>) );
+      VERIFY((is_substitution_failure<vlong, vl<short>>) );
+      VERIFY((is_substitution_failure<vlong, vl<ushort>>) );
+      VERIFY((is_substitution_failure<vlong, vl<int>>) );
+      VERIFY((is_substitution_failure<vlong, vl<uint>>) );
+      VERIFY((is_substitution_failure<vlong, vl<long>>) );
+      VERIFY((is_substitution_failure<vlong, vl<ulong>>) );
+      VERIFY((is_substitution_failure<vlong, vl<llong>>) );
+      VERIFY((is_substitution_failure<vlong, vl<ullong>>) );
+      VERIFY((is_substitution_failure<vlong, vl<float>>) );
+      VERIFY((is_substitution_failure<vlong, vl<double>>) );
+      VERIFY((is_substitution_failure<vl<long>, vlong>) );
+      VERIFY((is_substitution_failure<vl<long>, vulong>) );
+      VERIFY((is_substitution_failure<vi32<long>, ulong>) );
+      VERIFY((is_substitution_failure<vi32<long>, ullong>) );
+      binary_op_return_type<vi32<long>, vi32<schar>>();
+      binary_op_return_type<vi32<long>, vi32<uchar>>();
+      binary_op_return_type<vi32<long>, vi32<short>>();
+      binary_op_return_type<vi32<long>, vi32<ushort>>();
+      binary_op_return_type<vi32<long>, vi32<int>>();
+      VERIFY((is_substitution_failure<vi32<long>, vi32<ulong>>) );
+      VERIFY((is_substitution_failure<vi32<long>, vi32<ullong>>) );
+      VERIFY((is_substitution_failure<vi64<long>, ulong>) );
+      VERIFY((is_substitution_failure<vi64<long>, ullong>) );
+      VERIFY((is_substitution_failure<vi64<long>, float>) );
+      VERIFY((is_substitution_failure<vi64<long>, double>) );
+      binary_op_return_type<vi64<long>, vi64<schar>>();
+      binary_op_return_type<vi64<long>, vi64<uchar>>();
+      binary_op_return_type<vi64<long>, vi64<short>>();
+      binary_op_return_type<vi64<long>, vi64<ushort>>();
+      binary_op_return_type<vi64<long>, vi64<int>>();
+      VERIFY((is_substitution_failure<vi64<long>, vi64<ulong>>) );
+      VERIFY((is_substitution_failure<vi64<long>, vi64<ullong>>) );
+      VERIFY((is_substitution_failure<vi64<long>, vi64<float>>) );
+
+      binary_op_return_type<vi32<llong>, vi32<long>>();
+      binary_op_return_type<vi64<llong>, vi64<long>>();
+    }
+  else if constexpr (std::is_same_v<V, vulong>)
+    { //{{{2
+      if constexpr (sizeof(long) == sizeof(llong))
+	{
+	  binary_op_return_type<vulong, ullong, vulong>();
+	  binary_op_return_type<vi32<ulong>, ullong, vi32<ulong>>();
+	  binary_op_return_type<vi64<ulong>, ullong, vi64<ulong>>();
+	  VERIFY((is_substitution_failure<vi32<ulong>, vi32<llong>>) );
+	  VERIFY((is_substitution_failure<vi32<ulong>, vi32<double>>) );
+	  VERIFY((is_substitution_failure<vi64<ulong>, vi64<llong>>) );
+	  VERIFY((is_substitution_failure<vi64<ulong>, vi64<double>>) );
+	}
+      else
+	{
+	  VERIFY((is_substitution_failure<vulong, ullong>) );
+	  VERIFY((is_substitution_failure<vi32<ulong>, ullong>) );
+	  VERIFY((is_substitution_failure<vi64<ulong>, ullong>) );
+	  binary_op_return_type<vi32<llong>, vi32<ulong>>();
+	  binary_op_return_type<vi32<double>, vi32<ulong>>();
+	  binary_op_return_type<vi64<llong>, vi64<ulong>>();
+	  binary_op_return_type<vi64<double>, vi64<ulong>>();
+	}
+
+      binary_op_return_type<vulong, uchar, vulong>();
+      binary_op_return_type<vulong, ushort, vulong>();
+      binary_op_return_type<vulong, uint, vulong>();
+      binary_op_return_type<vi32<ulong>, uchar, vi32<ulong>>();
+      binary_op_return_type<vi32<ulong>, ushort, vi32<ulong>>();
+      binary_op_return_type<vi32<ulong>, int, vi32<ulong>>();
+      binary_op_return_type<vi32<ulong>, uint, vi32<ulong>>();
+      binary_op_return_type<vi32<ulong>, ulong, vi32<ulong>>();
+      binary_op_return_type<vi32<ulong>, vi32<ulong>, vi32<ulong>>();
+      binary_op_return_type<vi64<ulong>, uchar, vi64<ulong>>();
+      binary_op_return_type<vi64<ulong>, ushort, vi64<ulong>>();
+      binary_op_return_type<vi64<ulong>, int, vi64<ulong>>();
+      binary_op_return_type<vi64<ulong>, uint, vi64<ulong>>();
+      binary_op_return_type<vi64<ulong>, ulong, vi64<ulong>>();
+      binary_op_return_type<vi64<ulong>, vi64<ulong>, vi64<ulong>>();
+
+      VERIFY((is_substitution_failure<vi32<ulong>, llong>) );
+      VERIFY((is_substitution_failure<vi32<ulong>, float>) );
+      VERIFY((is_substitution_failure<vi32<ulong>, double>) );
+      VERIFY((is_substitution_failure<vi32<ulong>, vi32<float>>) );
+      VERIFY((is_substitution_failure<vi64<ulong>, vi64<float>>) );
+      VERIFY((is_substitution_failure<vulong, schar>) );
+      VERIFY((is_substitution_failure<vulong, short>) );
+      VERIFY((is_substitution_failure<vulong, vlong>) );
+      VERIFY((is_substitution_failure<vulong, long>) );
+      VERIFY((is_substitution_failure<vulong, llong>) );
+      VERIFY((is_substitution_failure<vulong, float>) );
+      VERIFY((is_substitution_failure<vulong, double>) );
+      VERIFY((is_substitution_failure<vulong, vl<schar>>) );
+      VERIFY((is_substitution_failure<vulong, vl<uchar>>) );
+      VERIFY((is_substitution_failure<vulong, vl<short>>) );
+      VERIFY((is_substitution_failure<vulong, vl<ushort>>) );
+      VERIFY((is_substitution_failure<vulong, vl<int>>) );
+      VERIFY((is_substitution_failure<vulong, vl<uint>>) );
+      VERIFY((is_substitution_failure<vulong, vl<long>>) );
+      VERIFY((is_substitution_failure<vulong, vl<ulong>>) );
+      VERIFY((is_substitution_failure<vulong, vl<llong>>) );
+      VERIFY((is_substitution_failure<vulong, vl<ullong>>) );
+      VERIFY((is_substitution_failure<vulong, vl<float>>) );
+      VERIFY((is_substitution_failure<vulong, vl<double>>) );
+      VERIFY((is_substitution_failure<vl<ulong>, vlong>) );
+      VERIFY((is_substitution_failure<vl<ulong>, vulong>) );
+      VERIFY((is_substitution_failure<vi32<ulong>, schar>) );
+      VERIFY((is_substitution_failure<vi32<ulong>, short>) );
+      VERIFY((is_substitution_failure<vi32<ulong>, long>) );
+      VERIFY((is_substitution_failure<vi32<ulong>, vi32<schar>>) );
+      binary_op_return_type<vi32<ulong>, vi32<uchar>>();
+      VERIFY((is_substitution_failure<vi32<ulong>, vi32<short>>) );
+      binary_op_return_type<vi32<ulong>, vi32<ushort>>();
+      VERIFY((is_substitution_failure<vi32<ulong>, vi32<int>>) );
+      binary_op_return_type<vi32<ulong>, vi32<uint>>();
+      VERIFY((is_substitution_failure<vi32<ulong>, vi32<long>>) );
+      binary_op_return_type<vi32<ullong>, vi32<ulong>>();
+      VERIFY((is_substitution_failure<vi64<ulong>, schar>) );
+      VERIFY((is_substitution_failure<vi64<ulong>, short>) );
+      VERIFY((is_substitution_failure<vi64<ulong>, long>) );
+      VERIFY((is_substitution_failure<vi64<ulong>, llong>) );
+      VERIFY((is_substitution_failure<vi64<ulong>, float>) );
+      VERIFY((is_substitution_failure<vi64<ulong>, double>) );
+      VERIFY((is_substitution_failure<vi64<ulong>, vi64<schar>>) );
+      binary_op_return_type<vi64<ulong>, vi64<uchar>>();
+      VERIFY((is_substitution_failure<vi64<ulong>, vi64<short>>) );
+      binary_op_return_type<vi64<ulong>, vi64<ushort>>();
+      VERIFY((is_substitution_failure<vi64<ulong>, vi64<int>>) );
+      binary_op_return_type<vi64<ulong>, vi64<uint>>();
+      VERIFY((is_substitution_failure<vi64<ulong>, vi64<long>>) );
+      binary_op_return_type<vi64<ullong>, vi64<ulong>>();
+    }
+  else if constexpr (std::is_same_v<V, vllong>)
+    { //{{{2
+      binary_op_return_type<vllong, schar, vllong>();
+      binary_op_return_type<vllong, uchar, vllong>();
+      binary_op_return_type<vllong, short, vllong>();
+      binary_op_return_type<vllong, ushort, vllong>();
+      binary_op_return_type<vllong, uint, vllong>();
+      binary_op_return_type<vllong, long, vllong>();
+      binary_op_return_type<vi32<llong>, schar, vi32<llong>>();
+      binary_op_return_type<vi32<llong>, uchar, vi32<llong>>();
+      binary_op_return_type<vi32<llong>, short, vi32<llong>>();
+      binary_op_return_type<vi32<llong>, ushort, vi32<llong>>();
+      binary_op_return_type<vi32<llong>, int, vi32<llong>>();
+      binary_op_return_type<vi32<llong>, uint, vi32<llong>>();
+      binary_op_return_type<vi32<llong>, long, vi32<llong>>();
+      binary_op_return_type<vi32<llong>, llong, vi32<llong>>();
+      binary_op_return_type<vi32<llong>, vi32<llong>, vi32<llong>>();
+      binary_op_return_type<vi64<llong>, schar, vi64<llong>>();
+      binary_op_return_type<vi64<llong>, uchar, vi64<llong>>();
+      binary_op_return_type<vi64<llong>, short, vi64<llong>>();
+      binary_op_return_type<vi64<llong>, ushort, vi64<llong>>();
+      binary_op_return_type<vi64<llong>, int, vi64<llong>>();
+      binary_op_return_type<vi64<llong>, uint, vi64<llong>>();
+      binary_op_return_type<vi64<llong>, long, vi64<llong>>();
+      binary_op_return_type<vi64<llong>, llong, vi64<llong>>();
+      binary_op_return_type<vi64<llong>, vi64<llong>>();
+      binary_op_return_type<vi32<llong>, vi32<schar>>();
+      binary_op_return_type<vi32<llong>, vi32<uchar>>();
+      binary_op_return_type<vi32<llong>, vi32<short>>();
+      binary_op_return_type<vi32<llong>, vi32<ushort>>();
+      binary_op_return_type<vi32<llong>, vi32<int>>();
+      binary_op_return_type<vi32<llong>, vi32<uint>>();
+      binary_op_return_type<vi32<llong>, vi32<long>>();
+      if constexpr (sizeof(long) == sizeof(llong))
+	{
+	  VERIFY((is_substitution_failure<vi32<llong>, vi32<ulong>>) );
+	  VERIFY((is_substitution_failure<vi32<llong>, ulong>) );
+	  VERIFY((is_substitution_failure<vi64<llong>, ulong>) );
+	  VERIFY((is_substitution_failure<vllong, ulong>) );
+	}
+      else
+	{
+	  binary_op_return_type<vi32<llong>, vi32<ulong>>();
+	  binary_op_return_type<vi32<llong>, ulong>();
+	  binary_op_return_type<vi64<llong>, ulong>();
+	  binary_op_return_type<vllong, ulong>();
+	}
+
+      VERIFY((is_substitution_failure<vllong, vullong>) );
+      VERIFY((is_substitution_failure<vllong, ullong>) );
+      VERIFY((is_substitution_failure<vllong, float>) );
+      VERIFY((is_substitution_failure<vllong, double>) );
+      VERIFY((is_substitution_failure<vllong, vi64<schar>>) );
+      VERIFY((is_substitution_failure<vllong, vi64<uchar>>) );
+      VERIFY((is_substitution_failure<vllong, vi64<short>>) );
+      VERIFY((is_substitution_failure<vllong, vi64<ushort>>) );
+      VERIFY((is_substitution_failure<vllong, vi64<int>>) );
+      VERIFY((is_substitution_failure<vllong, vi64<uint>>) );
+      VERIFY((is_substitution_failure<vllong, vi64<long>>) );
+      VERIFY((is_substitution_failure<vllong, vi64<ulong>>) );
+      VERIFY((is_substitution_failure<vllong, vi64<llong>>) );
+      VERIFY((is_substitution_failure<vllong, vi64<ullong>>) );
+      VERIFY((is_substitution_failure<vllong, vi64<float>>) );
+      VERIFY((is_substitution_failure<vllong, vi64<double>>) );
+      VERIFY((is_substitution_failure<vi32<llong>, ullong>) );
+      VERIFY((is_substitution_failure<vi32<llong>, float>) );
+      VERIFY((is_substitution_failure<vi32<llong>, double>) );
+      VERIFY((is_substitution_failure<vi32<llong>, vi32<ullong>>) );
+      VERIFY((is_substitution_failure<vi32<llong>, vi32<float>>) );
+      VERIFY((is_substitution_failure<vi32<llong>, vi32<double>>) );
+      VERIFY((is_substitution_failure<vi64<llong>, vllong>) );
+      VERIFY((is_substitution_failure<vi64<llong>, vullong>) );
+      VERIFY((is_substitution_failure<vi64<llong>, ullong>) );
+      VERIFY((is_substitution_failure<vi64<llong>, float>) );
+      VERIFY((is_substitution_failure<vi64<llong>, double>) );
+      binary_op_return_type<vi64<llong>, vi64<schar>>();
+      binary_op_return_type<vi64<llong>, vi64<uchar>>();
+      binary_op_return_type<vi64<llong>, vi64<short>>();
+      binary_op_return_type<vi64<llong>, vi64<ushort>>();
+      binary_op_return_type<vi64<llong>, vi64<int>>();
+      binary_op_return_type<vi64<llong>, vi64<uint>>();
+      binary_op_return_type<vi64<llong>, vi64<long>>();
+      if constexpr (sizeof(long) == sizeof(llong))
+	{
+	  VERIFY((is_substitution_failure<vi64<llong>, vi64<ulong>>) );
+	}
+      else
+	{
+	  binary_op_return_type<vi64<llong>, vi64<ulong>>();
+	}
+      VERIFY((is_substitution_failure<vi64<llong>, vi64<ullong>>) );
+      VERIFY((is_substitution_failure<vi64<llong>, vi64<float>>) );
+      VERIFY((is_substitution_failure<vi64<llong>, vi64<double>>) );
+    }
+  else if constexpr (std::is_same_v<V, vullong>)
+    { //{{{2
+      binary_op_return_type<vullong, uchar, vullong>();
+      binary_op_return_type<vullong, ushort, vullong>();
+      binary_op_return_type<vullong, uint, vullong>();
+      binary_op_return_type<vullong, ulong, vullong>();
+      binary_op_return_type<vi32<ullong>, uchar, vi32<ullong>>();
+      binary_op_return_type<vi32<ullong>, ushort, vi32<ullong>>();
+      binary_op_return_type<vi32<ullong>, int, vi32<ullong>>();
+      binary_op_return_type<vi32<ullong>, uint, vi32<ullong>>();
+      binary_op_return_type<vi32<ullong>, ulong, vi32<ullong>>();
+      binary_op_return_type<vi32<ullong>, ullong, vi32<ullong>>();
+      binary_op_return_type<vi32<ullong>, vi32<ullong>, vi32<ullong>>();
+      binary_op_return_type<vi64<ullong>, uchar, vi64<ullong>>();
+      binary_op_return_type<vi64<ullong>, ushort, vi64<ullong>>();
+      binary_op_return_type<vi64<ullong>, int, vi64<ullong>>();
+      binary_op_return_type<vi64<ullong>, uint, vi64<ullong>>();
+      binary_op_return_type<vi64<ullong>, ulong, vi64<ullong>>();
+      binary_op_return_type<vi64<ullong>, ullong, vi64<ullong>>();
+      binary_op_return_type<vi64<ullong>, vi64<ullong>, vi64<ullong>>();
+
+      VERIFY((is_substitution_failure<vullong, schar>) );
+      VERIFY((is_substitution_failure<vullong, short>) );
+      VERIFY((is_substitution_failure<vullong, long>) );
+      VERIFY((is_substitution_failure<vullong, llong>) );
+      VERIFY((is_substitution_failure<vullong, vllong>) );
+      VERIFY((is_substitution_failure<vullong, float>) );
+      VERIFY((is_substitution_failure<vullong, double>) );
+      VERIFY((is_substitution_failure<vullong, vi64<schar>>) );
+      VERIFY((is_substitution_failure<vullong, vi64<uchar>>) );
+      VERIFY((is_substitution_failure<vullong, vi64<short>>) );
+      VERIFY((is_substitution_failure<vullong, vi64<ushort>>) );
+      VERIFY((is_substitution_failure<vullong, vi64<int>>) );
+      VERIFY((is_substitution_failure<vullong, vi64<uint>>) );
+      VERIFY((is_substitution_failure<vullong, vi64<long>>) );
+      VERIFY((is_substitution_failure<vullong, vi64<ulong>>) );
+      VERIFY((is_substitution_failure<vullong, vi64<llong>>) );
+      VERIFY((is_substitution_failure<vullong, vi64<ullong>>) );
+      VERIFY((is_substitution_failure<vullong, vi64<float>>) );
+      VERIFY((is_substitution_failure<vullong, vi64<double>>) );
+      VERIFY((is_substitution_failure<vi32<ullong>, schar>) );
+      VERIFY((is_substitution_failure<vi32<ullong>, short>) );
+      VERIFY((is_substitution_failure<vi32<ullong>, long>) );
+      VERIFY((is_substitution_failure<vi32<ullong>, llong>) );
+      VERIFY((is_substitution_failure<vi32<ullong>, float>) );
+      VERIFY((is_substitution_failure<vi32<ullong>, double>) );
+      VERIFY((is_substitution_failure<vi32<ullong>, vi32<schar>>) );
+      binary_op_return_type<vi32<ullong>, vi32<uchar>>();
+      VERIFY((is_substitution_failure<vi32<ullong>, vi32<short>>) );
+      binary_op_return_type<vi32<ullong>, vi32<ushort>>();
+      VERIFY((is_substitution_failure<vi32<ullong>, vi32<int>>) );
+      binary_op_return_type<vi32<ullong>, vi32<uint>>();
+      VERIFY((is_substitution_failure<vi32<ullong>, vi32<long>>) );
+      binary_op_return_type<vi32<ullong>, vi32<ulong>>();
+      VERIFY((is_substitution_failure<vi32<ullong>, vi32<llong>>) );
+      VERIFY((is_substitution_failure<vi32<ullong>, vi32<float>>) );
+      VERIFY((is_substitution_failure<vi32<ullong>, vi32<double>>) );
+      VERIFY((is_substitution_failure<vi64<ullong>, schar>) );
+      VERIFY((is_substitution_failure<vi64<ullong>, short>) );
+      VERIFY((is_substitution_failure<vi64<ullong>, long>) );
+      VERIFY((is_substitution_failure<vi64<ullong>, llong>) );
+      VERIFY((is_substitution_failure<vi64<ullong>, vllong>) );
+      VERIFY((is_substitution_failure<vi64<ullong>, vullong>) );
+      VERIFY((is_substitution_failure<vi64<ullong>, float>) );
+      VERIFY((is_substitution_failure<vi64<ullong>, double>) );
+      VERIFY((is_substitution_failure<vi64<ullong>, vi64<schar>>) );
+      binary_op_return_type<vi64<ullong>, vi64<uchar>>();
+      VERIFY((is_substitution_failure<vi64<ullong>, vi64<short>>) );
+      binary_op_return_type<vi64<ullong>, vi64<ushort>>();
+      VERIFY((is_substitution_failure<vi64<ullong>, vi64<int>>) );
+      binary_op_return_type<vi64<ullong>, vi64<uint>>();
+      VERIFY((is_substitution_failure<vi64<ullong>, vi64<long>>) );
+      binary_op_return_type<vi64<ullong>, vi64<ulong>>();
+      VERIFY((is_substitution_failure<vi64<ullong>, vi64<llong>>) );
+      VERIFY((is_substitution_failure<vi64<ullong>, vi64<float>>) );
+      VERIFY((is_substitution_failure<vi64<ullong>, vi64<double>>) );
+    }
+  else if constexpr (std::is_same_v<V, vint>)
+    { //{{{2
+      binary_op_return_type<vint, schar, vint>();
+      binary_op_return_type<vint, uchar, vint>();
+      binary_op_return_type<vint, short, vint>();
+      binary_op_return_type<vint, ushort, vint>();
+      binary_op_return_type<vi32<int>, schar, vi32<int>>();
+      binary_op_return_type<vi32<int>, uchar, vi32<int>>();
+      binary_op_return_type<vi32<int>, short, vi32<int>>();
+      binary_op_return_type<vi32<int>, ushort, vi32<int>>();
+      binary_op_return_type<vi32<int>, int, vi32<int>>();
+      binary_op_return_type<vi32<int>, vi32<int>, vi32<int>>();
+      binary_op_return_type<vi32<int>, vi32<schar>>();
+      binary_op_return_type<vi32<int>, vi32<uchar>>();
+      binary_op_return_type<vi32<int>, vi32<short>>();
+      binary_op_return_type<vi32<int>, vi32<ushort>>();
+
+      binary_op_return_type<vi32<llong>, vi32<int>>();
+      binary_op_return_type<vi32<double>, vi32<int>>();
+
+      // Order is important for MSVC. This compiler is just crazy: it considers
+      // operators from unrelated simd template instantiations as candidates,
+      // but only after they have been tested. So e.g. vi32<int> + llong will
+      // produce a vi32<llong> if a vi32<llong> operator test is done before the
+      // vi32<int> + llong test.
+      VERIFY((is_substitution_failure<vi32<int>, double>) );
+      VERIFY((is_substitution_failure<vi32<int>, float>) );
+      VERIFY((is_substitution_failure<vi32<int>, llong>) );
+      VERIFY((is_substitution_failure<vi32<int>, vi32<float>>) );
+      VERIFY((is_substitution_failure<vint, vuint>) );
+      VERIFY((is_substitution_failure<vint, uint>) );
+      VERIFY((is_substitution_failure<vint, ulong>) );
+      VERIFY((is_substitution_failure<vint, llong>) );
+      VERIFY((is_substitution_failure<vint, ullong>) );
+      VERIFY((is_substitution_failure<vint, float>) );
+      VERIFY((is_substitution_failure<vint, double>) );
+      VERIFY((is_substitution_failure<vint, vi32<schar>>) );
+      VERIFY((is_substitution_failure<vint, vi32<uchar>>) );
+      VERIFY((is_substitution_failure<vint, vi32<short>>) );
+      VERIFY((is_substitution_failure<vint, vi32<ushort>>) );
+      VERIFY((is_substitution_failure<vint, vi32<int>>) );
+      VERIFY((is_substitution_failure<vint, vi32<uint>>) );
+      VERIFY((is_substitution_failure<vint, vi32<long>>) );
+      VERIFY((is_substitution_failure<vint, vi32<ulong>>) );
+      VERIFY((is_substitution_failure<vint, vi32<llong>>) );
+      VERIFY((is_substitution_failure<vint, vi32<ullong>>) );
+      VERIFY((is_substitution_failure<vint, vi32<float>>) );
+      VERIFY((is_substitution_failure<vint, vi32<double>>) );
+      VERIFY((is_substitution_failure<vi32<int>, vint>) );
+      VERIFY((is_substitution_failure<vi32<int>, vuint>) );
+      VERIFY((is_substitution_failure<vi32<int>, uint>) );
+      VERIFY((is_substitution_failure<vi32<int>, ulong>) );
+      VERIFY((is_substitution_failure<vi32<int>, ullong>) );
+      VERIFY((is_substitution_failure<vi32<int>, vi32<uint>>) );
+      VERIFY((is_substitution_failure<vi32<int>, vi32<ulong>>) );
+      VERIFY((is_substitution_failure<vi32<int>, vi32<ullong>>) );
+
+      binary_op_return_type<vi32<long>, vi32<int>>();
+      if constexpr (sizeof(long) == sizeof(llong))
+	{
+	  VERIFY((is_substitution_failure<vint, long>) );
+	  VERIFY((is_substitution_failure<vi32<int>, long>) );
+	}
+      else
+	{
+	  binary_op_return_type<vint, long>();
+	  binary_op_return_type<vi32<int>, long>();
+	}
+    }
+  else if constexpr (std::is_same_v<V, vuint>)
+    { //{{{2
+      VERIFY((is_substitution_failure<vi32<uint>, llong>) );
+      VERIFY((is_substitution_failure<vi32<uint>, ullong>) );
+      VERIFY((is_substitution_failure<vi32<uint>, float>) );
+      VERIFY((is_substitution_failure<vi32<uint>, double>) );
+      VERIFY((is_substitution_failure<vi32<uint>, vi32<float>>) );
+
+      binary_op_return_type<vuint, uchar, vuint>();
+      binary_op_return_type<vuint, ushort, vuint>();
+      binary_op_return_type<vi32<uint>, uchar, vi32<uint>>();
+      binary_op_return_type<vi32<uint>, ushort, vi32<uint>>();
+      binary_op_return_type<vi32<uint>, int, vi32<uint>>();
+      binary_op_return_type<vi32<uint>, uint, vi32<uint>>();
+      binary_op_return_type<vi32<uint>, vi32<uint>, vi32<uint>>();
+      binary_op_return_type<vi32<uint>, vi32<uchar>>();
+      binary_op_return_type<vi32<uint>, vi32<ushort>>();
+
+      binary_op_return_type<vi32<llong>, vi32<uint>>();
+      binary_op_return_type<vi32<ullong>, vi32<uint>>();
+      binary_op_return_type<vi32<double>, vi32<uint>>();
+
+      VERIFY((is_substitution_failure<vuint, schar>) );
+      VERIFY((is_substitution_failure<vuint, short>) );
+      VERIFY((is_substitution_failure<vuint, vint>) );
+      VERIFY((is_substitution_failure<vuint, long>) );
+      VERIFY((is_substitution_failure<vuint, llong>) );
+      VERIFY((is_substitution_failure<vuint, ullong>) );
+      VERIFY((is_substitution_failure<vuint, float>) );
+      VERIFY((is_substitution_failure<vuint, double>) );
+      VERIFY((is_substitution_failure<vuint, vi32<schar>>) );
+      VERIFY((is_substitution_failure<vuint, vi32<uchar>>) );
+      VERIFY((is_substitution_failure<vuint, vi32<short>>) );
+      VERIFY((is_substitution_failure<vuint, vi32<ushort>>) );
+      VERIFY((is_substitution_failure<vuint, vi32<int>>) );
+      VERIFY((is_substitution_failure<vuint, vi32<uint>>) );
+      VERIFY((is_substitution_failure<vuint, vi32<long>>) );
+      VERIFY((is_substitution_failure<vuint, vi32<ulong>>) );
+      VERIFY((is_substitution_failure<vuint, vi32<llong>>) );
+      VERIFY((is_substitution_failure<vuint, vi32<ullong>>) );
+      VERIFY((is_substitution_failure<vuint, vi32<float>>) );
+      VERIFY((is_substitution_failure<vuint, vi32<double>>) );
+      VERIFY((is_substitution_failure<vi32<uint>, schar>) );
+      VERIFY((is_substitution_failure<vi32<uint>, short>) );
+      VERIFY((is_substitution_failure<vi32<uint>, vint>) );
+      VERIFY((is_substitution_failure<vi32<uint>, vuint>) );
+      VERIFY((is_substitution_failure<vi32<uint>, long>) );
+      VERIFY((is_substitution_failure<vi32<uint>, vi32<schar>>) );
+      VERIFY((is_substitution_failure<vi32<uint>, vi32<short>>) );
+      VERIFY((is_substitution_failure<vi32<uint>, vi32<int>>) );
+
+      binary_op_return_type<vi32<ulong>, vi32<uint>>();
+      if constexpr (sizeof(long) == sizeof(llong))
+	{
+	  VERIFY((is_substitution_failure<vuint, ulong>) );
+	  VERIFY((is_substitution_failure<vi32<uint>, ulong>) );
+	  binary_op_return_type<vi32<long>, vi32<uint>>();
+	}
+      else
+	{
+	  binary_op_return_type<vuint, ulong>();
+	  binary_op_return_type<vi32<uint>, ulong>();
+	  VERIFY((is_substitution_failure<vi32<uint>, vi32<long>>) );
+	}
+    }
+  else if constexpr (std::is_same_v<V, vshort>)
+    { //{{{2
+      binary_op_return_type<vshort, schar, vshort>();
+      binary_op_return_type<vshort, uchar, vshort>();
+      binary_op_return_type<vi16<short>, schar, vi16<short>>();
+      binary_op_return_type<vi16<short>, uchar, vi16<short>>();
+      binary_op_return_type<vi16<short>, short, vi16<short>>();
+      binary_op_return_type<vi16<short>, int, vi16<short>>();
+      binary_op_return_type<vi16<short>, vi16<schar>>();
+      binary_op_return_type<vi16<short>, vi16<uchar>>();
+      binary_op_return_type<vi16<short>, vi16<short>>();
+
+      binary_op_return_type<vi16<int>, vi16<short>>();
+      binary_op_return_type<vi16<long>, vi16<short>>();
+      binary_op_return_type<vi16<llong>, vi16<short>>();
+      binary_op_return_type<vi16<float>, vi16<short>>();
+      binary_op_return_type<vi16<double>, vi16<short>>();
+
+      VERIFY((is_substitution_failure<vi16<short>, double>) );
+      VERIFY((is_substitution_failure<vi16<short>, llong>) );
+      VERIFY((is_substitution_failure<vshort, vushort>) );
+      VERIFY((is_substitution_failure<vshort, ushort>) );
+      VERIFY((is_substitution_failure<vshort, uint>) );
+      VERIFY((is_substitution_failure<vshort, long>) );
+      VERIFY((is_substitution_failure<vshort, ulong>) );
+      VERIFY((is_substitution_failure<vshort, llong>) );
+      VERIFY((is_substitution_failure<vshort, ullong>) );
+      VERIFY((is_substitution_failure<vshort, float>) );
+      VERIFY((is_substitution_failure<vshort, double>) );
+      VERIFY((is_substitution_failure<vshort, vi16<schar>>) );
+      VERIFY((is_substitution_failure<vshort, vi16<uchar>>) );
+      VERIFY((is_substitution_failure<vshort, vi16<short>>) );
+      VERIFY((is_substitution_failure<vshort, vi16<ushort>>) );
+      VERIFY((is_substitution_failure<vshort, vi16<int>>) );
+      VERIFY((is_substitution_failure<vshort, vi16<uint>>) );
+      VERIFY((is_substitution_failure<vshort, vi16<long>>) );
+      VERIFY((is_substitution_failure<vshort, vi16<ulong>>) );
+      VERIFY((is_substitution_failure<vshort, vi16<llong>>) );
+      VERIFY((is_substitution_failure<vshort, vi16<ullong>>) );
+      VERIFY((is_substitution_failure<vshort, vi16<float>>) );
+      VERIFY((is_substitution_failure<vshort, vi16<double>>) );
+      VERIFY((is_substitution_failure<vi16<short>, vshort>) );
+      VERIFY((is_substitution_failure<vi16<short>, vushort>) );
+      VERIFY((is_substitution_failure<vi16<short>, ushort>) );
+      VERIFY((is_substitution_failure<vi16<short>, uint>) );
+      VERIFY((is_substitution_failure<vi16<short>, long>) );
+      VERIFY((is_substitution_failure<vi16<short>, ulong>) );
+      VERIFY((is_substitution_failure<vi16<short>, ullong>) );
+      VERIFY((is_substitution_failure<vi16<short>, float>) );
+      VERIFY((is_substitution_failure<vi16<short>, vi16<ushort>>) );
+      VERIFY((is_substitution_failure<vi16<short>, vi16<uint>>) );
+      VERIFY((is_substitution_failure<vi16<short>, vi16<ulong>>) );
+      VERIFY((is_substitution_failure<vi16<short>, vi16<ullong>>) );
+    }
+  else if constexpr (std::is_same_v<V, vushort>)
+    { //{{{2
+      binary_op_return_type<vushort, uchar, vushort>();
+      binary_op_return_type<vushort, uint, vushort>();
+      binary_op_return_type<vi16<ushort>, uchar, vi16<ushort>>();
+      binary_op_return_type<vi16<ushort>, ushort, vi16<ushort>>();
+      binary_op_return_type<vi16<ushort>, int, vi16<ushort>>();
+      binary_op_return_type<vi16<ushort>, uint, vi16<ushort>>();
+      binary_op_return_type<vi16<ushort>, vi16<uchar>>();
+      binary_op_return_type<vi16<ushort>, vi16<ushort>>();
+
+      binary_op_return_type<vi16<int>, vi16<ushort>>();
+      binary_op_return_type<vi16<long>, vi16<ushort>>();
+      binary_op_return_type<vi16<llong>, vi16<ushort>>();
+      binary_op_return_type<vi16<uint>, vi16<ushort>>();
+      binary_op_return_type<vi16<ulong>, vi16<ushort>>();
+      binary_op_return_type<vi16<ullong>, vi16<ushort>>();
+      binary_op_return_type<vi16<float>, vi16<ushort>>();
+      binary_op_return_type<vi16<double>, vi16<ushort>>();
+
+      VERIFY((is_substitution_failure<vi16<ushort>, llong>) );
+      VERIFY((is_substitution_failure<vi16<ushort>, ullong>) );
+      VERIFY((is_substitution_failure<vi16<ushort>, double>) );
+      VERIFY((is_substitution_failure<vushort, schar>) );
+      VERIFY((is_substitution_failure<vushort, short>) );
+      VERIFY((is_substitution_failure<vushort, vshort>) );
+      VERIFY((is_substitution_failure<vushort, long>) );
+      VERIFY((is_substitution_failure<vushort, ulong>) );
+      VERIFY((is_substitution_failure<vushort, llong>) );
+      VERIFY((is_substitution_failure<vushort, ullong>) );
+      VERIFY((is_substitution_failure<vushort, float>) );
+      VERIFY((is_substitution_failure<vushort, double>) );
+      VERIFY((is_substitution_failure<vushort, vi16<schar>>) );
+      VERIFY((is_substitution_failure<vushort, vi16<uchar>>) );
+      VERIFY((is_substitution_failure<vushort, vi16<short>>) );
+      VERIFY((is_substitution_failure<vushort, vi16<ushort>>) );
+      VERIFY((is_substitution_failure<vushort, vi16<int>>) );
+      VERIFY((is_substitution_failure<vushort, vi16<uint>>) );
+      VERIFY((is_substitution_failure<vushort, vi16<long>>) );
+      VERIFY((is_substitution_failure<vushort, vi16<ulong>>) );
+      VERIFY((is_substitution_failure<vushort, vi16<llong>>) );
+      VERIFY((is_substitution_failure<vushort, vi16<ullong>>) );
+      VERIFY((is_substitution_failure<vushort, vi16<float>>) );
+      VERIFY((is_substitution_failure<vushort, vi16<double>>) );
+      VERIFY((is_substitution_failure<vi16<ushort>, schar>) );
+      VERIFY((is_substitution_failure<vi16<ushort>, short>) );
+      VERIFY((is_substitution_failure<vi16<ushort>, vshort>) );
+      VERIFY((is_substitution_failure<vi16<ushort>, vushort>) );
+      VERIFY((is_substitution_failure<vi16<ushort>, long>) );
+      VERIFY((is_substitution_failure<vi16<ushort>, ulong>) );
+      VERIFY((is_substitution_failure<vi16<ushort>, float>) );
+      VERIFY((is_substitution_failure<vi16<ushort>, vi16<schar>>) );
+      VERIFY((is_substitution_failure<vi16<ushort>, vi16<short>>) );
+    }
+  else if constexpr (std::is_same_v<V, vchar>)
+    { //{{{2
+      binary_op_return_type<vi8<char>, char, vi8<char>>();
+      binary_op_return_type<vi8<char>, int, vi8<char>>();
+      binary_op_return_type<vi8<char>, vi8<char>, vi8<char>>();
+
+      if constexpr (vi8<schar>::size() <= simd_abi::max_fixed_size<short>)
+	{
+	  COMPARE((is_substitution_failure<vi8<char>, vi8<short>>),
+		  std::is_unsigned_v<char>);
+	  COMPARE((is_substitution_failure<vi8<char>, vi8<int>>),
+		  std::is_unsigned_v<char>);
+	  COMPARE((is_substitution_failure<vi8<char>, vi8<long>>),
+		  std::is_unsigned_v<char>);
+	  COMPARE((is_substitution_failure<vi8<char>, vi8<llong>>),
+		  std::is_unsigned_v<char>);
+	  COMPARE((is_substitution_failure<vi8<char>, vi8<ushort>>),
+		  std::is_signed_v<char>);
+	  COMPARE((is_substitution_failure<vi8<char>, vi8<uint>>),
+		  std::is_signed_v<char>);
+	  COMPARE((is_substitution_failure<vi8<char>, vi8<ulong>>),
+		  std::is_signed_v<char>);
+	  COMPARE((is_substitution_failure<vi8<char>, vi8<ullong>>),
+		  std::is_signed_v<char>);
+	  if constexpr (std::is_signed_v<char>)
+	    {
+	      binary_op_return_type<vi8<short>, vi8<char>>();
+	      binary_op_return_type<vi8<int>, vi8<char>>();
+	      binary_op_return_type<vi8<long>, vi8<char>>();
+	      binary_op_return_type<vi8<llong>, vi8<char>>();
+	    }
+	  else
+	    {
+	      binary_op_return_type<vi8<ushort>, vi8<char>>();
+	      binary_op_return_type<vi8<uint>, vi8<char>>();
+	      binary_op_return_type<vi8<ulong>, vi8<char>>();
+	      binary_op_return_type<vi8<ullong>, vi8<char>>();
+	    }
+	  binary_op_return_type<vi8<float>, vi8<char>>();
+	  binary_op_return_type<vi8<double>, vi8<char>>();
+	}
+
+      VERIFY((is_substitution_failure<vi8<char>, llong>) );
+      VERIFY((is_substitution_failure<vi8<char>, double>) );
+      VERIFY((is_substitution_failure<vchar, vxchar>) );
+      VERIFY((is_substitution_failure<vchar, xchar>) );
+      VERIFY((is_substitution_failure<vchar, short>) );
+      VERIFY((is_substitution_failure<vchar, ushort>) );
+      COMPARE((is_substitution_failure<vchar, uint>), std::is_signed_v<char>);
+      VERIFY((is_substitution_failure<vchar, long>) );
+      VERIFY((is_substitution_failure<vchar, ulong>) );
+      VERIFY((is_substitution_failure<vchar, llong>) );
+      VERIFY((is_substitution_failure<vchar, ullong>) );
+      VERIFY((is_substitution_failure<vchar, float>) );
+      VERIFY((is_substitution_failure<vchar, double>) );
+      VERIFY((is_substitution_failure<vchar, vi8<char>>) );
+      VERIFY((is_substitution_failure<vchar, vi8<uchar>>) );
+      VERIFY((is_substitution_failure<vchar, vi8<schar>>) );
+      VERIFY((is_substitution_failure<vchar, vi8<short>>) );
+      VERIFY((is_substitution_failure<vchar, vi8<ushort>>) );
+      VERIFY((is_substitution_failure<vchar, vi8<int>>) );
+      VERIFY((is_substitution_failure<vchar, vi8<uint>>) );
+      VERIFY((is_substitution_failure<vchar, vi8<long>>) );
+      VERIFY((is_substitution_failure<vchar, vi8<ulong>>) );
+      VERIFY((is_substitution_failure<vchar, vi8<llong>>) );
+      VERIFY((is_substitution_failure<vchar, vi8<ullong>>) );
+      VERIFY((is_substitution_failure<vchar, vi8<float>>) );
+      VERIFY((is_substitution_failure<vchar, vi8<double>>) );
+      VERIFY((is_substitution_failure<vi8<char>, vchar>) );
+      VERIFY((is_substitution_failure<vi8<char>, vuchar>) );
+      VERIFY((is_substitution_failure<vi8<char>, vschar>) );
+      VERIFY((is_substitution_failure<vi8<char>, xchar>) );
+      VERIFY((is_substitution_failure<vi8<char>, short>) );
+      VERIFY((is_substitution_failure<vi8<char>, ushort>) );
+      COMPARE((is_substitution_failure<vi8<char>, uint>),
+	      std::is_signed_v<char>);
+      VERIFY((is_substitution_failure<vi8<char>, long>) );
+      VERIFY((is_substitution_failure<vi8<char>, ulong>) );
+      VERIFY((is_substitution_failure<vi8<char>, ullong>) );
+      VERIFY((is_substitution_failure<vi8<char>, float>) );
+
+      // conversion between any char types must fail because the dst type's
+      // integer conversion rank isn't greater (as required by 9.6.4p4.3)
+      VERIFY((is_substitution_failure<vi8<char>, vi8<schar>>) );
+      VERIFY((is_substitution_failure<vi8<char>, vi8<uchar>>) );
+    }
+  else if constexpr (std::is_same_v<V, vschar>)
+    { //{{{2
+      binary_op_return_type<vi8<schar>, schar, vi8<schar>>();
+      binary_op_return_type<vi8<schar>, int, vi8<schar>>();
+      binary_op_return_type<vi8<schar>, vi8<schar>, vi8<schar>>();
+
+      if constexpr (vi8<schar>::size() <= simd_abi::max_fixed_size<short>)
+	{
+	  binary_op_return_type<vi8<short>, vi8<schar>>();
+	  binary_op_return_type<vi8<int>, vi8<schar>>();
+	  binary_op_return_type<vi8<long>, vi8<schar>>();
+	  binary_op_return_type<vi8<llong>, vi8<schar>>();
+	  binary_op_return_type<vi8<float>, vi8<schar>>();
+	  binary_op_return_type<vi8<double>, vi8<schar>>();
+	}
+
+      VERIFY((is_substitution_failure<vi8<schar>, llong>) );
+      VERIFY((is_substitution_failure<vi8<schar>, double>) );
+      VERIFY((is_substitution_failure<vschar, vuchar>) );
+      VERIFY((is_substitution_failure<vschar, uchar>) );
+      VERIFY((is_substitution_failure<vschar, short>) );
+      VERIFY((is_substitution_failure<vschar, ushort>) );
+      VERIFY((is_substitution_failure<vschar, uint>) );
+      VERIFY((is_substitution_failure<vschar, long>) );
+      VERIFY((is_substitution_failure<vschar, ulong>) );
+      VERIFY((is_substitution_failure<vschar, llong>) );
+      VERIFY((is_substitution_failure<vschar, ullong>) );
+      VERIFY((is_substitution_failure<vschar, float>) );
+      VERIFY((is_substitution_failure<vschar, double>) );
+      VERIFY((is_substitution_failure<vschar, vi8<schar>>) );
+      VERIFY((is_substitution_failure<vschar, vi8<uchar>>) );
+      VERIFY((is_substitution_failure<vschar, vi8<short>>) );
+      VERIFY((is_substitution_failure<vschar, vi8<ushort>>) );
+      VERIFY((is_substitution_failure<vschar, vi8<int>>) );
+      VERIFY((is_substitution_failure<vschar, vi8<uint>>) );
+      VERIFY((is_substitution_failure<vschar, vi8<long>>) );
+      VERIFY((is_substitution_failure<vschar, vi8<ulong>>) );
+      VERIFY((is_substitution_failure<vschar, vi8<llong>>) );
+      VERIFY((is_substitution_failure<vschar, vi8<ullong>>) );
+      VERIFY((is_substitution_failure<vschar, vi8<float>>) );
+      VERIFY((is_substitution_failure<vschar, vi8<double>>) );
+      VERIFY((is_substitution_failure<vi8<schar>, vschar>) );
+      VERIFY((is_substitution_failure<vi8<schar>, vuchar>) );
+      VERIFY((is_substitution_failure<vi8<schar>, uchar>) );
+      VERIFY((is_substitution_failure<vi8<schar>, short>) );
+      VERIFY((is_substitution_failure<vi8<schar>, ushort>) );
+      VERIFY((is_substitution_failure<vi8<schar>, uint>) );
+      VERIFY((is_substitution_failure<vi8<schar>, long>) );
+      VERIFY((is_substitution_failure<vi8<schar>, ulong>) );
+      VERIFY((is_substitution_failure<vi8<schar>, ullong>) );
+      VERIFY((is_substitution_failure<vi8<schar>, float>) );
+      VERIFY((is_substitution_failure<vi8<schar>, vi8<uchar>>) );
+      VERIFY((is_substitution_failure<vi8<schar>, vi8<ushort>>) );
+      VERIFY((is_substitution_failure<vi8<schar>, vi8<uint>>) );
+      VERIFY((is_substitution_failure<vi8<schar>, vi8<ulong>>) );
+      VERIFY((is_substitution_failure<vi8<schar>, vi8<ullong>>) );
+    }
+  else if constexpr (std::is_same_v<V, vuchar>)
+    { //{{{2
+      VERIFY((is_substitution_failure<vi8<uchar>, llong>) );
+
+      binary_op_return_type<vuchar, uint, vuchar>();
+      binary_op_return_type<vi8<uchar>, uchar, vi8<uchar>>();
+      binary_op_return_type<vi8<uchar>, int, vi8<uchar>>();
+      binary_op_return_type<vi8<uchar>, uint, vi8<uchar>>();
+      binary_op_return_type<vi8<uchar>, vi8<uchar>, vi8<uchar>>();
+
+      if constexpr (vi8<schar>::size() <= simd_abi::max_fixed_size<short>)
+	{
+	  binary_op_return_type<vi8<short>, vi8<uchar>>();
+	  binary_op_return_type<vi8<ushort>, vi8<uchar>>();
+	  binary_op_return_type<vi8<int>, vi8<uchar>>();
+	  binary_op_return_type<vi8<uint>, vi8<uchar>>();
+	  binary_op_return_type<vi8<long>, vi8<uchar>>();
+	  binary_op_return_type<vi8<ulong>, vi8<uchar>>();
+	  binary_op_return_type<vi8<llong>, vi8<uchar>>();
+	  binary_op_return_type<vi8<ullong>, vi8<uchar>>();
+	  binary_op_return_type<vi8<float>, vi8<uchar>>();
+	  binary_op_return_type<vi8<double>, vi8<uchar>>();
+	}
+
+      VERIFY((is_substitution_failure<vi8<uchar>, ullong>) );
+      VERIFY((is_substitution_failure<vi8<uchar>, double>) );
+      VERIFY((is_substitution_failure<vuchar, schar>) );
+      VERIFY((is_substitution_failure<vuchar, vschar>) );
+      VERIFY((is_substitution_failure<vuchar, short>) );
+      VERIFY((is_substitution_failure<vuchar, ushort>) );
+      VERIFY((is_substitution_failure<vuchar, long>) );
+      VERIFY((is_substitution_failure<vuchar, ulong>) );
+      VERIFY((is_substitution_failure<vuchar, llong>) );
+      VERIFY((is_substitution_failure<vuchar, ullong>) );
+      VERIFY((is_substitution_failure<vuchar, float>) );
+      VERIFY((is_substitution_failure<vuchar, double>) );
+      VERIFY((is_substitution_failure<vuchar, vi8<schar>>) );
+      VERIFY((is_substitution_failure<vuchar, vi8<uchar>>) );
+      VERIFY((is_substitution_failure<vuchar, vi8<short>>) );
+      VERIFY((is_substitution_failure<vuchar, vi8<ushort>>) );
+      VERIFY((is_substitution_failure<vuchar, vi8<int>>) );
+      VERIFY((is_substitution_failure<vuchar, vi8<uint>>) );
+      VERIFY((is_substitution_failure<vuchar, vi8<long>>) );
+      VERIFY((is_substitution_failure<vuchar, vi8<ulong>>) );
+      VERIFY((is_substitution_failure<vuchar, vi8<llong>>) );
+      VERIFY((is_substitution_failure<vuchar, vi8<ullong>>) );
+      VERIFY((is_substitution_failure<vuchar, vi8<float>>) );
+      VERIFY((is_substitution_failure<vuchar, vi8<double>>) );
+      VERIFY((is_substitution_failure<vi8<uchar>, schar>) );
+      VERIFY((is_substitution_failure<vi8<uchar>, vschar>) );
+      VERIFY((is_substitution_failure<vi8<uchar>, vuchar>) );
+      VERIFY((is_substitution_failure<vi8<uchar>, short>) );
+      VERIFY((is_substitution_failure<vi8<uchar>, ushort>) );
+      VERIFY((is_substitution_failure<vi8<uchar>, long>) );
+      VERIFY((is_substitution_failure<vi8<uchar>, ulong>) );
+      VERIFY((is_substitution_failure<vi8<uchar>, float>) );
+      VERIFY((is_substitution_failure<vi8<uchar>, vi8<schar>>) );
+    } //}}}2
+}
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/operators.h b/libstdc++-v3/testsuite/experimental/simd/tests/operators.h
new file mode 100644
index 00000000000..2388bcfd166
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/operators.h
@@ -0,0 +1,285 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include "bits/verify.h"
+#include "bits/make_vec.h"
+
+// operators helpers  //{{{1
+template <class T>
+constexpr T
+genHalfBits()
+{
+  return std::numeric_limits<T>::max() >> (std::numeric_limits<T>::digits / 2);
+}
+template <>
+constexpr long double
+genHalfBits<long double>()
+{
+  return 0;
+}
+template <>
+constexpr double
+genHalfBits<double>()
+{
+  return 0;
+}
+template <>
+constexpr float
+genHalfBits<float>()
+{
+  return 0;
+}
+
+template <typename V>
+void
+test()
+{
+  using M = typename V::mask_type;
+  using T = typename V::value_type;
+  constexpr auto min = std::numeric_limits<T>::min();
+  constexpr auto max = std::numeric_limits<T>::max();
+  { // compares{{{2
+    COMPARE(V(0) == make_vec<V>({0, 1}, 0), make_mask<M>({1, 0}));
+    COMPARE(V(0) == make_vec<V>({0, 1, 2}, 0), make_mask<M>({1, 0, 0}));
+    COMPARE(V(1) == make_vec<V>({0, 1, 2}, 0), make_mask<M>({0, 1, 0}));
+    COMPARE(V(2) == make_vec<V>({0, 1, 2}, 0), make_mask<M>({0, 0, 1}));
+    COMPARE(V(0) < make_vec<V>({0, 1, 2}, 0), make_mask<M>({0, 1, 1}));
+
+    constexpr T half = genHalfBits<T>();
+    for (T lo_ : {min, T(min + 1), T(-1), T(0), T(1), T(half - 1), half,
+		  T(half + 1), T(max - 1)})
+      {
+	for (T hi_ : {T(min + 1), T(-1), T(0), T(1), T(half - 1), half,
+		      T(half + 1), T(max - 1), max})
+	  {
+	    if (hi_ <= lo_)
+	      {
+		continue;
+	      }
+	    for (std::size_t pos = 0; pos < V::size(); ++pos)
+	      {
+		V lo = lo_;
+		V hi = hi_;
+		lo[pos] = 0; // have a different value in the vector in case
+		hi[pos] = 1; // this affects neighbors
+		COMPARE(hi, hi);
+		VERIFY(all_of(hi != lo)) << "hi: " << hi << ", lo: " << lo;
+		VERIFY(all_of(lo != hi)) << "hi: " << hi << ", lo: " << lo;
+		VERIFY(none_of(hi != hi)) << "hi: " << hi << ", lo: " << lo;
+		VERIFY(none_of(hi == lo)) << "hi: " << hi << ", lo: " << lo;
+		VERIFY(none_of(lo == hi)) << "hi: " << hi << ", lo: " << lo;
+		VERIFY(all_of(lo < hi)) << "hi: " << hi << ", lo: " << lo
+					<< ", lo < hi: " << (lo < hi);
+		VERIFY(none_of(hi < lo)) << "hi: " << hi << ", lo: " << lo;
+		VERIFY(none_of(hi <= lo)) << "hi: " << hi << ", lo: " << lo;
+		VERIFY(all_of(hi <= hi)) << "hi: " << hi << ", lo: " << lo;
+		VERIFY(all_of(hi > lo)) << "hi: " << hi << ", lo: " << lo;
+		VERIFY(none_of(lo > hi)) << "hi: " << hi << ", lo: " << lo;
+		VERIFY(all_of(hi >= lo)) << "hi: " << hi << ", lo: " << lo;
+		VERIFY(all_of(hi >= hi)) << "hi: " << hi << ", lo: " << lo;
+	      }
+	  }
+      }
+  }
+  { // subscripting{{{2
+    V x = max;
+    for (std::size_t i = 0; i < V::size(); ++i)
+      {
+	COMPARE(x[i], max);
+	x[i] = 0;
+      }
+    COMPARE(x, V{0});
+    for (std::size_t i = 0; i < V::size(); ++i)
+      {
+	COMPARE(x[i], T(0));
+	x[i] = max;
+      }
+    COMPARE(x, V{max});
+    COMPARE(typeid(x[0] * x[0]), typeid(T() * T()));
+    COMPARE(typeid(x[0] * T()), typeid(T() * T()));
+    COMPARE(typeid(T() * x[0]), typeid(T() * T()));
+    COMPARE(typeid(x * x[0]), typeid(x));
+    COMPARE(typeid(x[0] * x), typeid(x));
+
+    x = V([](auto i) -> T { return i; });
+    for (std::size_t i = 0; i < V::size(); ++i)
+      {
+	COMPARE(x[i], T(i));
+      }
+    for (std::size_t i = 0; i + 1 < V::size(); i += 2)
+      {
+	using std::swap;
+	swap(x[i], x[i + 1]);
+      }
+    for (std::size_t i = 0; i + 1 < V::size(); i += 2)
+      {
+	COMPARE(x[i], T(i + 1)) << x;
+	COMPARE(x[i + 1], T(i)) << x;
+      }
+    x = 1;
+    V y = 0;
+    COMPARE(x[0], T(1));
+    x[0] = y[0]; // make sure non-const smart_reference assignment works
+    COMPARE(x[0], T(0));
+    x = 1;
+    x[0] = x[0]; // self-assignment on smart_reference
+    COMPARE(x[0], T(1));
+
+    std::experimental::simd<typename V::value_type,
+			    std::experimental::simd_abi::scalar>
+      z = 2;
+    x[0] = z[0];
+    COMPARE(x[0], T(2));
+    x = 3;
+    z[0] = x[0];
+    COMPARE(z[0], T(3));
+
+    // TODO: check that only value-preserving conversions happen on subscript
+    // assignment
+  }
+  { // not{{{2
+    V x = 0;
+    COMPARE(!x, M{true});
+    V y = 1;
+    COMPARE(!y, M{false});
+  }
+
+  { // unary minus{{{2
+    V x = 0;
+    COMPARE(-x, V(T(-T(0))));
+    V y = 1;
+    COMPARE(-y, V(T(-T(1))));
+  }
+
+  { // plus{{{2
+    V x = 0;
+    V y = 0;
+    COMPARE(x + y, x);
+    COMPARE(x = x + T(1), V(1));
+    COMPARE(x + x, V(2));
+    y = make_vec<V>({1, 2, 3, 4, 5, 6, 7});
+    COMPARE(x = x + y, make_vec<V>({2, 3, 4, 5, 6, 7, 8}));
+    COMPARE(x = x + -y, V(1));
+    COMPARE(x += y, make_vec<V>({2, 3, 4, 5, 6, 7, 8}));
+    COMPARE(x, make_vec<V>({2, 3, 4, 5, 6, 7, 8}));
+    COMPARE(x += -y, V(1));
+    COMPARE(x, V(1));
+  }
+
+  { // minus{{{2
+    V x = 1;
+    V y = 0;
+    COMPARE(x - y, x);
+    COMPARE(x - T(1), y);
+    COMPARE(y, x - T(1));
+    COMPARE(x - x, y);
+    y = make_vec<V>({1, 2, 3, 4, 5, 6, 7});
+    COMPARE(x = y - x, make_vec<V>({0, 1, 2, 3, 4, 5, 6}));
+    COMPARE(x = y - x, V(1));
+    COMPARE(y -= x, make_vec<V>({0, 1, 2, 3, 4, 5, 6}));
+    COMPARE(y, make_vec<V>({0, 1, 2, 3, 4, 5, 6}));
+    COMPARE(y -= y, V(0));
+    COMPARE(y, V(0));
+  }
+
+  { // multiplies{{{2
+    V x = 1;
+    V y = 0;
+    COMPARE(x * y, y);
+    COMPARE(x = x * T(2), V(2));
+    COMPARE(x * x, V(4));
+    y = make_vec<V>({1, 2, 3, 4, 5, 6, 7});
+    COMPARE(x = x * y, make_vec<V>({2, 4, 6, 8, 10, 12, 14}));
+    y = 2;
+    for (T n :
+	 {T(std::numeric_limits<T>::max() - 1), std::numeric_limits<T>::min()})
+      {
+	x = n / 2;
+	COMPARE(x * y, V(n));
+      }
+    if (std::is_integral<T>::value && std::is_unsigned<T>::value)
+      {
+	// test modular arithmetic
+	T n = std::numeric_limits<T>::max();
+	x = n;
+	for (T m : {T(2), T(7), T(std::numeric_limits<T>::max() / 127),
+		    std::numeric_limits<T>::max()})
+	  {
+	    y = m;
+	    // If T has lower rank than int, `n * m` is promoted to int before
+	    // the multiplication, and a signed overflow there is UB (ubsan
+	    // warns about it). Convert to unsigned first when T is narrower
+	    // than int, so that the product wraps instead of overflowing.
+	    using U
+	      = std::conditional_t<(sizeof(T) < sizeof(int)), unsigned, T>;
+	    COMPARE(x * y, V(T(U(n) * U(m))));
+	  }
+      }
+    x = 2;
+    COMPARE(x *= make_vec<V>({1, 2, 3}), make_vec<V>({2, 4, 6}));
+    COMPARE(x, make_vec<V>({2, 4, 6}));
+  }
+
+  { // divides{{{2
+    V x = 2;
+    COMPARE(x / x, V(1));
+    COMPARE(T(3) / x, V(T(3) / T(2)));
+    COMPARE(x / T(3), V(T(2) / T(3)));
+    V y = make_vec<V>({1, 2, 3, 4, 5, 6, 7});
+    COMPARE(y / x,
+	    make_vec<V>({T(.5), T(1), T(1.5), T(2), T(2.5), T(3), T(3.5)}));
+
+    y = make_vec<V>(
+      {std::numeric_limits<T>::max(), std::numeric_limits<T>::min()});
+    V ref = make_vec<V>({T(std::numeric_limits<T>::max() / 2),
+			 T(std::numeric_limits<T>::min() / 2)});
+    COMPARE(y / x, ref);
+
+    y = make_vec<V>(
+      {std::numeric_limits<T>::min(), std::numeric_limits<T>::max()});
+    ref = make_vec<V>({T(std::numeric_limits<T>::min() / 2),
+		       T(std::numeric_limits<T>::max() / 2)});
+    COMPARE(y / x, ref);
+
+    y = make_vec<V>(
+      {std::numeric_limits<T>::max(), T(std::numeric_limits<T>::min() + 1)});
+    COMPARE(y / y, V(1));
+
+    ref = make_vec<V>({T(2 / std::numeric_limits<T>::max()),
+		       T(2 / (std::numeric_limits<T>::min() + 1))});
+    COMPARE(x / y, ref);
+    COMPARE(x /= y, ref);
+    COMPARE(x, ref);
+  }
+
+  { // increment & decrement {{{2
+    const V from0 = make_vec<V>({0, 1, 2, 3}, 4);
+    V x = from0;
+    COMPARE(x++, from0);
+    COMPARE(x, from0 + 1);
+    COMPARE(++x, from0 + 2);
+    COMPARE(x, from0 + 2);
+
+    COMPARE(x--, from0 + 2);
+    COMPARE(x, from0 + 1);
+    COMPARE(--x, from0);
+    COMPARE(x, from0);
+  }
+  // }}}2
+}
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/reductions.h b/libstdc++-v3/testsuite/experimental/simd/tests/reductions.h
new file mode 100644
index 00000000000..e367b692201
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/reductions.h
@@ -0,0 +1,82 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include <random>
+
+static std::mt19937 g_mt_gen{0};
+template <typename V>
+void
+test()
+{
+  using T = typename V::value_type;
+  COMPARE(reduce(V(1)), T(V::size()));
+  {
+    V x = 1;
+    COMPARE(reduce(x, std::multiplies<>()), T(1));
+    x[0] = 2;
+    COMPARE(reduce(x, std::multiplies<>()), T(2));
+    if constexpr (V::size() > 1)
+      {
+	x[V::size() - 1] = 3;
+	COMPARE(reduce(x, std::multiplies<>()), T(6));
+      }
+  }
+  COMPARE(reduce(V([](int i) { return i & 1; })), T(V::size() / 2));
+  COMPARE(reduce(V([](int i) { return i % 3; })),
+	  T(3 * (V::size() / 3)   // 0+1+2 for every complete 3 elements in V
+	    + (V::size() % 3) / 2 // 0->0, 1->0, 2->1 adjustment
+	    ));
+  if ((1 + V::size()) * V::size() / 2 <= std::numeric_limits<T>::max())
+    {
+      COMPARE(reduce(V([](int i) { return i + 1; })),
+	      T((1 + V::size()) * V::size() / 2));
+    }
+
+  {
+    const V y = 2;
+    COMPARE(reduce(y), T(2 * V::size()));
+    COMPARE(reduce(where(y > 2, y)), T(0));
+    COMPARE(reduce(where(y == 2, y)), T(2 * V::size()));
+  }
+
+  {
+    const V z([](T i) { return i + 1; });
+    COMPARE(std::experimental::reduce(z,
+				      [](auto a, auto b) {
+					using std::min;
+					return min(a, b);
+				      }),
+	    T(1))
+      << "z: " << z;
+    COMPARE(std::experimental::reduce(z,
+				      [](auto a, auto b) {
+					using std::max;
+					return max(a, b);
+				      }),
+	    T(V::size()))
+      << "z: " << z;
+    COMPARE(std::experimental::reduce(where(z > 1, z), 117,
+				      [](auto a, auto b) {
+					using std::min;
+					return min(a, b);
+				      }),
+	    T(V::size() == 1 ? 117 : 2))
+      << "z: " << z;
+  }
+
+  {
+    std::conditional_t<std::is_floating_point_v<T>,
+		       std::uniform_real_distribution<T>,
+		       std::uniform_int_distribution<T>>
+      dist(std::numeric_limits<T>::lowest(), std::numeric_limits<T>::max());
+    for (int repeat = 0; repeat < 100; ++repeat)
+      {
+	const V x([&](int) { return dist(g_mt_gen); });
+	T acc = x[0];
+	for (size_t i = 1; i < V::size(); ++i)
+	  acc += x[i];
+	FUZZY_COMPARE(reduce(x), acc);
+      }
+  }
+}
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/remqo.h b/libstdc++-v3/testsuite/experimental/simd/tests/remqo.h
new file mode 100644
index 00000000000..5cac268808d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/remqo.h
@@ -0,0 +1,48 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+  vir::test::setFuzzyness<float>(0);
+  vir::test::setFuzzyness<double>(0);
+
+  using limits = std::numeric_limits<typename V::value_type>;
+  test_values_2arg<V>(
+    {
+#ifdef __STDC_IEC_559__
+      limits::quiet_NaN(), limits::infinity(), -limits::infinity(),
+      limits::denorm_min(), limits::min() / 3, -0.,
+#endif
+      +0., limits::min(), limits::max()},
+    {10000, -limits::max() / 2, limits::max() / 2}, [](const V a, const V b) {
+      using IV = std::experimental::fixed_size_simd<int, V::size()>;
+      IV quo = {}; // the type is wrong, this should fail
+      const V totest = remquo(a, b, &quo);
+      auto&& expected
+	= [&](const auto& v, const auto& w) -> std::pair<const V, const IV> {
+	std::pair<V, IV> tmp = {};
+	using std::remquo;
+	for (std::size_t i = 0; i < V::size(); ++i)
+	  {
+	    int tmp2;
+	    tmp.first[i] = remquo(v[i], w[i], &tmp2);
+	    tmp.second[i] = tmp2;
+	  }
+	return tmp;
+      };
+      const auto expect1 = expected(a, b);
+      COMPARE(isnan(totest), isnan(expect1.first))
+	<< "remquo(" << a << ", " << b << ", quo) = " << totest
+	<< " != " << expect1.first;
+      const V clean_a = iif(isnan(totest), 0, a);
+      const V clean_b = iif(isnan(totest), 1, b);
+      const auto expect2 = expected(clean_a, clean_b);
+      COMPARE(remquo(clean_a, clean_b, &quo), expect2.first)
+	<< "\nclean_a/b = " << clean_a << ", " << clean_b;
+      COMPARE(quo, expect2.second);
+    });
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/simd.h b/libstdc++-v3/testsuite/experimental/simd/tests/simd.h
new file mode 100644
index 00000000000..87b6815f832
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/simd.h
@@ -0,0 +1,22 @@
+#include "bits/verify.h"
+
+template <typename V>
+void
+test()
+{
+  using T = typename V::value_type;
+
+  // V must store V::size() values of type T, which gives the lower bound on
+  // sizeof(V)
+  VERIFY(sizeof(V) >= sizeof(T) * V::size());
+
+  // V should not pad beyond the next power of 2 of V::size() values of
+  // type T, which gives the upper bound on sizeof(V)
+  auto n = V::size();
+  n = ((n << 1) & ~n) & ~((n >> 1) | (n >> 3));
+  while (n & (n - 1))
+    {
+      n &= n - 1;
+    }
+  VERIFY(sizeof(V) <= sizeof(T) * n);
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/sincos.h b/libstdc++-v3/testsuite/experimental/simd/tests/sincos.h
new file mode 100644
index 00000000000..87b7a505d51
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/sincos.h
@@ -0,0 +1,31 @@
+// test only floattypes
+// { dg-additional-files "reference-sincos-sp.dat" }
+// { dg-additional-files "reference-sincos-ep.dat" }
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/mathreference.h"
+#include "bits/simd_view.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+  using std::cos;
+  using std::sin;
+  using T = typename V::value_type;
+
+  vir::test::setFuzzyness<float>(2);
+  vir::test::setFuzzyness<double>(1);
+
+  const auto& testdata = referenceData<function::sincos, T>();
+  std::experimental::experimental::simd_view<V>(testdata).for_each(
+    [&](const V input, const V expected_sin, const V expected_cos) {
+      FUZZY_COMPARE(sin(input), expected_sin) << " input = " << input;
+      FUZZY_COMPARE(sin(-input), -expected_sin) << " input = " << input;
+      FUZZY_COMPARE(cos(input), expected_cos) << " input = " << input;
+      FUZZY_COMPARE(cos(-input), expected_cos) << " input = " << input;
+    });
+}
+
+// vim: sw=2 sts=2 noet ts=8
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/split_concat.h b/libstdc++-v3/testsuite/experimental/simd/tests/split_concat.h
new file mode 100644
index 00000000000..ea15c7ff1f9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/split_concat.h
@@ -0,0 +1,168 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/conversions.h"
+
+using std::experimental::simd_cast;
+
+template <typename V, bool ConstProp, typename F>
+auto
+gen(const F& fun)
+{
+  if constexpr (ConstProp)
+    return V(fun);
+  else
+    return make_value_unknown(V(fun));
+}
+
+template <typename V, bool ConstProp>
+void
+split_concat()
+{
+  using T = typename V::value_type;
+  if constexpr (V::size() * 3 <= std::experimental::simd_abi::max_fixed_size<T>)
+    {
+      V a(0), b(1), c(2);
+      auto x = concat(a, b, c);
+      COMPARE(x.size(), a.size() * 3);
+      std::size_t i = 0;
+      for (; i < a.size(); ++i)
+	{
+	  COMPARE(x[i], T(0));
+	}
+      for (; i < 2 * a.size(); ++i)
+	{
+	  COMPARE(x[i], T(1));
+	}
+      for (; i < 3 * a.size(); ++i)
+	{
+	  COMPARE(x[i], T(2));
+	}
+    }
+
+  if constexpr (V::size() >= 4)
+    {
+      const V a = gen<V, ConstProp>([](auto i) -> T { return i; });
+      constexpr auto N0 = V::size() / 4u;
+      constexpr auto N1 = V::size() - 2 * N0;
+      using V0
+	= std::experimental::simd<T,
+				  std::experimental::simd_abi::deduce_t<T, N0>>;
+      using V1
+	= std::experimental::simd<T,
+				  std::experimental::simd_abi::deduce_t<T, N1>>;
+      {
+	auto x = std::experimental::split<N0, N0, N1>(a);
+	COMPARE(std::tuple_size<decltype(x)>::value, 3u);
+	COMPARE(std::get<0>(x), V0([](auto i) -> T { return i; }));
+	COMPARE(std::get<1>(x), V0([](auto i) -> T { return i + N0; }));
+	COMPARE(std::get<2>(x), V1([](auto i) -> T { return i + 2 * N0; }));
+	auto b = concat(std::get<1>(x), std::get<2>(x), std::get<0>(x));
+	// a and b may have different types: if a is fixed_size<N> and
+	// another ABI tag with equal N exists, then b has the
+	// non-fixed-size ABI tag.
+	COMPARE(a.size(), b.size());
+	COMPARE(b,
+		decltype(b)([](auto i) -> T { return (N0 + i) % V::size(); }));
+      }
+      {
+	auto x = std::experimental::split<N0, N1, N0>(a);
+	COMPARE(std::tuple_size<decltype(x)>::value, 3u);
+	COMPARE(std::get<0>(x), V0([](auto i) -> T { return i; }));
+	COMPARE(std::get<1>(x), V1([](auto i) -> T { return i + N0; }));
+	COMPARE(std::get<2>(x), V0([](auto i) -> T { return i + N0 + N1; }));
+	auto b = concat(std::get<1>(x), std::get<2>(x), std::get<0>(x));
+	// a and b may have different types: if a is fixed_size<N> and
+	// another ABI tag with equal N exists, then b has the
+	// non-fixed-size ABI tag.
+	COMPARE(a.size(), b.size());
+	COMPARE(b,
+		decltype(b)([](auto i) -> T { return (N0 + i) % V::size(); }));
+      }
+      {
+	auto x = std::experimental::split<N1, N0, N0>(a);
+	COMPARE(std::tuple_size<decltype(x)>::value, 3u);
+	COMPARE(std::get<0>(x), V1([](auto i) -> T { return i; }));
+	COMPARE(std::get<1>(x), V0([](auto i) -> T { return i + N1; }));
+	COMPARE(std::get<2>(x), V0([](auto i) -> T { return i + N0 + N1; }));
+	auto b = concat(std::get<1>(x), std::get<2>(x), std::get<0>(x));
+	// a and b may have different types: if a is fixed_size<N> and
+	// another ABI tag with equal N exists, then b has the
+	// non-fixed-size ABI tag.
+	COMPARE(a.size(), b.size());
+	COMPARE(b,
+		decltype(b)([](auto i) -> T { return (N1 + i) % V::size(); }));
+      }
+    }
+
+  if constexpr (V::size() % 3 == 0)
+    {
+      const V a = gen<V, ConstProp>([](auto i) -> T { return i; });
+      constexpr auto N0 = V::size() / 3;
+      using V0
+	= std::experimental::simd<T,
+				  std::experimental::simd_abi::deduce_t<T, N0>>;
+      using V1 = std::experimental::simd<
+	T, std::experimental::simd_abi::deduce_t<T, 2 * N0>>;
+      {
+	auto [x, y, z] = std::experimental::split<N0, N0, N0>(a);
+	COMPARE(x, V0([](auto i) -> T { return i; }));
+	COMPARE(y, V0([](auto i) -> T { return i + N0; }));
+	COMPARE(z, V0([](auto i) -> T { return i + N0 * 2; }));
+	auto b = concat(x, y, z);
+	COMPARE(a.size(), b.size());
+	COMPARE(b, simd_cast<decltype(b)>(a));
+	COMPARE(simd_cast<V>(b), a);
+      }
+      {
+	auto [x, y] = std::experimental::split<N0, 2 * N0>(a);
+	COMPARE(x, V0([](auto i) -> T { return i; }));
+	COMPARE(y, V1([](auto i) -> T { return i + N0; }));
+	auto b = concat(x, y);
+	COMPARE(a.size(), b.size());
+	COMPARE(b, simd_cast<decltype(b)>(a));
+	COMPARE(simd_cast<V>(b), a);
+      }
+      {
+	auto [x, y] = std::experimental::split<2 * N0, N0>(a);
+	COMPARE(x, V1([](auto i) -> T { return i; }));
+	COMPARE(y, V0([](auto i) -> T { return i + 2 * N0; }));
+	auto b = concat(x, y);
+	COMPARE(a.size(), b.size());
+	COMPARE(b, simd_cast<decltype(b)>(a));
+	COMPARE(simd_cast<V>(b), a);
+      }
+    }
+
+  if constexpr ((V::size() & 1) == 0)
+    {
+      using std::experimental::simd;
+      using std::experimental::simd_abi::deduce_t;
+      using V0 = simd<T, deduce_t<T, V::size()>>;
+      using V2 = simd<T, deduce_t<T, 2>>;
+      using V3 = simd<T, deduce_t<T, V::size() / 2>>;
+
+      const V a = gen<V, ConstProp>([](auto i) -> T { return i; });
+
+      std::array<V2, V::size() / 2> v2s = std::experimental::split<V2>(a);
+      int offset = 0;
+      for (V2 test : v2s)
+	{
+	  COMPARE(test, V2([&](auto i) -> T { return i + offset; }));
+	  offset += 2;
+	}
+      COMPARE(concat(v2s), simd_cast<V0>(a));
+
+      std::array<V3, 2> v3s = std::experimental::split<V3>(a);
+      COMPARE(v3s[0], V3([](auto i) -> T { return i; }));
+      COMPARE(v3s[1], V3([](auto i) -> T { return i + V3::size(); }));
+      COMPARE(concat(v3s), simd_cast<V0>(a));
+    }
+}
+
+template <typename V>
+void
+test()
+{
+  split_concat<V, true>();
+  split_concat<V, false>();
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/splits.h b/libstdc++-v3/testsuite/experimental/simd/tests/splits.h
new file mode 100644
index 00000000000..2b8c03bbcdc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/splits.h
@@ -0,0 +1,21 @@
+#include "bits/verify.h"
+
+template <typename V>
+void
+test()
+{
+  using M = typename V::mask_type;
+  using namespace std::experimental::parallelism_v2;
+  using T = typename V::value_type;
+  if constexpr (V::size() / simd_size_v<T> * simd_size_v<T> == V::size())
+    {
+      M k(true);
+      VERIFY(all_of(k)) << k;
+      const auto parts = split<simd_mask<T>>(k);
+      for (auto k2 : parts)
+	{
+	  VERIFY(all_of(k2)) << k2;
+	  COMPARE(typeid(k2), typeid(simd_mask<T>));
+	}
+    }
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/trigonometric.h b/libstdc++-v3/testsuite/experimental/simd/tests/trigonometric.h
new file mode 100644
index 00000000000..b137a6bba49
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/trigonometric.h
@@ -0,0 +1,25 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+  vir::test::setFuzzyness<float>(1);
+  vir::test::setFuzzyness<double>(1);
+
+  using limits = std::numeric_limits<typename V::value_type>;
+  test_values<V>(
+    {
+#ifdef __STDC_IEC_559__
+      limits::quiet_NaN(), limits::infinity(), -limits::infinity(), -0.,
+      limits::denorm_min(), limits::min() / 3,
+#endif
+      +0., limits::min(), limits::max()},
+    {10000, -limits::max() / 2, limits::max() / 2}, MAKE_TESTER(acos),
+    MAKE_TESTER(tan), MAKE_TESTER(acosh), MAKE_TESTER(asinh),
+    MAKE_TESTER(atanh), MAKE_TESTER(cosh), MAKE_TESTER(sinh),
+    MAKE_TESTER(tanh));
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/trunc_ceil_floor.h b/libstdc++-v3/testsuite/experimental/simd/tests/trunc_ceil_floor.h
new file mode 100644
index 00000000000..4a65606c6e5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/trunc_ceil_floor.h
@@ -0,0 +1,88 @@
+// test only floattypes
+#include "bits/test_values.h"
+#include "bits/verify.h"
+
+template <typename V>
+void
+test()
+{
+  using limits = std::numeric_limits<typename V::value_type>;
+  test_values<V>(
+    {2.1,
+     2.0,
+     2.9,
+     2.5,
+     2.499,
+     1.5,
+     1.499,
+     1.99,
+     0.99,
+     0.5,
+     0.499,
+     0.,
+     -2.1,
+     -2.0,
+     -2.9,
+     -2.5,
+     -2.499,
+     -1.5,
+     -1.499,
+     -1.99,
+     -0.99,
+     -0.5,
+     -0.499,
+     3 << 21,
+     3 << 22,
+     3 << 23,
+     -(3 << 21),
+     -(3 << 22),
+     -(3 << 23),
+#ifdef __STDC_IEC_559__
+     -0.,
+     limits::infinity(),
+     -limits::infinity(),
+     limits::denorm_min(),
+     limits::min() * 0.9,
+     -limits::denorm_min(),
+     -limits::min() * 0.9,
+#endif
+     limits::max(),
+     limits::min(),
+     limits::lowest(),
+     -limits::max(),
+     -limits::min(),
+     -limits::lowest()},
+    [](const V input) {
+      const V expected([&](auto i) { return std::trunc(input[i]); });
+      COMPARE(trunc(input), expected) << input;
+    },
+    [](const V input) {
+      const V expected([&](auto i) { return std::ceil(input[i]); });
+      COMPARE(ceil(input), expected) << input;
+    },
+    [](const V input) {
+      const V expected([&](auto i) { return std::floor(input[i]); });
+      COMPARE(floor(input), expected) << input;
+    });
+
+#ifdef __STDC_IEC_559__
+  test_values<V>(
+    {
+#ifdef __SUPPORT_SNAN__
+      limits::signaling_NaN(),
+#endif
+      limits::quiet_NaN()},
+    [](const V input) {
+      const V expected([&](auto i) { return std::trunc(input[i]); });
+      COMPARE(isnan(trunc(input)), isnan(expected)) << input;
+    },
+    [](const V input) {
+      const V expected([&](auto i) { return std::ceil(input[i]); });
+      COMPARE(isnan(ceil(input)), isnan(expected)) << input;
+    },
+    [](const V input) {
+      const V expected([&](auto i) { return std::floor(input[i]); });
+      COMPARE(isnan(floor(input)), isnan(expected)) << input;
+    });
+#endif
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/where.h b/libstdc++-v3/testsuite/experimental/simd/tests/where.h
new file mode 100644
index 00000000000..c502fa9a89e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/where.h
@@ -0,0 +1,108 @@
+#include "bits/verify.h"
+#include "bits/make_vec.h"
+#include "bits/metahelpers.h"
+
+template <class V> struct Convertible
+{
+  operator V() const { return V(4); }
+};
+
+template <class M, class T>
+constexpr bool
+where_is_ill_formed_impl(M, const T&, float)
+{
+  return true;
+}
+template <class M, class T>
+constexpr auto
+where_is_ill_formed_impl(M m, const T& v, int)
+  -> std::conditional_t<true, bool, decltype(std::experimental::where(m, v))>
+{
+  return false;
+}
+
+template <class M, class T>
+constexpr bool
+where_is_ill_formed(M m, const T& v)
+{
+  return where_is_ill_formed_impl(m, v, int());
+}
+
+template <typename T>
+void
+where_fundamental()
+{
+  using std::experimental::where;
+  T x = T();
+  where(true, x) = x + 1;
+  COMPARE(x, T(1));
+  where(false, x) = x - 1;
+  COMPARE(x, T(1));
+  where(true, x) += T(1);
+  COMPARE(x, T(2));
+}
+
+template <typename V>
+void
+test()
+{
+  using M = typename V::mask_type;
+  using T = typename V::value_type;
+  where_fundamental<T>();
+  VERIFY(!(sfinae_is_callable<V>(
+    [](auto x) -> decltype(where(true, x))* { return nullptr; })));
+
+  const V indexes([](int i) { return i + 1; });
+  const M alternating_mask = make_mask<M>({true, false});
+  V x = 0;
+  where(alternating_mask, x) = indexes;
+  COMPARE(alternating_mask, x == indexes);
+
+  where(!alternating_mask, x) = T(2);
+  COMPARE(!alternating_mask, x == T(2)) << x;
+
+  where(!alternating_mask, x) = Convertible<V>();
+  COMPARE(!alternating_mask, x == T(4));
+
+  x = 0;
+  COMPARE(x, T(0));
+  where(alternating_mask, x) += indexes;
+  COMPARE(alternating_mask, x == indexes);
+
+  x = 10;
+  COMPARE(x, T(10));
+  where(!alternating_mask, x) += T(1);
+  COMPARE(!alternating_mask, x == T(11));
+  where(alternating_mask, x) -= Convertible<V>();
+  COMPARE(alternating_mask, x == T(6));
+  where(alternating_mask, x) /= T(2);
+  COMPARE(alternating_mask, x == T(3)) << x;
+  where(alternating_mask, x) *= T(3);
+  COMPARE(alternating_mask, x == T(9));
+  COMPARE(!alternating_mask, x == T(11));
+
+  x = 10;
+  where(alternating_mask, x)++;
+  COMPARE(alternating_mask, x == T(11));
+  ++where(alternating_mask, x);
+  COMPARE(alternating_mask, x == T(12));
+  where(alternating_mask, x)--;
+  COMPARE(alternating_mask, x == T(11));
+  --where(alternating_mask, x);
+  --where(alternating_mask, x);
+  COMPARE(alternating_mask, x == T(9));
+  COMPARE(alternating_mask, -where(alternating_mask, x) == T(-T(9)));
+
+  const auto y = x;
+  VERIFY(where_is_ill_formed(true, y));
+  VERIFY(where_is_ill_formed(true, x));
+  VERIFY(where_is_ill_formed(true, V(x)));
+
+  M test = alternating_mask;
+  where(alternating_mask, test) = M(true);
+  COMPARE(test, alternating_mask);
+  where(alternating_mask, test) = M(false);
+  COMPARE(test, M(false));
+  where(alternating_mask, test) = M(true);
+  COMPARE(test, alternating_mask);
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-double-constexpr.cc
new file mode 100644
index 00000000000..b062117b78e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/trigonometric.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-double-fixed_size.cc
new file mode 100644
index 00000000000..3de35f26667
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/trigonometric.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-double.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-double.cc
new file mode 100644
index 00000000000..3b3b44678ee
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/trigonometric.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-float-constexpr.cc
new file mode 100644
index 00000000000..2c297aaeb48
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/trigonometric.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-float-fixed_size.cc
new file mode 100644
index 00000000000..27f56c929d7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/trigonometric.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-float.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-float.cc
new file mode 100644
index 00000000000..d9b0e07e3ca
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/trigonometric.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double-constexpr.cc
new file mode 100644
index 00000000000..b15a7a58244
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/trigonometric.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double-fixed_size.cc
new file mode 100644
index 00000000000..2f40098232f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/trigonometric.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double.cc
new file mode 100644
index 00000000000..d231dd3742f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/trigonometric.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double-constexpr.cc
new file mode 100644
index 00000000000..173aca4e406
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double-fixed_size.cc
new file mode 100644
index 00000000000..9263aff80d4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double.cc
new file mode 100644
index 00000000000..4fd5be6ff3a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float-constexpr.cc
new file mode 100644
index 00000000000..4548e25a634
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float-fixed_size.cc
new file mode 100644
index 00000000000..27aa8d8263d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float.cc
new file mode 100644
index 00000000000..2fd1f5ff24d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double-constexpr.cc
new file mode 100644
index 00000000000..a1fc0f60fce
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double-fixed_size.cc
new file mode 100644
index 00000000000..8fea4cd5894
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double.cc
new file mode 100644
index 00000000000..8ce8b9cc9e6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-char-constexpr.cc
new file mode 100644
index 00000000000..0af0734bbc1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-char-fixed_size.cc
new file mode 100644
index 00000000000..56c695c5957
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char.cc b/libstdc++-v3/testsuite/experimental/simd/where-char.cc
new file mode 100644
index 00000000000..a5e0e2a89a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-char16_t-constexpr.cc
new file mode 100644
index 00000000000..02902b72841
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..ca286bbc363
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/where-char16_t.cc
new file mode 100644
index 00000000000..53b51f17b04
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<char16_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-char32_t-constexpr.cc
new file mode 100644
index 00000000000..4f4f6aa7d4f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..11d12a709fb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/where-char32_t.cc
new file mode 100644
index 00000000000..3771d9a50fb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<char32_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-double-constexpr.cc
new file mode 100644
index 00000000000..aeb094b5ced
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-double-fixed_size.cc
new file mode 100644
index 00000000000..34348c2144f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-double.cc b/libstdc++-v3/testsuite/experimental/simd/where-double.cc
new file mode 100644
index 00000000000..bb6a54368e2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-float-constexpr.cc
new file mode 100644
index 00000000000..e0ea665a6a4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-float-fixed_size.cc
new file mode 100644
index 00000000000..3c95886ebb4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-float.cc b/libstdc++-v3/testsuite/experimental/simd/where-float.cc
new file mode 100644
index 00000000000..ffb1ddb6005
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<float>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-int-constexpr.cc
new file mode 100644
index 00000000000..a94f8bedfe1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-int-fixed_size.cc
new file mode 100644
index 00000000000..5c3e1ccdb93
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-int.cc b/libstdc++-v3/testsuite/experimental/simd/where-int.cc
new file mode 100644
index 00000000000..79c6896f7f2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-long-constexpr.cc
new file mode 100644
index 00000000000..cf381cc7e27
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-long-fixed_size.cc
new file mode 100644
index 00000000000..7b82702c58a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long.cc b/libstdc++-v3/testsuite/experimental/simd/where-long.cc
new file mode 100644
index 00000000000..9bae1730067
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-long_double-constexpr.cc
new file mode 100644
index 00000000000..cd22a01642e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-long_double-fixed_size.cc
new file mode 100644
index 00000000000..b0226f3fa6e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/where-long_double.cc
new file mode 100644
index 00000000000..367de972d31
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<long double>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-long_long-constexpr.cc
new file mode 100644
index 00000000000..6113181ed29
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-long_long-fixed_size.cc
new file mode 100644
index 00000000000..3ed4c6cab9f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/where-long_long.cc
new file mode 100644
index 00000000000..1b6cd741d0f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-short-constexpr.cc
new file mode 100644
index 00000000000..51fda21394a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-short-fixed_size.cc
new file mode 100644
index 00000000000..9a437df485d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-short.cc b/libstdc++-v3/testsuite/experimental/simd/where-short.cc
new file mode 100644
index 00000000000..54d0e5b5dde
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-signed_char-constexpr.cc
new file mode 100644
index 00000000000..23d57151ad2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..d2126eeb612
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/where-signed_char.cc
new file mode 100644
index 00000000000..d671e6f2523
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<signed char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..724751ff278
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..468bb34a8ee
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char.cc
new file mode 100644
index 00000000000..fb063f44160
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned char>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..af40c8b99f5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..5588269f066
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int.cc
new file mode 100644
index 00000000000..33f4c289a70
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned int>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..5519953f1bf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..bd4d64738d1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long.cc
new file mode 100644
index 00000000000..542caf46e27
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..5cfa099626f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..95e5ae020d7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long.cc
new file mode 100644
index 00000000000..3b7d60b4fb0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned long long>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..763528d5acd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..2dac8828348
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short.cc
new file mode 100644
index 00000000000..f83c61d2091
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<unsigned short>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..485c6a7d11e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..1aa7a8f3b1a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/where-wchar_t.cc
new file mode 100644
index 00000000000..07f879bb5ed
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+  iterate_abis<wchar_t>();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp b/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp
index 11fdc8d340b..90264f5bfa9 100644
--- a/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp
+++ b/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp
@@ -89,12 +89,14 @@ if {[info exists tests_file] && [file exists $tests_file]} {
 	    # 3. wchar_t tests, if not supported.
 	    # 4. thread tests, if not supported. 
 	    # 5. *_filebuf, if file I/O is not supported.
+	    # 6. simd tests.
 	    if { [string first _xin $t] == -1
 		 && [string first performance $t] == -1
 		 && (${v3-wchar_t} || [string first wchar_t $t] == -1) 
 		 && (${v3-threads} || [string first thread $t] == -1)  
 		 && ([string first "_filebuf" $t] == -1
-		     || [check_v3_target_fileio]) } {
+		     || [check_v3_target_fileio])
+		 && [string first "/experimental/simd/" $t] == -1 } {
 		lappend tests $t
 	    }
 	}
@@ -107,5 +109,19 @@ global DEFAULT_CXXFLAGS
 global PCH_CXXFLAGS
 dg-runtest $tests "" "$DEFAULT_CXXFLAGS $PCH_CXXFLAGS"
 
+# Finally run simd tests with extra SIMD-relevant flags
+global DEFAULT_VECTCFLAGS
+global EFFECTIVE_TARGETS
+set DEFAULT_VECTCFLAGS ""
+set EFFECTIVE_TARGETS ""
+
+if [check_vect_support_and_set_flags] {
+  lappend DEFAULT_VECTCFLAGS "-O2"
+  lappend DEFAULT_VECTCFLAGS "-Wno-psabi"
+  et-dg-runtest dg-runtest [lsort \
+    [glob -nocomplain $srcdir/experimental/simd/*.cc]] \
+    "$DEFAULT_VECTCFLAGS" "$DEFAULT_CXXFLAGS $PCH_CXXFLAGS"
+}
+
 # All done.
 dg-finish

Thread overview: 13+ messages
2019-10-14 12:12 Matthias Kretz
2019-10-15  3:52 ` Thomas Rodgers
2019-10-24  8:26 ` Dr. Matthias Kretz
2020-02-10 16:49   ` Thomas Rodgers
2020-02-10 20:14     ` Thomas Rodgers
2020-01-07 11:01 ` Matthias Kretz
2020-01-07 11:17   ` Andrew Pinski
2020-01-07 13:19     ` Dr. Matthias Kretz
     [not found] ` <3486545.znU0eCzeS4@excalibur>
     [not found]   ` <xkqeo8qyl8y8.fsf@trodgers.remote>
2020-05-08 19:03     ` Matthias Kretz [this message]
2020-11-11 23:43       ` Jonathan Wakely
2020-11-14  1:11         ` Matthias Kretz
2020-11-15 19:11           ` Matthias Kretz
2020-12-10 21:13             ` Matthias Kretz
