* Re: [PATCH] std::experimental::simd
From: Matthias Kretz @ 2020-05-08 19:03 UTC
To: Thomas Rodgers, libstdc++, Gcc-patches
[-- Attachment #1: Type: text/plain, Size: 797 bytes --]
Here's my latest update to the std::experimental::simd patch. It's currently
based on the gcc-10 branch.
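
As a rough illustration of one rule documented in the patch (memory_alignment<T, U>::value is sizeof(U) * T::size() rounded up to a power of two), here is a minimal standalone sketch. The names next_power_of_2 and alignment_for are hypothetical and are not taken from the patch; the loop-based rounding is only one way to express the rule, and the sketch assumes a 4-byte float.

```cpp
#include <cstddef>

// Hypothetical helper (not a name from the patch): smallest power of two
// that is greater than or equal to x, for x >= 1.
constexpr std::size_t next_power_of_2(std::size_t x)
{
  std::size_t p = 1;
  while (p < x)
    p <<= 1;
  return p;
}

// Hypothetical helper: applies the documented rule, i.e. the byte size of
// `size` elements of type U, rounded up to a power of two.
template <typename U>
constexpr std::size_t alignment_for(std::size_t size)
{
  return next_power_of_2(sizeof(U) * size);
}

// Compile-time checks of the rounding behaviour.
static_assert(next_power_of_2(1) == 1);
static_assert(next_power_of_2(12) == 16);
static_assert(next_power_of_2(16) == 16);
static_assert(alignment_for<unsigned char>(5) == 8); // 5 bytes round up to 8
static_assert(alignment_for<float>(3) == 16);        // assumes sizeof(float) == 4
```

With these assumptions, a three-element float vector would be documented as needing 16-byte alignment for vector-aligned loads and stores, the same as a four-element one.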
Cheers,
Matthias
--
──────────────────────────────────────────────────────────────────────────
Dr. Matthias Kretz https://mattkretz.github.io
GSI Helmholtz Centre for Heavy Ion Research https://gsi.de
std::experimental::simd https://github.com/VcDevel/std-simd
──────────────────────────────────────────────────────────────────────────
[-- Attachment #2: simd.patch --]
[-- Type: text/x-patch, Size: 1583956 bytes --]
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index 0f03126db1c..c7ac33faaf5 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -2869,6 +2869,17 @@ since C++14 and the implementation is complete.
<entry>Library Fundamentals 2 TS</entry>
</row>
+ <row>
+ <entry>
+ <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0214r9.pdf">
+ P0214R9
+ </link>
+ </entry>
+ <entry>Data-Parallel Types</entry>
+ <entry>Y</entry>
+ <entry>Parallelism 2 TS</entry>
+ </row>
+
</tbody>
</tgroup>
</table>
@@ -3014,6 +3025,185 @@ since C++14 and the implementation is complete.
If <code>!is_regular_file(p)</code>, an error is reported.
</para>
+ <section xml:id="iso.2017.par2ts" xreflabel="Implementation Specific Behavior of the Parallelism 2 TS"><info><title>Parallelism 2 TS</title></info>
+
+ <para>
+ <emphasis>9.3 [parallel.simd.abi]</emphasis>
+ <code>max_fixed_size&lt;T&gt;</code> is 32, except when targeting
+ AVX512BW and <code>sizeof(T)</code> is 1.
+ </para>
+
+ <para>
+ When targeting 32-bit x86,
+ <classname>simd_abi::compatible&lt;T&gt;</classname> is an alias for
+ <classname>simd_abi::scalar</classname>. When targeting 64-bit x86
+ (including x32), <classname>simd_abi::compatible&lt;T&gt;</classname> is
+ an alias for <classname>simd_abi::_VecBuiltin&lt;16&gt;</classname>,
+ unless <code>T</code> is <code>long double</code>, in which case it is
+ an alias for <classname>simd_abi::scalar</classname>.
+ </para>
+
+ <para>
+ When targeting x86 (both 32-bit and 64-bit),
+ <classname>simd_abi::native&lt;T&gt;</classname> is an alias for one of
+ <classname>simd_abi::_VecBuiltin&lt;16&gt;</classname>,
+ <classname>simd_abi::_VecBuiltin&lt;32&gt;</classname>, or
+ <classname>simd_abi::_VecBltnBtmsk&lt;64&gt;</classname>, depending on
+ the machine options the compiler was invoked with.
+ </para>
+
+ <para>
+ For any other targeted machine
+ <classname>simd_abi::compatible&lt;T&gt;</classname> and
+ <classname>simd_abi::native&lt;T&gt;</classname> are aliases for
+ <classname>simd_abi::scalar</classname>. (subject to change)
+ </para>
+
+ <para>
+ The extended ABI tag types defined in the
+ <code>std::experimental::parallelism_v2::simd_abi</code> namespace are:
+ <classname>simd_abi::_VecBuiltin&lt;Bytes&gt;</classname>, and
+ <classname>simd_abi::_VecBltnBtmsk&lt;Bytes&gt;</classname>.
+ </para>
+
+ <para>
+ <classname>simd_abi::deduce&lt;T, N, Abis...&gt;::type</classname>,
+ with <code>N &gt; 1</code> is an alias for an extended ABI tag, if a
+ supported extended ABI tag exists. Otherwise it is an alias for
+ <classname>simd_abi::fixed_size&lt;N&gt;</classname>. The <classname>
+ simd_abi::_VecBltnBtmsk</classname> ABI tag is preferred over
+ <classname>simd_abi::_VecBuiltin</classname>.
+ </para>
+
+ <para>
+ <emphasis>9.4 [parallel.simd.traits]</emphasis>
+ <classname>memory_alignment&lt;T, U&gt;::value</classname> is
+ <code>sizeof(U) * T::size()</code> rounded up to the next power-of-two
+ value.
+ </para>
+
+ <para>
+ <emphasis>9.6.1 [parallel.simd.overview]</emphasis>
+ On ARM, <classname>simd&lt;T, _VecBuiltin&lt;Bytes&gt;&gt;</classname>
+ is supported if <code>__ARM_NEON</code> is defined and
+ <code>sizeof(T) &lt;= 4</code>. Additionally,
+ <code>sizeof(T) == 8</code> with integral <code>T</code> is supported if
+ <code>__ARM_ARCH &gt;= 8</code>, and <code>double</code> is supported if
+ <code>__aarch64__</code> is defined.
+ On x86, given an extended ABI tag <code>Abi</code>,
+ <classname>simd&lt;T, Abi&gt;</classname> is supported according to the
+ following table:
+ <table frame="all" xml:id="table.par2ts_simd_support">
+ <title>Support for Extended ABI Tags</title>
+
+ <tgroup cols="4" align="left" colsep="0" rowsep="1">
+ <colspec colname="c1"/>
+ <colspec colname="c2"/>
+ <colspec colname="c3"/>
+ <colspec colname="c4"/>
+ <thead>
+ <row>
+ <entry>ABI tag <code>Abi</code></entry>
+ <entry>value type <code>T</code></entry>
+ <entry>values for <code>Bytes</code></entry>
+ <entry>required machine option</entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry morerows="5">
+ <classname>_VecBuiltin&lt;Bytes&gt;</classname>
+ </entry>
+ <entry morerows="1"><code>float</code></entry>
+ <entry>8, 12, 16</entry>
+ <entry>"-msse"</entry>
+ </row>
+
+ <row>
+ <entry>20, 24, 28, 32</entry>
+ <entry>"-mavx"</entry>
+ </row>
+
+ <row>
+ <entry morerows="1"><code>double</code></entry>
+ <entry>16</entry>
+ <entry>"-msse2"</entry>
+ </row>
+
+ <row>
+ <entry>24, 32</entry>
+ <entry>"-mavx"</entry>
+ </row>
+
+ <row>
+ <entry morerows="1">
+ integral types other than <code>bool</code>
+ </entry>
+ <entry>
+ <code>Bytes</code> ≤ 16 and <code>Bytes</code> divisible by
+ <code>sizeof(T)</code>
+ </entry>
+ <entry>"-msse2"</entry>
+ </row>
+
+ <row>
+ <entry>
+ 16 &lt; <code>Bytes</code> ≤ 32 and <code>Bytes</code>
+ divisible by <code>sizeof(T)</code>
+ </entry>
+ <entry>"-mavx2"</entry>
+ </row>
+
+ <row>
+ <entry morerows="1">
+ <classname>_VecBuiltin&lt;Bytes&gt;</classname> and
+ <classname>_VecBltnBtmsk&lt;Bytes&gt;</classname>
+ </entry>
+ <entry>
+ vectorizable types with <code>sizeof(T)</code> ≥ 4
+ </entry>
+ <entry morerows="1">
+ 32 &lt; <code>Bytes</code> ≤ 64 and <code>Bytes</code>
+ divisible by <code>sizeof(T)</code>
+ </entry>
+ <entry>"-mavx512f"</entry>
+ </row>
+
+ <row>
+ <entry>
+ vectorizable types with <code>sizeof(T)</code> &lt; 4
+ </entry>
+ <entry>"-mavx512bw"</entry>
+ </row>
+
+ <row>
+ <entry morerows="1">
+ <classname>_VecBltnBtmsk&lt;Bytes&gt;</classname>
+ </entry>
+ <entry>
+ vectorizable types with <code>sizeof(T)</code> ≥ 4
+ </entry>
+ <entry morerows="1">
+ <code>Bytes</code> ≤ 32 and <code>Bytes</code> divisible by
+ <code>sizeof(T)</code>
+ </entry>
+ <entry>"-mavx512vl"</entry>
+ </row>
+
+ <row>
+ <entry>
+ vectorizable types with <code>sizeof(T)</code> &lt; 4
+ </entry>
+ <entry>"-mavx512bw" and "-mavx512vl"</entry>
+ </row>
+
+ </tbody>
+ </tgroup>
+ </table>
+ </para>
+
+ </section>
</section>
diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 80aeb3f8959..d1c870f620c 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -734,6 +734,7 @@ experimental_headers = \
${experimental_srcdir}/ratio \
${experimental_srcdir}/regex \
${experimental_srcdir}/set \
+ ${experimental_srcdir}/simd \
${experimental_srcdir}/socket \
${experimental_srcdir}/source_location \
${experimental_srcdir}/string \
@@ -754,6 +755,16 @@ experimental_bits_headers = \
${experimental_bits_srcdir}/lfts_config.h \
${experimental_bits_srcdir}/net.h \
${experimental_bits_srcdir}/shared_ptr.h \
+ ${experimental_bits_srcdir}/simd.h \
+ ${experimental_bits_srcdir}/simd_builtin.h \
+ ${experimental_bits_srcdir}/simd_converter.h \
+ ${experimental_bits_srcdir}/simd_detail.h \
+ ${experimental_bits_srcdir}/simd_fixed_size.h \
+ ${experimental_bits_srcdir}/simd_math.h \
+ ${experimental_bits_srcdir}/simd_neon.h \
+ ${experimental_bits_srcdir}/simd_scalar.h \
+ ${experimental_bits_srcdir}/simd_x86.h \
+ ${experimental_bits_srcdir}/simd_x86_conversions.h \
${experimental_bits_srcdir}/string_view.tcc \
${experimental_bits_filesystem_headers}
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index eb437ad8d8d..686331fd15c 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -1079,6 +1079,7 @@ experimental_headers = \
${experimental_srcdir}/ratio \
${experimental_srcdir}/regex \
${experimental_srcdir}/set \
+ ${experimental_srcdir}/simd \
${experimental_srcdir}/socket \
${experimental_srcdir}/source_location \
${experimental_srcdir}/string \
@@ -1099,6 +1100,16 @@ experimental_bits_headers = \
${experimental_bits_srcdir}/lfts_config.h \
${experimental_bits_srcdir}/net.h \
${experimental_bits_srcdir}/shared_ptr.h \
+ ${experimental_bits_srcdir}/simd.h \
+ ${experimental_bits_srcdir}/simd_builtin.h \
+ ${experimental_bits_srcdir}/simd_converter.h \
+ ${experimental_bits_srcdir}/simd_detail.h \
+ ${experimental_bits_srcdir}/simd_fixed_size.h \
+ ${experimental_bits_srcdir}/simd_math.h \
+ ${experimental_bits_srcdir}/simd_neon.h \
+ ${experimental_bits_srcdir}/simd_scalar.h \
+ ${experimental_bits_srcdir}/simd_x86.h \
+ ${experimental_bits_srcdir}/simd_x86_conversions.h \
${experimental_bits_srcdir}/string_view.tcc \
${experimental_bits_filesystem_headers}
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
new file mode 100644
index 00000000000..298ff5957a1
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -0,0 +1,5031 @@
+// Definition of the public simd interfaces -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_H
+#define _GLIBCXX_EXPERIMENTAL_SIMD_H
+
+#if __cplusplus >= 201703L
+
+#include "simd_detail.h"
+#include <bitset>
+#include <climits>
+#include <cstring>
+#include <functional>
+#include <iosfwd>
+#include <limits>
+#include <utility>
+
+#if _GLIBCXX_SIMD_X86INTRIN
+#include <x86intrin.h>
+#elif _GLIBCXX_SIMD_HAVE_NEON
+#include <arm_neon.h>
+#endif
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+
+#if !_GLIBCXX_SIMD_X86INTRIN
+using __m128 [[__gnu__::__vector_size__(16)]] = float;
+using __m128d [[__gnu__::__vector_size__(16)]] = double;
+using __m128i [[__gnu__::__vector_size__(16)]] = long long;
+using __m256 [[__gnu__::__vector_size__(32)]] = float;
+using __m256d [[__gnu__::__vector_size__(32)]] = double;
+using __m256i [[__gnu__::__vector_size__(32)]] = long long;
+using __m512 [[__gnu__::__vector_size__(64)]] = float;
+using __m512d [[__gnu__::__vector_size__(64)]] = double;
+using __m512i [[__gnu__::__vector_size__(64)]] = long long;
+#endif
+
+// __next_power_of_2{{{
+/**
+ * \internal
+ * Returns the smallest power of 2 greater than or equal to \p __x.
+ */
+constexpr std::size_t
+__next_power_of_2(std::size_t __x)
+{
+ return (__x & (__x - 1)) == 0 ? __x
+ : __next_power_of_2((__x | (__x >> 1)) + 1);
+}
+
+// }}}
+namespace simd_abi {
+// {{{
+// implementation details:
+struct _Scalar;
+template <int _Np> struct _Fixed;
+
+// There are two major ABIs that appear on different architectures.
+// Both have non-boolean values packed into an N Byte register
+// -> #elements = N / sizeof(T)
+// Masks differ:
+// 1. Use value vector registers for masks (all 0 or all 1)
+// 2. Use bitmasks (mask registers) with one bit per value in the corresponding
+// value vector
+//
+// Both can be partially used, masking off the rest when doing horizontal
+// operations or operations that can trap (e.g. FP_INVALID or integer division
+// by 0). This is encoded as the number of used bytes.
+template <int _UsedBytes> struct _VecBuiltin;
+template <int _UsedBytes> struct _VecBltnBtmsk;
+
+template <typename _Tp, int _Np> using _VecN = _VecBuiltin<sizeof(_Tp) * _Np>;
+
+template <int _UsedBytes = 16> using _Sse = _VecBuiltin<_UsedBytes>;
+template <int _UsedBytes = 32> using _Avx = _VecBuiltin<_UsedBytes>;
+template <int _UsedBytes = 64> using _Avx512 = _VecBltnBtmsk<_UsedBytes>;
+template <int _UsedBytes = 16> using _Neon = _VecBuiltin<_UsedBytes>;
+
+// implementation-defined:
+using __sse = _Sse<>;
+using __avx = _Avx<>;
+using __avx512 = _Avx512<>;
+using __neon = _Neon<>;
+
+using __neon128 = _Neon<16>;
+using __neon64 = _Neon<8>;
+
+// standard:
+template <typename _Tp, size_t _Np, typename...> struct deduce;
+template <int _Np> using fixed_size = _Fixed<_Np>;
+using scalar = _Scalar;
+// }}}
+} // namespace simd_abi
+// forward declarations is_simd(_mask), simd(_mask), simd_size {{{
+template <typename _Tp> struct is_simd;
+template <typename _Tp> struct is_simd_mask;
+template <typename _Tp, typename _Abi> class simd;
+template <typename _Tp, typename _Abi> class simd_mask;
+template <typename _Tp, typename _Abi> struct simd_size;
+// }}}
+// load/store flags {{{
+struct element_aligned_tag
+{
+};
+struct vector_aligned_tag
+{
+};
+template <size_t _Np> struct overaligned_tag
+{
+ static constexpr size_t _S_alignment = _Np;
+};
+inline constexpr element_aligned_tag element_aligned = {};
+inline constexpr vector_aligned_tag vector_aligned = {};
+template <size_t _Np> inline constexpr overaligned_tag<_Np> overaligned = {};
+// }}}
+
+// vvv ---- type traits ---- vvv
+// integer type aliases{{{
+using _UChar = unsigned char;
+using _SChar = signed char;
+using _UShort = unsigned short;
+using _UInt = unsigned int;
+using _ULong = unsigned long;
+using _ULLong = unsigned long long;
+using _LLong = long long;
+//}}}
+// __identity/__id{{{
+template <typename _Tp> struct __identity
+{
+ using type = _Tp;
+};
+template <typename _Tp> using __id = typename __identity<_Tp>::type;
+
+// }}}
+// __first_of_pack{{{
+template <typename _T0, typename...> struct __first_of_pack
+{
+ using type = _T0;
+};
+template <typename... _Ts>
+using __first_of_pack_t = typename __first_of_pack<_Ts...>::type;
+
+//}}}
+// __value_type_or_identity_t {{{
+template <typename _Tp>
+typename _Tp::value_type
+__value_type_or_identity_impl(int);
+template <typename _Tp>
+_Tp
+__value_type_or_identity_impl(float);
+template <typename _Tp>
+using __value_type_or_identity_t
+ = decltype(__value_type_or_identity_impl<_Tp>(int()));
+
+// }}}
+// __is_vectorizable {{{
+template <typename _Tp>
+struct __is_vectorizable : public std::is_arithmetic<_Tp>
+{
+};
+template <> struct __is_vectorizable<bool> : public false_type
+{
+};
+template <typename _Tp>
+inline constexpr bool __is_vectorizable_v = __is_vectorizable<_Tp>::value;
+// Deduces to a vectorizable type
+template <typename _Tp, typename = enable_if_t<__is_vectorizable_v<_Tp>>>
+using _Vectorizable = _Tp;
+
+// }}}
+// _LoadStorePtr / __is_possible_loadstore_conversion {{{
+template <typename _Ptr, typename _ValueType>
+struct __is_possible_loadstore_conversion
+ : conjunction<__is_vectorizable<_Ptr>, __is_vectorizable<_ValueType>>
+{
+};
+template <> struct __is_possible_loadstore_conversion<bool, bool> : true_type
+{
+};
+// Deduces to a type allowed for load/store with the given value type.
+template <typename _Ptr, typename _ValueType,
+ typename = enable_if_t<
+ __is_possible_loadstore_conversion<_Ptr, _ValueType>::value>>
+using _LoadStorePtr = _Ptr;
+
+// }}}
+// _SizeConstant{{{
+template <size_t _X> using _SizeConstant = integral_constant<size_t, _X>;
+// }}}
+// __is_bitmask{{{
+template <typename _Tp, typename = std::void_t<>>
+struct __is_bitmask : false_type
+{
+};
+template <typename _Tp>
+inline constexpr bool __is_bitmask_v = __is_bitmask<_Tp>::value;
+
+// the __mmaskXX case:
+template <typename _Tp>
+struct __is_bitmask<_Tp, std::void_t<decltype(std::declval<unsigned&>()
+ = std::declval<_Tp>() & 1u)>>
+ : true_type
+{
+};
+
+// }}}
+// __int_for_sizeof{{{
+template <size_t> struct __int_for_sizeof;
+template <> struct __int_for_sizeof<1>
+{
+ using type = signed char;
+ static_assert(sizeof(type) == 1);
+};
+template <> struct __int_for_sizeof<2>
+{
+ using type = signed short;
+ static_assert(sizeof(type) == 2);
+};
+template <> struct __int_for_sizeof<4>
+{
+ using type = signed int;
+ static_assert(sizeof(type) == 4);
+};
+template <> struct __int_for_sizeof<8>
+{
+ using type = signed long long;
+ static_assert(sizeof(type) == 8);
+};
+#ifdef __SIZEOF_INT128__
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wpedantic"
+template <> struct __int_for_sizeof<16>
+{
+ using type = __int128;
+ static_assert(sizeof(type) == 16);
+};
+#pragma GCC diagnostic pop
+#endif // __SIZEOF_INT128__
+template <typename _Tp>
+using __int_for_sizeof_t = typename __int_for_sizeof<sizeof(_Tp)>::type;
+template <size_t _Np>
+using __int_with_sizeof_t = typename __int_for_sizeof<_Np>::type;
+
+// }}}
+// __is_fixed_size_abi{{{
+template <typename _Tp> struct __is_fixed_size_abi : false_type
+{
+};
+template <int _Np>
+struct __is_fixed_size_abi<simd_abi::fixed_size<_Np>> : true_type
+{
+};
+
+template <typename _Tp>
+inline constexpr bool __is_fixed_size_abi_v = __is_fixed_size_abi<_Tp>::value;
+
+// }}}
+// constexpr feature detection{{{
+constexpr inline bool __have_mmx = _GLIBCXX_SIMD_HAVE_MMX;
+constexpr inline bool __have_sse = _GLIBCXX_SIMD_HAVE_SSE;
+constexpr inline bool __have_sse2 = _GLIBCXX_SIMD_HAVE_SSE2;
+constexpr inline bool __have_sse3 = _GLIBCXX_SIMD_HAVE_SSE3;
+constexpr inline bool __have_ssse3 = _GLIBCXX_SIMD_HAVE_SSSE3;
+constexpr inline bool __have_sse4_1 = _GLIBCXX_SIMD_HAVE_SSE4_1;
+constexpr inline bool __have_sse4_2 = _GLIBCXX_SIMD_HAVE_SSE4_2;
+constexpr inline bool __have_xop = _GLIBCXX_SIMD_HAVE_XOP;
+constexpr inline bool __have_avx = _GLIBCXX_SIMD_HAVE_AVX;
+constexpr inline bool __have_avx2 = _GLIBCXX_SIMD_HAVE_AVX2;
+constexpr inline bool __have_bmi = _GLIBCXX_SIMD_HAVE_BMI1;
+constexpr inline bool __have_bmi2 = _GLIBCXX_SIMD_HAVE_BMI2;
+constexpr inline bool __have_lzcnt = _GLIBCXX_SIMD_HAVE_LZCNT;
+constexpr inline bool __have_sse4a = _GLIBCXX_SIMD_HAVE_SSE4A;
+constexpr inline bool __have_fma = _GLIBCXX_SIMD_HAVE_FMA;
+constexpr inline bool __have_fma4 = _GLIBCXX_SIMD_HAVE_FMA4;
+constexpr inline bool __have_f16c = _GLIBCXX_SIMD_HAVE_F16C;
+constexpr inline bool __have_popcnt = _GLIBCXX_SIMD_HAVE_POPCNT;
+constexpr inline bool __have_avx512f = _GLIBCXX_SIMD_HAVE_AVX512F;
+constexpr inline bool __have_avx512dq = _GLIBCXX_SIMD_HAVE_AVX512DQ;
+constexpr inline bool __have_avx512vl = _GLIBCXX_SIMD_HAVE_AVX512VL;
+constexpr inline bool __have_avx512bw = _GLIBCXX_SIMD_HAVE_AVX512BW;
+constexpr inline bool __have_avx512dq_vl = __have_avx512dq && __have_avx512vl;
+constexpr inline bool __have_avx512bw_vl = __have_avx512bw && __have_avx512vl;
+
+constexpr inline bool __have_neon = _GLIBCXX_SIMD_HAVE_NEON;
+constexpr inline bool __have_neon_a32 = _GLIBCXX_SIMD_HAVE_NEON_A32;
+constexpr inline bool __have_neon_a64 = _GLIBCXX_SIMD_HAVE_NEON_A64;
+
+#ifdef __POWER9_VECTOR__
+constexpr inline bool __have_power9vec = true;
+#else
+constexpr inline bool __have_power9vec = false;
+#endif
+#if defined __POWER8_VECTOR__
+constexpr inline bool __have_power8vec = true;
+#else
+constexpr inline bool __have_power8vec = __have_power9vec;
+#endif
+#if defined __VSX__
+constexpr inline bool __have_power_vsx = true;
+#else
+constexpr inline bool __have_power_vsx = __have_power8vec;
+#endif
+#if defined __ALTIVEC__
+constexpr inline bool __have_power_vmx = true;
+#else
+constexpr inline bool __have_power_vmx = __have_power_vsx;
+#endif
+
+// }}}
+// __is_scalar_abi {{{
+template <typename _Abi>
+constexpr bool
+__is_scalar_abi()
+{
+ return std::is_same_v<simd_abi::scalar, _Abi>;
+}
+
+// }}}
+// __abi_bytes_v {{{
+template <template <int> class _Abi, int _Bytes>
+constexpr int
+__abi_bytes_impl(_Abi<_Bytes>*)
+{
+ return _Bytes;
+}
+template <typename _Tp>
+constexpr int
+__abi_bytes_impl(_Tp*)
+{
+ return -1;
+}
+template <typename _Abi>
+inline constexpr int __abi_bytes_v
+ = __abi_bytes_impl(static_cast<_Abi*>(nullptr));
+
+// }}}
+// __is_builtin_bitmask_abi {{{
+template <typename _Abi>
+constexpr bool
+__is_builtin_bitmask_abi()
+{
+ return std::is_same_v<simd_abi::_VecBltnBtmsk<__abi_bytes_v<_Abi>>, _Abi>;
+}
+
+// }}}
+// __is_sse_abi {{{
+template <typename _Abi>
+constexpr bool
+__is_sse_abi()
+{
+ constexpr auto _Bytes = __abi_bytes_v<_Abi>;
+ return _Bytes <= 16 && std::is_same_v<simd_abi::_VecBuiltin<_Bytes>, _Abi>;
+}
+
+// }}}
+// __is_avx_abi {{{
+template <typename _Abi>
+constexpr bool
+__is_avx_abi()
+{
+ constexpr auto _Bytes = __abi_bytes_v<_Abi>;
+ return _Bytes > 16 && _Bytes <= 32
+ && std::is_same_v<simd_abi::_VecBuiltin<_Bytes>, _Abi>;
+}
+
+// }}}
+// __is_avx512_abi {{{
+template <typename _Abi>
+constexpr bool
+__is_avx512_abi()
+{
+ constexpr auto _Bytes = __abi_bytes_v<_Abi>;
+ return _Bytes <= 64 && std::is_same_v<simd_abi::_Avx512<_Bytes>, _Abi>;
+}
+
+// }}}
+// __is_neon_abi {{{
+template <typename _Abi>
+constexpr bool
+__is_neon_abi()
+{
+ constexpr auto _Bytes = __abi_bytes_v<_Abi>;
+ return _Bytes <= 16 && std::is_same_v<simd_abi::_VecBuiltin<_Bytes>, _Abi>;
+}
+
+// }}}
+// __make_dependent_t {{{
+template <typename, typename _Up> struct __make_dependent
+{
+ using type = _Up;
+};
+template <typename _Tp, typename _Up>
+using __make_dependent_t = typename __make_dependent<_Tp, _Up>::type;
+
+// }}}
+// ^^^ ---- type traits ---- ^^^
+
+// __assert_unreachable{{{
+template <typename _Tp> struct __assert_unreachable
+{
+ static_assert(!std::is_same_v<_Tp, _Tp>, "this should be unreachable");
+};
+
+// }}}
+// __size_or_zero_v {{{
+template <typename _Tp, typename _Ap, size_t _Np = simd_size<_Tp, _Ap>::value>
+constexpr size_t
+__size_or_zero_dispatch(int)
+{
+ return _Np;
+}
+template <typename _Tp, typename _Ap>
+constexpr size_t
+__size_or_zero_dispatch(float)
+{
+ return 0;
+}
+template <typename _Tp, typename _Ap>
+inline constexpr size_t __size_or_zero_v = __size_or_zero_dispatch<_Tp, _Ap>(0);
+
+// }}}
+// __bit_cast {{{
+template <typename _To, typename _From>
+_GLIBCXX_SIMD_INTRINSIC _To
+__bit_cast(const _From __x)
+{
+ static_assert(sizeof(_To) == sizeof(_From));
+ _To __r;
+ __builtin_memcpy(reinterpret_cast<char*>(&__r),
+ reinterpret_cast<const char*>(&__x), sizeof(_To));
+ return __r;
+}
+
+// }}}
+// __div_roundup {{{
+inline constexpr std::size_t
+__div_roundup(std::size_t __a, std::size_t __b)
+{
+ return (__a + __b - 1) / __b;
+}
+
+// }}}
+// _ExactBool{{{
+class _ExactBool
+{
+ const bool _M_data;
+
+public:
+ _GLIBCXX_SIMD_INTRINSIC constexpr _ExactBool(bool __b) : _M_data(__b) {}
+ _ExactBool(int) = delete;
+ _GLIBCXX_SIMD_INTRINSIC constexpr operator bool() const { return _M_data; }
+};
+
+// }}}
+// __execute_n_times{{{
+template <typename _Fp, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__execute_on_index_sequence(_Fp&& __f, std::index_sequence<_I...>)
+{
+ [[maybe_unused]] auto&& __x = {(__f(_SizeConstant<_I>()), 0)...};
+}
+
+template <typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__execute_on_index_sequence(_Fp&&, std::index_sequence<>)
+{}
+
+template <size_t _Np, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__execute_n_times(_Fp&& __f)
+{
+ __execute_on_index_sequence(static_cast<_Fp&&>(__f),
+ std::make_index_sequence<_Np>{});
+}
+
+// }}}
+// __generate_from_n_evaluations{{{
+template <typename _R, typename _Fp, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _R
+__execute_on_index_sequence_with_return(_Fp&& __f, std::index_sequence<_I...>)
+{
+ return _R{__f(_SizeConstant<_I>())...};
+}
+
+template <size_t _Np, typename _R, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr _R
+__generate_from_n_evaluations(_Fp&& __f)
+{
+ return __execute_on_index_sequence_with_return<_R>(
+ static_cast<_Fp&&>(__f), std::make_index_sequence<_Np>{});
+}
+
+// }}}
+// __call_with_n_evaluations{{{
+template <size_t... _I, typename _F0, typename _FArgs>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__call_with_n_evaluations(std::index_sequence<_I...>, _F0&& __f0,
+ _FArgs&& __fargs)
+{
+ return __f0(__fargs(_SizeConstant<_I>())...);
+}
+
+template <size_t _Np, typename _F0, typename _FArgs>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__call_with_n_evaluations(_F0&& __f0, _FArgs&& __fargs)
+{
+ return __call_with_n_evaluations(std::make_index_sequence<_Np>{},
+ static_cast<_F0&&>(__f0),
+ static_cast<_FArgs&&>(__fargs));
+}
+
+// }}}
+// __call_with_subscripts{{{
+template <size_t _First = 0, size_t... _It, typename _Tp, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__call_with_subscripts(_Tp&& __x, index_sequence<_It...>, _Fp&& __fun)
+{
+ return __fun(__x[_First + _It]...);
+}
+
+template <size_t _Np, size_t _First = 0, typename _Tp, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__call_with_subscripts(_Tp&& __x, _Fp&& __fun)
+{
+ return __call_with_subscripts<_First>(static_cast<_Tp&&>(__x),
+ std::make_index_sequence<_Np>(),
+ static_cast<_Fp&&>(__fun));
+}
+
+// }}}
+// __may_alias{{{
+/**\internal
+ * Helper __may_alias<_Tp> that turns _Tp into the type to be used for an
+ * aliasing pointer. This adds the __may_alias attribute to _Tp (with compilers
+ * that support it).
+ */
+template <typename _Tp> using __may_alias [[__gnu__::__may_alias__]] = _Tp;
+
+// }}}
+// _UnsupportedBase {{{
+// simd and simd_mask base for unsupported <_Tp, _Abi>
+struct _UnsupportedBase
+{
+ _UnsupportedBase() = delete;
+ _UnsupportedBase(const _UnsupportedBase&) = delete;
+ _UnsupportedBase& operator=(const _UnsupportedBase&) = delete;
+ ~_UnsupportedBase() = delete;
+};
+
+// }}}
+// _InvalidTraits {{{
+/**
+ * \internal
+ * Defines the implementation of a given <_Tp, _Abi>.
+ *
+ * Implementations must ensure that only valid <_Tp, _Abi> instantiations are
+ * possible. Static assertions in the type definition do not suffice. It is
+ * important that SFINAE works.
+ */
+struct _InvalidTraits
+{
+ using _IsValid = false_type;
+ using _SimdBase = _UnsupportedBase;
+ using _MaskBase = _UnsupportedBase;
+
+ static constexpr size_t _S_simd_align = 1;
+ struct _SimdImpl;
+ struct _SimdMember
+ {
+ };
+ struct _SimdCastType;
+
+ static constexpr size_t _S_mask_align = 1;
+ struct _MaskImpl;
+ struct _MaskMember
+ {
+ };
+ struct _MaskCastType;
+};
+// }}}
+// _SimdTraits {{{
+template <typename _Tp, typename _Abi, typename = std::void_t<>>
+struct _SimdTraits : _InvalidTraits
+{
+};
+
+// }}}
+// __private_init, __bitset_init{{{
+/**
+ * \internal
+ * Tag used for private init constructor of simd and simd_mask
+ */
+inline constexpr struct _PrivateInit
+{
+} __private_init = {};
+inline constexpr struct _BitsetInit
+{
+} __bitset_init = {};
+
+// }}}
+// __is_narrowing_conversion<_From, _To>{{{
+template <typename _From, typename _To, bool = std::is_arithmetic<_From>::value,
+ bool = std::is_arithmetic<_To>::value>
+struct __is_narrowing_conversion;
+
+// ignore "warning C4018: '<': signed/unsigned mismatch" in the following trait.
+// The implicit conversions will do the right thing here.
+template <typename _From, typename _To>
+struct __is_narrowing_conversion<_From, _To, true, true>
+ : public __bool_constant<(
+ std::numeric_limits<_From>::digits > std::numeric_limits<_To>::digits
+ || std::numeric_limits<_From>::max() > std::numeric_limits<_To>::max()
+ || std::numeric_limits<_From>::lowest()
+ < std::numeric_limits<_To>::lowest()
+ || (std::is_signed<_From>::value && std::is_unsigned<_To>::value))>
+{
+};
+
+template <typename _Tp>
+struct __is_narrowing_conversion<bool, _Tp, true, true> : public true_type
+{
+};
+template <>
+struct __is_narrowing_conversion<bool, bool, true, true> : public false_type
+{
+};
+template <typename _Tp>
+struct __is_narrowing_conversion<_Tp, _Tp, true, true> : public false_type
+{
+};
+
+template <typename _From, typename _To>
+struct __is_narrowing_conversion<_From, _To, false, true>
+ : public negation<std::is_convertible<_From, _To>>
+{
+};
+
+// }}}
+// __converts_to_higher_integer_rank{{{
+template <typename _From, typename _To, bool = (sizeof(_From) < sizeof(_To))>
+struct __converts_to_higher_integer_rank : public true_type
+{
+};
+// this may fail for char -> short if sizeof(char) == sizeof(short)
+template <typename _From, typename _To>
+struct __converts_to_higher_integer_rank<_From, _To, false>
+ : public std::is_same<decltype(std::declval<_From>() + std::declval<_To>()),
+ _To>
+{
+};
+
+// }}}
+// __is_aligned(_v){{{
+template <typename _Flag, size_t _Alignment> struct __is_aligned;
+template <size_t _Alignment>
+struct __is_aligned<vector_aligned_tag, _Alignment> : public true_type
+{
+};
+template <size_t _Alignment>
+struct __is_aligned<element_aligned_tag, _Alignment> : public false_type
+{
+};
+template <size_t _GivenAlignment, size_t _Alignment>
+struct __is_aligned<overaligned_tag<_GivenAlignment>, _Alignment>
+ : public std::integral_constant<bool, (_GivenAlignment % _Alignment == 0)>
+{
+};
+template <typename _Flag, size_t _Alignment>
+inline constexpr bool __is_aligned_v = __is_aligned<_Flag, _Alignment>::value;
+
+// }}}
+// __data(simd/simd_mask) {{{
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr const auto&
+__data(const simd<_Tp, _Ap>& __x);
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto&
+__data(simd<_Tp, _Ap>& __x);
+
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr const auto&
+__data(const simd_mask<_Tp, _Ap>& __x);
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto&
+__data(simd_mask<_Tp, _Ap>& __x);
+
+// }}}
+// _SimdConverter {{{
+template <typename _FromT, typename _FromA, typename _ToT, typename _ToA,
+ typename = void>
+struct _SimdConverter;
+
+template <typename _Tp, typename _Ap>
+struct _SimdConverter<_Tp, _Ap, _Tp, _Ap, void>
+{
+ template <typename _Up>
+ _GLIBCXX_SIMD_INTRINSIC const _Up& operator()(const _Up& __x)
+ {
+ return __x;
+ }
+};
+
+// }}}
+// __to_value_type_or_member_type {{{
+template <typename _V>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__to_value_type_or_member_type(const _V& __x) -> decltype(__data(__x))
+{
+ return __data(__x);
+}
+
+template <typename _V>
+_GLIBCXX_SIMD_INTRINSIC constexpr const typename _V::value_type&
+__to_value_type_or_member_type(const typename _V::value_type& __x)
+{
+ return __x;
+}
+
+// }}}
+// __bool_storage_member_type{{{
+template <size_t _Size> struct __bool_storage_member_type;
+
+template <size_t _Size>
+using __bool_storage_member_type_t =
+ typename __bool_storage_member_type<_Size>::type;
+
+// }}}
+// _SimdTuple {{{
+// why not std::tuple?
+// 1. std::tuple gives no guarantee about the storage order, but I
+//    require storage equivalent to
+//    std::array<_Tp, _Np>
+// 2. direct access to the element type (first template argument)
+// 3. enforces equal element type, only different _Abi types are allowed
+template <typename _Tp, typename... _Abis> struct _SimdTuple;
+
+//}}}
+// __fixed_size_storage_t {{{
+template <typename _Tp, int _Np> struct __fixed_size_storage;
+
+template <typename _Tp, int _Np>
+using __fixed_size_storage_t = typename __fixed_size_storage<_Tp, _Np>::type;
+
+// }}}
+// _SimdWrapper fwd decl{{{
+template <typename _Tp, size_t _Size, typename = std::void_t<>>
+struct _SimdWrapper;
+
+template <typename _Tp>
+using _SimdWrapper8 = _SimdWrapper<_Tp, 8 / sizeof(_Tp)>;
+template <typename _Tp>
+using _SimdWrapper16 = _SimdWrapper<_Tp, 16 / sizeof(_Tp)>;
+template <typename _Tp>
+using _SimdWrapper32 = _SimdWrapper<_Tp, 32 / sizeof(_Tp)>;
+template <typename _Tp>
+using _SimdWrapper64 = _SimdWrapper<_Tp, 64 / sizeof(_Tp)>;
+
+// }}}
+// __is_simd_wrapper {{{
+template <typename _Tp> struct __is_simd_wrapper : false_type
+{
+};
+template <typename _Tp, size_t _Np>
+struct __is_simd_wrapper<_SimdWrapper<_Tp, _Np>> : true_type
+{
+};
+template <typename _Tp>
+inline constexpr bool __is_simd_wrapper_v = __is_simd_wrapper<_Tp>::value;
+
+// }}}
+// _BitOps {{{
+struct _BitOps
+{
+ // __popcount {{{
+ static constexpr _UInt __popcount(_UInt __x)
+ {
+ return __builtin_popcount(__x);
+ }
+ static constexpr _ULong __popcount(_ULong __x)
+ {
+ return __builtin_popcountl(__x);
+ }
+ static constexpr _ULLong __popcount(_ULLong __x)
+ {
+ return __builtin_popcountll(__x);
+ }
+
+ // }}}
+ // __ctz/__clz {{{
+ static constexpr _UInt __ctz(_UInt __x) { return __builtin_ctz(__x); }
+ static constexpr _ULong __ctz(_ULong __x) { return __builtin_ctzl(__x); }
+ static constexpr _ULLong __ctz(_ULLong __x) { return __builtin_ctzll(__x); }
+ static constexpr _UInt __clz(_UInt __x) { return __builtin_clz(__x); }
+ static constexpr _ULong __clz(_ULong __x) { return __builtin_clzl(__x); }
+ static constexpr _ULLong __clz(_ULLong __x) { return __builtin_clzll(__x); }
+
+ // }}}
+ // __bit_iteration {{{
+ template <typename _Tp, typename _Fp>
+ static void __bit_iteration(_Tp __mask, _Fp&& __f)
+ {
+ static_assert(sizeof(_ULLong) >= sizeof(_Tp));
+ std::conditional_t<sizeof(_Tp) <= sizeof(_UInt), _UInt, _ULLong> __k;
+ if constexpr (std::is_convertible_v<_Tp, decltype(__k)>)
+ __k = __mask;
+ else
+ __k = __mask.to_ullong();
+ switch (__popcount(__k))
+ {
+ default:
+ do
+ {
+ __f(__ctz(__k));
+ __k &= (__k - 1); // clear the lowest set bit
+ }
+ while (__k);
+ break;
+ /*case 3:
+ __f(__ctz(__k));
+ __k &= (__k - 1);
+ [[fallthrough]];*/
+ case 2:
+ __f(__ctz(__k));
+ [[fallthrough]];
+ case 1:
+ __f(__popcount(~decltype(__k)()) - 1 - __clz(__k));
+ [[fallthrough]];
+ case 0:
+ break;
+ }
+ }
+
+ //}}}
+ // __firstbit{{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST static auto __firstbit(_Tp __bits)
+ {
+ static_assert(std::is_integral_v<_Tp>,
+ "__firstbit requires an integral argument");
+ if constexpr (sizeof(_Tp) <= sizeof(int))
+ return __builtin_ctz(__bits);
+ else if constexpr (alignof(_ULLong) == 8)
+ return __builtin_ctzll(__bits);
+ else
+ {
+ _UInt __lo = __bits;
+ return __lo == 0 ? 32 + __builtin_ctz(__bits >> 32)
+ : __builtin_ctz(__lo);
+ }
+ }
+
+ // }}}
+ // __lastbit{{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST static auto __lastbit(_Tp __bits)
+ {
+ static_assert(std::is_integral_v<_Tp>,
+ "__lastbit requires an integral argument");
+ if constexpr (sizeof(_Tp) <= sizeof(int))
+ return 31 - __builtin_clz(__bits);
+ else if constexpr (alignof(_ULLong) == 8)
+ return 63 - __builtin_clzll(__bits);
+ else
+ {
+ _UInt __lo = __bits;
+ _UInt __hi = __bits >> 32u;
+ return __hi == 0 ? 31 - __builtin_clz(__lo) : 63 - __builtin_clz(__hi);
+ }
+ }
+
+ // }}}
+};
+
+//}}}
+// __increment, __decrement {{{
+template <typename _Tp = void> struct __increment
+{
+ constexpr _Tp operator()(_Tp __a) const { return ++__a; }
+};
+template <> struct __increment<void>
+{
+ template <typename _Tp> constexpr _Tp operator()(_Tp __a) const
+ {
+ return ++__a;
+ }
+};
+template <typename _Tp = void> struct __decrement
+{
+ constexpr _Tp operator()(_Tp __a) const { return --__a; }
+};
+template <> struct __decrement<void>
+{
+ template <typename _Tp> constexpr _Tp operator()(_Tp __a) const
+ {
+ return --__a;
+ }
+};
+
+// }}}
+// _ValuePreserving(OrInt) {{{
+template <typename _From, typename _To,
+ typename = enable_if_t<negation<
+ __is_narrowing_conversion<__remove_cvref_t<_From>, _To>>::value>>
+using _ValuePreserving = _From;
+
+template <typename _From, typename _To,
+ typename _DecayedFrom = __remove_cvref_t<_From>,
+ typename = enable_if_t<conjunction<
+ is_convertible<_From, _To>,
+ disjunction<
+ is_same<_DecayedFrom, _To>, is_same<_DecayedFrom, int>,
+ conjunction<is_same<_DecayedFrom, _UInt>, is_unsigned<_To>>,
+ negation<__is_narrowing_conversion<_DecayedFrom, _To>>>>::value>>
+using _ValuePreservingOrInt = _From;
+
+// }}}
+// __intrinsic_type {{{
+template <typename _Tp, size_t _Bytes, typename = std::void_t<>>
+struct __intrinsic_type;
+template <typename _Tp, size_t _Size>
+using __intrinsic_type_t =
+ typename __intrinsic_type<_Tp, _Size * sizeof(_Tp)>::type;
+template <typename _Tp>
+using __intrinsic_type2_t = typename __intrinsic_type<_Tp, 2>::type;
+template <typename _Tp>
+using __intrinsic_type4_t = typename __intrinsic_type<_Tp, 4>::type;
+template <typename _Tp>
+using __intrinsic_type8_t = typename __intrinsic_type<_Tp, 8>::type;
+template <typename _Tp>
+using __intrinsic_type16_t = typename __intrinsic_type<_Tp, 16>::type;
+template <typename _Tp>
+using __intrinsic_type32_t = typename __intrinsic_type<_Tp, 32>::type;
+template <typename _Tp>
+using __intrinsic_type64_t = typename __intrinsic_type<_Tp, 64>::type;
+template <typename _Tp>
+using __intrinsic_type128_t = typename __intrinsic_type<_Tp, 128>::type;
+
+// }}}
+// _BitMask {{{
+template <size_t _Np, bool _Sanitized = false> struct _BitMask;
+
+template <size_t _Np, bool _Sanitized>
+struct __is_bitmask<_BitMask<_Np, _Sanitized>, void> : true_type
+{
+};
+
+template <size_t _Np> using _SanitizedBitMask = _BitMask<_Np, true>;
+
+template <size_t _Np, bool _Sanitized> struct _BitMask
+{
+ static_assert(_Np > 0);
+ static constexpr size_t _NBytes = __div_roundup(_Np, CHAR_BIT);
+ using _Tp = conditional_t<_Np == 1, bool,
+ make_unsigned_t<__int_with_sizeof_t<std::min(
+ sizeof(_ULLong), __next_power_of_2(_NBytes))>>>;
+ static constexpr int _S_array_size = __div_roundup(_NBytes, sizeof(_Tp));
+ _Tp _M_bits[_S_array_size];
+ static constexpr int _S_unused_bits
+ = _Np == 1 ? 0 : _S_array_size * sizeof(_Tp) * CHAR_BIT - _Np;
+ static constexpr _Tp _S_bitmask = +_Tp(~_Tp()) >> _S_unused_bits;
+
+ constexpr _BitMask() noexcept = default;
+ constexpr _BitMask(unsigned long long __x) noexcept
+ : _M_bits{static_cast<_Tp>(__x)}
+ {}
+ _BitMask(std::bitset<_Np> __x) noexcept : _BitMask(__x.to_ullong()) {}
+
+ constexpr _BitMask(const _BitMask&) noexcept = default;
+
+ template <bool _RhsSanitized, typename = enable_if_t<_RhsSanitized == false
+ && _Sanitized == true>>
+ constexpr _BitMask(const _BitMask<_Np, _RhsSanitized>& __rhs) noexcept
+ : _BitMask(__rhs._M_sanitized())
+ {}
+
+ constexpr operator _SimdWrapper<bool, _Np>() const noexcept
+ {
+ static_assert(_S_array_size == 1);
+ return _M_bits[0];
+ }
+
+ // precondition: is sanitized
+ constexpr _Tp _M_to_bits() const noexcept
+ {
+ static_assert(_S_array_size == 1);
+ return _M_bits[0];
+ }
+ // precondition: is sanitized
+ constexpr unsigned long long to_ullong() const noexcept
+ {
+ static_assert(_S_array_size == 1);
+ return _M_bits[0];
+ }
+ // precondition: is sanitized
+ constexpr unsigned long to_ulong() const noexcept
+ {
+ static_assert(_S_array_size == 1);
+ return _M_bits[0];
+ }
+ constexpr std::bitset<_Np> _M_to_bitset() const noexcept
+ {
+ static_assert(_S_array_size == 1);
+ return _M_bits[0];
+ }
+
+ constexpr decltype(auto) _M_sanitized() const noexcept
+ {
+ if constexpr (_Sanitized)
+ return *this;
+ else if constexpr (_Np == 1)
+ return _SanitizedBitMask<_Np>(_M_bits[0]);
+ else
+ {
+ _SanitizedBitMask<_Np> __r = {};
+ for (int __i = 0; __i < _S_array_size; ++__i)
+ __r._M_bits[__i] = _M_bits[__i];
+ if constexpr (_S_unused_bits > 0)
+ __r._M_bits[_S_array_size - 1] &= _S_bitmask;
+ return __r;
+ }
+ }
+
+ template <size_t _Mp, bool _LSanitized>
+ constexpr _BitMask<_Np + _Mp, _Sanitized>
+ _M_prepend(_BitMask<_Mp, _LSanitized> __lsb) const noexcept
+ {
+ constexpr size_t _RN = _Np + _Mp;
+ using _Rp = _BitMask<_RN, _Sanitized>;
+ if constexpr (_Rp::_S_array_size == 1)
+ {
+ _Rp __r{{_M_bits[0]}};
+ __r._M_bits[0] <<= _Mp;
+ __r._M_bits[0] |= __lsb._M_sanitized()._M_bits[0];
+ return __r;
+ }
+ else
+ __assert_unreachable<_Rp>();
+ }
+
+ // Return a new _BitMask with size _NewSize while dropping _DropLsb least
+ // significant bits. If the operation implicitly produces a sanitized bitmask,
+ // the result type will have _Sanitized set.
+ template <size_t _DropLsb, size_t _NewSize = _Np - _DropLsb>
+ constexpr auto _M_extract() const noexcept
+ {
+ static_assert(_Np > _DropLsb);
+ static_assert(_DropLsb + _NewSize <= sizeof(_ULLong) * CHAR_BIT,
+ "not implemented for bitmasks larger than one ullong");
+ if constexpr (_NewSize == 1) // must sanitize because the return _Tp is bool
+ return _SanitizedBitMask<1>{
+ {static_cast<bool>(_M_bits[0] & (_Tp(1) << _DropLsb))}};
+ else
+ return _BitMask<_NewSize,
+ ((_NewSize + _DropLsb == sizeof(_Tp) * CHAR_BIT
+ && _NewSize + _DropLsb <= _Np)
+ || ((_Sanitized || _Np == sizeof(_Tp) * CHAR_BIT)
+ && _NewSize + _DropLsb >= _Np))>(_M_bits[0]
+ >> _DropLsb);
+ }
+
+ // True if all bits are set. Implicitly sanitizes if _Sanitized == false.
+ constexpr bool all() const noexcept
+ {
+ if constexpr (_Np == 1)
+ return _M_bits[0];
+ else if constexpr (!_Sanitized)
+ return _M_sanitized().all();
+ else
+ {
+ constexpr _Tp __allbits = ~_Tp();
+ for (int __i = 0; __i < _S_array_size - 1; ++__i)
+ if (_M_bits[__i] != __allbits)
+ return false;
+ return _M_bits[_S_array_size - 1] == _S_bitmask;
+ }
+ }
+
+ // True if at least one bit is set. Implicitly sanitizes if _Sanitized ==
+ // false.
+ constexpr bool any() const noexcept
+ {
+ if constexpr (_Np == 1)
+ return _M_bits[0];
+ else if constexpr (!_Sanitized)
+ return _M_sanitized().any();
+ else
+ {
+ for (int __i = 0; __i < _S_array_size - 1; ++__i)
+ if (_M_bits[__i] != 0)
+ return true;
+ return _M_bits[_S_array_size - 1] != 0;
+ }
+ }
+
+ // True if no bit is set. Implicitly sanitizes if _Sanitized == false.
+ constexpr bool none() const noexcept
+ {
+ if constexpr (_Np == 1)
+ return !_M_bits[0];
+ else if constexpr (!_Sanitized)
+ return _M_sanitized().none();
+ else
+ {
+ for (int __i = 0; __i < _S_array_size - 1; ++__i)
+ if (_M_bits[__i] != 0)
+ return false;
+ return _M_bits[_S_array_size - 1] == 0;
+ }
+ }
+
+ // Returns the number of set bits. Implicitly sanitizes if _Sanitized ==
+ // false.
+ constexpr int count() const noexcept
+ {
+ if constexpr (_Np == 1)
+ return _M_bits[0];
+ else if constexpr (!_Sanitized)
+ return _M_sanitized().count();
+ else
+ {
+ int __result = __builtin_popcountll(_M_bits[0]);
+ for (int __i = 1; __i < _S_array_size; ++__i)
+ __result += __builtin_popcountll(_M_bits[__i]);
+ return __result;
+ }
+ }
+
+ // Returns the bit at offset __i as bool.
+ constexpr bool operator[](size_t __i) const noexcept
+ {
+ if constexpr (_Np == 1)
+ return _M_bits[0];
+ else if constexpr (_S_array_size == 1)
+ return (_M_bits[0] >> __i) & 1;
+ else
+ {
+ const size_t __j = __i / (sizeof(_Tp) * CHAR_BIT);
+ const size_t __shift = __i % (sizeof(_Tp) * CHAR_BIT);
+ return (_M_bits[__j] >> __shift) & 1;
+ }
+ }
+ template <size_t __i>
+ constexpr bool operator[](_SizeConstant<__i>) const noexcept
+ {
+ static_assert(__i < _Np);
+ constexpr size_t __j = __i / (sizeof(_Tp) * CHAR_BIT);
+ constexpr size_t __shift = __i % (sizeof(_Tp) * CHAR_BIT);
+ return static_cast<bool>(_M_bits[__j] & (_Tp(1) << __shift));
+ }
+
+ // Set the bit at offset __i to __x.
+ constexpr void set(size_t __i, bool __x) noexcept
+ {
+ if constexpr (_Np == 1)
+ _M_bits[0] = __x;
+ else if constexpr (_S_array_size == 1)
+ {
+ _M_bits[0] &= ~_Tp(_Tp(1) << __i);
+ _M_bits[0] |= _Tp(_Tp(__x) << __i);
+ }
+ else
+ {
+ const size_t __j = __i / (sizeof(_Tp) * CHAR_BIT);
+ const size_t __shift = __i % (sizeof(_Tp) * CHAR_BIT);
+ _M_bits[__j] &= ~_Tp(_Tp(1) << __shift);
+ _M_bits[__j] |= _Tp(_Tp(__x) << __shift);
+ }
+ }
+ template <size_t __i>
+ constexpr void set(_SizeConstant<__i>, bool __x) noexcept
+ {
+ static_assert(__i < _Np);
+ if constexpr (_Np == 1)
+ _M_bits[0] = __x;
+ else
+ {
+ constexpr size_t __j = __i / (sizeof(_Tp) * CHAR_BIT);
+ constexpr size_t __shift = __i % (sizeof(_Tp) * CHAR_BIT);
+ constexpr _Tp __mask = ~_Tp(_Tp(1) << __shift);
+ _M_bits[__j] &= __mask;
+ _M_bits[__j] |= _Tp(_Tp(__x) << __shift);
+ }
+ }
+
+ // Inverts all bits. Sanitized input leads to sanitized output.
+ constexpr _BitMask operator~() const noexcept
+ {
+ if constexpr (_Np == 1)
+ return !_M_bits[0];
+ else
+ {
+ _BitMask __result{};
+ for (int __i = 0; __i < _S_array_size - 1; ++__i)
+ __result._M_bits[__i] = ~_M_bits[__i];
+ if constexpr (_Sanitized)
+ __result._M_bits[_S_array_size - 1]
+ = _M_bits[_S_array_size - 1] ^ _S_bitmask;
+ else
+ __result._M_bits[_S_array_size - 1] = ~_M_bits[_S_array_size - 1];
+ return __result;
+ }
+ }
+
+ constexpr _BitMask& operator^=(const _BitMask& __b) & noexcept
+ {
+ __execute_n_times<_S_array_size>(
+ [&](auto __i) { _M_bits[__i] ^= __b._M_bits[__i]; });
+ return *this;
+ }
+ constexpr _BitMask& operator|=(const _BitMask& __b) & noexcept
+ {
+ __execute_n_times<_S_array_size>(
+ [&](auto __i) { _M_bits[__i] |= __b._M_bits[__i]; });
+ return *this;
+ }
+ constexpr _BitMask& operator&=(const _BitMask& __b) & noexcept
+ {
+ __execute_n_times<_S_array_size>(
+ [&](auto __i) { _M_bits[__i] &= __b._M_bits[__i]; });
+ return *this;
+ }
+ friend constexpr _BitMask operator^(const _BitMask& __a,
+ const _BitMask& __b) noexcept
+ {
+ _BitMask __r = __a;
+ __r ^= __b;
+ return __r;
+ }
+ friend constexpr _BitMask operator|(const _BitMask& __a,
+ const _BitMask& __b) noexcept
+ {
+ _BitMask __r = __a;
+ __r |= __b;
+ return __r;
+ }
+ friend constexpr _BitMask operator&(const _BitMask& __a,
+ const _BitMask& __b) noexcept
+ {
+ _BitMask __r = __a;
+ __r &= __b;
+ return __r;
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC
+ constexpr bool _M_is_constprop() const
+ {
+ if constexpr (_S_array_size == 1)
+ return __builtin_constant_p(_M_bits[0]);
+ else
+ {
+ for (int __i = 0; __i < _S_array_size; ++__i)
+ if (!__builtin_constant_p(_M_bits[__i]))
+ return false;
+ return true;
+ }
+ }
+};
+
+// }}}
+
+// vvv ---- builtin vector types [[gnu::vector_size(N)]] and operations ---- vvv
+// __min_vector_size {{{
+template <typename _Tp = void>
+static inline constexpr int __min_vector_size = 2 * sizeof(_Tp);
+#if _GLIBCXX_SIMD_HAVE_NEON
+template <> inline constexpr int __min_vector_size<void> = 8;
+#else
+template <> inline constexpr int __min_vector_size<void> = 16;
+#endif
+
+// }}}
+// __vector_type {{{
+template <typename _Tp, size_t _Np, typename = void> struct __vector_type_n
+{
+};
+
+// substitution failure for 0-element case
+template <typename _Tp> struct __vector_type_n<_Tp, 0, void>
+{
+};
+
+// special case 1-element to be _Tp itself
+template <typename _Tp>
+struct __vector_type_n<_Tp, 1, enable_if_t<__is_vectorizable_v<_Tp>>>
+{
+ using type = _Tp;
+};
+
+// else, use GNU-style builtin vector types
+template <typename _Tp, size_t _Np>
+struct __vector_type_n<_Tp, _Np,
+ enable_if_t<__is_vectorizable_v<_Tp> && _Np >= 2>>
+{
+ static constexpr size_t _Bytes = _Np * sizeof(_Tp) < __min_vector_size<_Tp>
+ ? __min_vector_size<_Tp>
+ : __next_power_of_2(_Np * sizeof(_Tp));
+ using type [[__gnu__::__vector_size__(_Bytes)]] = _Tp;
+};
+
+template <typename _Tp, size_t _Bytes, size_t = _Bytes % sizeof(_Tp)>
+struct __vector_type;
+
+template <typename _Tp, size_t _Bytes>
+struct __vector_type<_Tp, _Bytes, 0>
+ : __vector_type_n<_Tp, _Bytes / sizeof(_Tp)>
+{
+};
+
+template <typename _Tp, size_t _Size>
+using __vector_type_t = typename __vector_type_n<_Tp, _Size>::type;
+template <typename _Tp>
+using __vector_type2_t = typename __vector_type<_Tp, 2>::type;
+template <typename _Tp>
+using __vector_type4_t = typename __vector_type<_Tp, 4>::type;
+template <typename _Tp>
+using __vector_type8_t = typename __vector_type<_Tp, 8>::type;
+template <typename _Tp>
+using __vector_type16_t = typename __vector_type<_Tp, 16>::type;
+template <typename _Tp>
+using __vector_type32_t = typename __vector_type<_Tp, 32>::type;
+template <typename _Tp>
+using __vector_type64_t = typename __vector_type<_Tp, 64>::type;
+template <typename _Tp>
+using __vector_type128_t = typename __vector_type<_Tp, 128>::type;
+
+// }}}
+// __is_vector_type {{{
+template <typename _Tp, typename = std::void_t<>>
+struct __is_vector_type : false_type
+{
+};
+template <typename _Tp>
+struct __is_vector_type<
+ _Tp, std::void_t<typename __vector_type<decltype(std::declval<_Tp>()[0]),
+ sizeof(_Tp)>::type>>
+ : std::is_same<_Tp, typename __vector_type<decltype(std::declval<_Tp>()[0]),
+ sizeof(_Tp)>::type>
+{
+};
+
+template <typename _Tp>
+inline constexpr bool __is_vector_type_v = __is_vector_type<_Tp>::value;
+
+// }}}
+// _VectorTraits{{{
+template <typename _Tp, typename = std::void_t<>> struct _VectorTraitsImpl;
+template <typename _Tp>
+struct _VectorTraitsImpl<_Tp, enable_if_t<__is_vector_type_v<_Tp>>>
+{
+ using type = _Tp;
+ using value_type = decltype(std::declval<_Tp>()[0]);
+ static constexpr int _S_width = sizeof(_Tp) / sizeof(value_type);
+ using _Wrapper = _SimdWrapper<value_type, _S_width>;
+ template <typename _Up, int _W = _S_width>
+ static constexpr bool __is = std::is_same_v<value_type, _Up> && _W == _S_width;
+};
+template <typename _Tp, size_t _Np>
+struct _VectorTraitsImpl<_SimdWrapper<_Tp, _Np>,
+ std::void_t<__vector_type_t<_Tp, _Np>>>
+{
+ using type = __vector_type_t<_Tp, _Np>;
+ using value_type = _Tp;
+ static constexpr int _S_width = sizeof(type) / sizeof(value_type);
+ using _Wrapper = _SimdWrapper<_Tp, _Np>;
+ static constexpr bool _S_is_partial = (_Np != _S_width);
+ static constexpr int _S_partial_width = _Np;
+ template <typename _Up, int _W = _S_width>
+ static constexpr bool __is = std::is_same_v<value_type, _Up> && _W == _S_width;
+};
+
+template <typename _Tp, typename = typename _VectorTraitsImpl<_Tp>::type>
+using _VectorTraits = _VectorTraitsImpl<_Tp>;
+
+// }}}
+// __as_vector{{{
+template <typename _V>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__as_vector(_V __x)
+{
+ if constexpr (__is_vector_type_v<_V>)
+ return __x;
+ else if constexpr (is_simd<_V>::value || is_simd_mask<_V>::value)
+ return __data(__x)._M_data;
+ else if constexpr (__is_vectorizable_v<_V>)
+ return __vector_type_t<_V, 2>{__x};
+ else
+ return __x._M_data;
+}
+
+// }}}
+// __as_wrapper{{{
+template <typename _V>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__as_wrapper(_V __x)
+{
+ if constexpr (__is_vector_type_v<_V>)
+ return _SimdWrapper<typename _VectorTraits<_V>::value_type,
+ _VectorTraits<_V>::_S_width>(__x);
+ else if constexpr (is_simd<_V>::value || is_simd_mask<_V>::value)
+ return __data(__x);
+ else
+ return __x;
+}
+
+// }}}
+// __intrin_bitcast{{{
+template <typename _To, typename _From>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__intrin_bitcast(_From __v)
+{
+ static_assert(__is_vector_type_v<_From> && __is_vector_type_v<_To>);
+ if constexpr (sizeof(_To) == sizeof(_From))
+ return reinterpret_cast<_To>(__v);
+ else if constexpr (sizeof(_From) > sizeof(_To))
+ if constexpr (sizeof(_To) >= 16)
+ return reinterpret_cast<const __may_alias<_To>&>(__v);
+ else
+ {
+ _To __r;
+ __builtin_memcpy(&__r, &__v, sizeof(_To));
+ return __r;
+ }
+#if _GLIBCXX_SIMD_X86INTRIN
+ else if constexpr (__have_avx && sizeof(_From) == 16 && sizeof(_To) == 32)
+ return reinterpret_cast<_To>(__builtin_ia32_ps256_ps(
+ reinterpret_cast<__vector_type_t<float, 4>>(__v)));
+ else if constexpr (__have_avx512f && sizeof(_From) == 16 && sizeof(_To) == 64)
+ return reinterpret_cast<_To>(__builtin_ia32_ps512_ps(
+ reinterpret_cast<__vector_type_t<float, 4>>(__v)));
+ else if constexpr (__have_avx512f && sizeof(_From) == 32 && sizeof(_To) == 64)
+ return reinterpret_cast<_To>(__builtin_ia32_ps512_256ps(
+ reinterpret_cast<__vector_type_t<float, 8>>(__v)));
+#endif // _GLIBCXX_SIMD_X86INTRIN
+ else if constexpr (sizeof(__v) <= 8)
+ return reinterpret_cast<_To>(
+ __vector_type_t<__int_for_sizeof_t<_From>, sizeof(_To) / sizeof(_From)>{
+ reinterpret_cast<__int_for_sizeof_t<_From>>(__v)});
+ else
+ {
+ static_assert(sizeof(_To) > sizeof(_From));
+ _To __r = {};
+ __builtin_memcpy(&__r, &__v, sizeof(_From));
+ return __r;
+ }
+}
+
+// }}}
+// __vector_bitcast{{{
+template <typename _To, size_t _NN = 0, typename _From,
+ typename _FromVT = _VectorTraits<_From>,
+ size_t _Np = _NN == 0 ? sizeof(_From) / sizeof(_To) : _NN>
+_GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_To, _Np>
+__vector_bitcast(_From __x)
+{
+ using _R = __vector_type_t<_To, _Np>;
+ return __intrin_bitcast<_R>(__x);
+}
+template <typename _To, size_t _NN = 0, typename _Tp, size_t _Nx,
+ size_t _Np
+ = _NN == 0 ? sizeof(_SimdWrapper<_Tp, _Nx>) / sizeof(_To) : _NN>
+_GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_To, _Np>
+__vector_bitcast(const _SimdWrapper<_Tp, _Nx>& __x)
+{
+ static_assert(_Np > 1);
+ return __intrin_bitcast<__vector_type_t<_To, _Np>>(__x._M_data);
+}
+
+// }}}
+// __convert_x86 declarations {{{
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR85048
+template <typename _To, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_To __convert_x86(_Tp);
+
+template <typename _To, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_To __convert_x86(_Tp, _Tp);
+
+template <typename _To, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_To __convert_x86(_Tp, _Tp, _Tp, _Tp);
+
+template <typename _To, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_To __convert_x86(_Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp);
+
+template <typename _To, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_To __convert_x86(_Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp, _Tp,
+ _Tp, _Tp, _Tp, _Tp);
+#endif // _GLIBCXX_SIMD_WORKAROUND_PR85048
+
+//}}}
+// __to_intrin {{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>,
+ typename _R
+ = __intrinsic_type_t<typename _TVT::value_type, _TVT::_S_width>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _R
+__to_intrin(_Tp __x)
+{
+ static_assert(sizeof(__x) <= sizeof(_R),
+ "__to_intrin may never drop values off the end");
+ if constexpr (sizeof(__x) == sizeof(_R))
+ return reinterpret_cast<_R>(__as_vector(__x));
+ else
+ {
+ using _Up = __int_for_sizeof_t<_Tp>;
+ return reinterpret_cast<_R>(
+ __vector_type_t<_Up, sizeof(_R) / sizeof(_Up)>{__bit_cast<_Up>(__x)});
+ }
+}
+
+// }}}
+// __make_vector{{{
+template <typename _Tp, typename... _Args>
+_GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_Tp, sizeof...(_Args)>
+__make_vector(const _Args&... __args)
+{
+ return __vector_type_t<_Tp, sizeof...(_Args)>{static_cast<_Tp>(__args)...};
+}
+
+// }}}
+// __vector_broadcast{{{
+template <size_t _Np, typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_Tp, _Np>
+__vector_broadcast(_Tp __x)
+{
+ return __call_with_n_evaluations<_Np>(
+ [](auto... __xx) { return __vector_type_t<_Tp, _Np>{__xx...}; },
+ [&__x](int) { return __x; });
+}
+
+// }}}
+// __generate_vector{{{
+template <typename _Tp, size_t _Np, typename _Gp, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_Tp, _Np>
+__generate_vector_impl(_Gp&& __gen, std::index_sequence<_I...>)
+{
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR89229
+ // Compiling with -S -fverbose-asm identified this function as the place
+ // where the invalid instruction was produced. An otherwise arbitrary memory
+ // clobber inhibits the optimizer here and thus avoids the problem.
+ if constexpr (__have_avx512f && !__have_avx512vl && sizeof(_Tp) == 8
+ && std::is_integral_v<_Tp>)
+ if (!__builtin_is_constant_evaluated())
+ [] { asm("" ::: "memory"); }();
+#endif
+ return __vector_type_t<_Tp, _Np>{
+ static_cast<_Tp>(__gen(_SizeConstant<_I>()))...};
+}
+
+template <typename _V, typename _VVT = _VectorTraits<_V>, typename _Gp>
+_GLIBCXX_SIMD_INTRINSIC constexpr _V
+__generate_vector(_Gp&& __gen)
+{
+ if constexpr (__is_vector_type_v<_V>)
+ return __generate_vector_impl<typename _VVT::value_type, _VVT::_S_width>(
+ static_cast<_Gp&&>(__gen), std::make_index_sequence<_VVT::_S_width>());
+ else
+ return __generate_vector_impl<typename _VVT::value_type,
+ _VVT::_S_partial_width>(
+ static_cast<_Gp&&>(__gen),
+ std::make_index_sequence<_VVT::_S_partial_width>());
+}
+
+template <typename _Tp, size_t _Np, typename _Gp>
+_GLIBCXX_SIMD_INTRINSIC constexpr __vector_type_t<_Tp, _Np>
+__generate_vector(_Gp&& __gen)
+{
+ return __generate_vector_impl<_Tp, _Np>(static_cast<_Gp&&>(__gen),
+ std::make_index_sequence<_Np>());
+}
+
+// }}}
+// __xor{{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>, typename... _Dummy>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__xor(_Tp __a, typename _TVT::type __b, _Dummy...) noexcept
+{
+ static_assert(sizeof...(_Dummy) == 0);
+ using _Up = typename _TVT::value_type;
+ using _Ip = make_unsigned_t<__int_for_sizeof_t<_Up>>;
+ return __vector_bitcast<_Up>(__vector_bitcast<_Ip>(__a)
+ ^ __vector_bitcast<_Ip>(__b));
+}
+
+template <typename _Tp, typename = decltype(_Tp() ^ _Tp())>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__xor(_Tp __a, _Tp __b) noexcept
+{
+ return __a ^ __b;
+}
+
+// }}}
+// __or{{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>, typename... _Dummy>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__or(_Tp __a, typename _TVT::type __b, _Dummy...) noexcept
+{
+ static_assert(sizeof...(_Dummy) == 0);
+ using _Up = typename _TVT::value_type;
+ using _Ip = make_unsigned_t<__int_for_sizeof_t<_Up>>;
+ return __vector_bitcast<_Up>(__vector_bitcast<_Ip>(__a)
+ | __vector_bitcast<_Ip>(__b));
+}
+
+template <typename _Tp, typename = decltype(_Tp() | _Tp())>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__or(_Tp __a, _Tp __b) noexcept
+{
+ return __a | __b;
+}
+
+// }}}
+// __and{{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>, typename... _Dummy>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__and(_Tp __a, typename _TVT::type __b, _Dummy...) noexcept
+{
+ static_assert(sizeof...(_Dummy) == 0);
+ using _Up = typename _TVT::value_type;
+ using _Ip = make_unsigned_t<__int_for_sizeof_t<_Up>>;
+ return __vector_bitcast<_Up>(__vector_bitcast<_Ip>(__a)
+ & __vector_bitcast<_Ip>(__b));
+}
+
+template <typename _Tp, typename = decltype(_Tp() & _Tp())>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__and(_Tp __a, _Tp __b) noexcept
+{
+ return __a & __b;
+}
+
+// }}}
+// __andnot{{{
+#if _GLIBCXX_SIMD_X86INTRIN && !defined __clang__
+static constexpr struct
+{
+ _GLIBCXX_SIMD_INTRINSIC __v4sf operator()(__v4sf __a,
+ __v4sf __b) const noexcept
+ {
+ return __builtin_ia32_andnps(__a, __b);
+ }
+ _GLIBCXX_SIMD_INTRINSIC __v2df operator()(__v2df __a,
+ __v2df __b) const noexcept
+ {
+ return __builtin_ia32_andnpd(__a, __b);
+ }
+ _GLIBCXX_SIMD_INTRINSIC __v2di operator()(__v2di __a,
+ __v2di __b) const noexcept
+ {
+ return __builtin_ia32_pandn128(__a, __b);
+ }
+ _GLIBCXX_SIMD_INTRINSIC __v8sf operator()(__v8sf __a,
+ __v8sf __b) const noexcept
+ {
+ return __builtin_ia32_andnps256(__a, __b);
+ }
+ _GLIBCXX_SIMD_INTRINSIC __v4df operator()(__v4df __a,
+ __v4df __b) const noexcept
+ {
+ return __builtin_ia32_andnpd256(__a, __b);
+ }
+ _GLIBCXX_SIMD_INTRINSIC __v4di operator()(__v4di __a,
+ __v4di __b) const noexcept
+ {
+ return __builtin_ia32_andnotsi256(__a, __b);
+ }
+ _GLIBCXX_SIMD_INTRINSIC __v16sf operator()(__v16sf __a,
+ __v16sf __b) const noexcept
+ {
+ if constexpr (__have_avx512dq)
+ return _mm512_andnot_ps(__a, __b);
+ else
+ return reinterpret_cast<__v16sf>(
+ _mm512_andnot_si512(reinterpret_cast<__v8di>(__a),
+ reinterpret_cast<__v8di>(__b)));
+ }
+ _GLIBCXX_SIMD_INTRINSIC __v8df operator()(__v8df __a,
+ __v8df __b) const noexcept
+ {
+ if constexpr (__have_avx512dq)
+ return _mm512_andnot_pd(__a, __b);
+ else
+ return reinterpret_cast<__v8df>(
+ _mm512_andnot_si512(reinterpret_cast<__v8di>(__a),
+ reinterpret_cast<__v8di>(__b)));
+ }
+ _GLIBCXX_SIMD_INTRINSIC __v8di operator()(__v8di __a,
+ __v8di __b) const noexcept
+ {
+ return _mm512_andnot_si512(__a, __b);
+ }
+} _S_x86_andnot;
+#endif // _GLIBCXX_SIMD_X86INTRIN && !__clang__
+
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>, typename... _Dummy>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__andnot(_Tp __a, typename _TVT::type __b, _Dummy...) noexcept
+{
+ static_assert(sizeof...(_Dummy) == 0);
+#if _GLIBCXX_SIMD_X86INTRIN && !defined __clang__
+ if constexpr (sizeof(_Tp) >= 16)
+ {
+ const auto __ai = __to_intrin(__a);
+ const auto __bi = __to_intrin(__b);
+ if (!__builtin_is_constant_evaluated()
+ && !(__builtin_constant_p(__ai) && __builtin_constant_p(__bi)))
+ {
+ const auto __r = _S_x86_andnot(__ai, __bi);
+ if constexpr (is_convertible_v<decltype(__r), _Tp>)
+ return __r;
+ else
+ return reinterpret_cast<_Tp>(__r);
+ }
+ }
+#endif // _GLIBCXX_SIMD_X86INTRIN && !__clang__
+ using _Up = typename _TVT::value_type;
+ using _Ip = make_unsigned_t<__int_for_sizeof_t<_Up>>;
+ return __vector_bitcast<_Up>(~__vector_bitcast<_Ip>(__a)
+ & __vector_bitcast<_Ip>(__b));
+}
+
+template <typename _Tp, typename = decltype(~_Tp() & _Tp())>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__andnot(_Tp __a, _Tp __b) noexcept
+{
+ return ~__a & __b;
+}
+
+// }}}
+// __not{{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__not(_Tp __a) noexcept
+{
+ if constexpr (std::is_floating_point_v<typename _TVT::value_type>)
+ return reinterpret_cast<typename _TVT::type>(
+ ~__vector_bitcast<unsigned>(__a));
+ else
+ return ~__a;
+}
+
+// }}}
+// __concat{{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>,
+ typename _R
+ = __vector_type_t<typename _TVT::value_type, _TVT::_S_width * 2>>
+constexpr _R
+__concat(_Tp a_, _Tp b_)
+{
+#ifdef _GLIBCXX_SIMD_WORKAROUND_XXX_1
+ using _W
+ = std::conditional_t<std::is_floating_point_v<typename _TVT::value_type>,
+ double,
+ conditional_t<(sizeof(_Tp) >= 2 * sizeof(long long)),
+ long long, typename _TVT::value_type>>;
+ constexpr int input_width = sizeof(_Tp) / sizeof(_W);
+ const auto __a = __vector_bitcast<_W>(a_);
+ const auto __b = __vector_bitcast<_W>(b_);
+ using _Up = __vector_type_t<_W, sizeof(_R) / sizeof(_W)>;
+#else
+ constexpr int input_width = _TVT::_S_width;
+ const _Tp& __a = a_;
+ const _Tp& __b = b_;
+ using _Up = _R;
+#endif
+ if constexpr (input_width == 2)
+ return reinterpret_cast<_R>(_Up{__a[0], __a[1], __b[0], __b[1]});
+ else if constexpr (input_width == 4)
+ return reinterpret_cast<_R>(
+ _Up{__a[0], __a[1], __a[2], __a[3], __b[0], __b[1], __b[2], __b[3]});
+ else if constexpr (input_width == 8)
+ return reinterpret_cast<_R>(
+ _Up{__a[0], __a[1], __a[2], __a[3], __a[4], __a[5], __a[6], __a[7],
+ __b[0], __b[1], __b[2], __b[3], __b[4], __b[5], __b[6], __b[7]});
+ else if constexpr (input_width == 16)
+ return reinterpret_cast<_R>(
+ _Up{__a[0], __a[1], __a[2], __a[3], __a[4], __a[5], __a[6],
+ __a[7], __a[8], __a[9], __a[10], __a[11], __a[12], __a[13],
+ __a[14], __a[15], __b[0], __b[1], __b[2], __b[3], __b[4],
+ __b[5], __b[6], __b[7], __b[8], __b[9], __b[10], __b[11],
+ __b[12], __b[13], __b[14], __b[15]});
+ else if constexpr (input_width == 32)
+ return reinterpret_cast<_R>(_Up{
+ __a[0], __a[1], __a[2], __a[3], __a[4], __a[5], __a[6], __a[7],
+ __a[8], __a[9], __a[10], __a[11], __a[12], __a[13], __a[14], __a[15],
+ __a[16], __a[17], __a[18], __a[19], __a[20], __a[21], __a[22], __a[23],
+ __a[24], __a[25], __a[26], __a[27], __a[28], __a[29], __a[30], __a[31],
+ __b[0], __b[1], __b[2], __b[3], __b[4], __b[5], __b[6], __b[7],
+ __b[8], __b[9], __b[10], __b[11], __b[12], __b[13], __b[14], __b[15],
+ __b[16], __b[17], __b[18], __b[19], __b[20], __b[21], __b[22], __b[23],
+ __b[24], __b[25], __b[26], __b[27], __b[28], __b[29], __b[30], __b[31]});
+}
+
+// }}}
+// __zero_extend {{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+struct _ZeroExtendProxy
+{
+ using value_type = typename _TVT::value_type;
+ static constexpr size_t _Np = _TVT::_S_width;
+ const _Tp __x;
+
+ template <typename _To, typename _ToVT = _VectorTraits<_To>,
+ typename
+ = enable_if_t<is_same_v<typename _ToVT::value_type, value_type>>>
+ _GLIBCXX_SIMD_INTRINSIC operator _To() const
+ {
+ constexpr size_t _ToN = _ToVT::_S_width;
+ if constexpr (_ToN == _Np)
+ return __x;
+ else if constexpr (_ToN == 2 * _Np)
+ {
+#ifdef _GLIBCXX_SIMD_WORKAROUND_XXX_3
+ if constexpr (__have_avx && _TVT::template __is<float, 4>)
+ return __vector_bitcast<value_type>(
+ _mm256_insertf128_ps(__m256(), __x, 0));
+ else if constexpr (__have_avx && _TVT::template __is<double, 2>)
+ return __vector_bitcast<value_type>(
+ _mm256_insertf128_pd(__m256d(), __x, 0));
+ else if constexpr (__have_avx2 && _Np * sizeof(value_type) == 16)
+ return __vector_bitcast<value_type>(
+ _mm256_insertf128_si256(__m256i(), __to_intrin(__x), 0));
+ else if constexpr (__have_avx512f && _TVT::template __is<float, 8>)
+ {
+ if constexpr (__have_avx512dq)
+ return __vector_bitcast<value_type>(
+ _mm512_insertf32x8(__m512(), __x, 0));
+ else
+ return reinterpret_cast<__m512>(
+ _mm512_insertf64x4(__m512d(), reinterpret_cast<__m256d>(__x),
+ 0));
+ }
+ else if constexpr (__have_avx512f && _TVT::template __is<double, 4>)
+ return __vector_bitcast<value_type>(
+ _mm512_insertf64x4(__m512d(), __x, 0));
+ else if constexpr (__have_avx512f && _Np * sizeof(value_type) == 32)
+ return __vector_bitcast<value_type>(
+ _mm512_inserti64x4(__m512i(), __to_intrin(__x), 0));
+#endif
+ return __concat(__x, _Tp());
+ }
+ else if constexpr (_ToN == 4 * _Np)
+ {
+#ifdef _GLIBCXX_SIMD_WORKAROUND_XXX_3
+ if constexpr (__have_avx512dq && _TVT::template __is<double, 2>)
+ {
+ return __vector_bitcast<value_type>(
+ _mm512_insertf64x2(__m512d(), __x, 0));
+ }
+ else if constexpr (__have_avx512f
+ && std::is_floating_point_v<value_type>)
+ {
+ return __vector_bitcast<value_type>(
+ _mm512_insertf32x4(__m512(), reinterpret_cast<__m128>(__x), 0));
+ }
+ else if constexpr (__have_avx512f && _Np * sizeof(value_type) == 16)
+ {
+ return __vector_bitcast<value_type>(
+ _mm512_inserti32x4(__m512i(), __to_intrin(__x), 0));
+ }
+#endif
+ return __concat(__concat(__x, _Tp()),
+ __vector_type_t<value_type, _Np * 2>());
+ }
+ else if constexpr (_ToN == 8 * _Np)
+ return __concat(operator __vector_type_t<value_type, _Np * 4>(),
+ __vector_type_t<value_type, _Np * 4>());
+ else if constexpr (_ToN == 16 * _Np)
+ return __concat(operator __vector_type_t<value_type, _Np * 8>(),
+ __vector_type_t<value_type, _Np * 8>());
+ else
+ __assert_unreachable<_Tp>();
+ }
+};
+
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _ZeroExtendProxy<_Tp, _TVT>
+__zero_extend(_Tp __x)
+{
+ return {__x};
+}
+
+// }}}
+// __extract<_Np, By>{{{
+template <
+ int _Offset, int _SplitBy, typename _Tp, typename _TVT = _VectorTraits<_Tp>,
+ typename _R
+ = __vector_type_t<typename _TVT::value_type, _TVT::_S_width / _SplitBy>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _R
+__extract(_Tp __in)
+{
+ using value_type = typename _TVT::value_type;
+#if _GLIBCXX_SIMD_X86INTRIN // {{{
+ if constexpr (sizeof(_Tp) == 64 && _SplitBy == 4 && _Offset > 0)
+ {
+ if constexpr (__have_avx512dq && std::is_same_v<double, value_type>)
+ return _mm512_extractf64x2_pd(__to_intrin(__in), _Offset);
+ else if constexpr (std::is_floating_point_v<value_type>)
+ return __vector_bitcast<value_type>(
+ _mm512_extractf32x4_ps(__intrin_bitcast<__m512>(__in), _Offset));
+ else
+ return reinterpret_cast<_R>(
+ _mm512_extracti32x4_epi32(__intrin_bitcast<__m512i>(__in), _Offset));
+ }
+ else
+#endif // _GLIBCXX_SIMD_X86INTRIN }}}
+ {
+#ifdef _GLIBCXX_SIMD_WORKAROUND_XXX_1
+ using _W = std::conditional_t<
+ std::is_floating_point_v<value_type>, double,
+ std::conditional_t<(sizeof(_R) >= 16), long long, value_type>>;
+ static_assert(sizeof(_R) % sizeof(_W) == 0);
+ constexpr int __return_width = sizeof(_R) / sizeof(_W);
+ using _Up = __vector_type_t<_W, __return_width>;
+ const auto __x = __vector_bitcast<_W>(__in);
+#else
+ constexpr int __return_width = _TVT::_S_width / _SplitBy;
+ using _Up = _R;
+ const __vector_type_t<value_type, _TVT::_S_width>& __x
+ = __in; // only needed for _Tp = _SimdWrapper<value_type, _Np>
+#endif
+ constexpr int _O = _Offset * __return_width;
+ return __call_with_subscripts<__return_width, _O>(
+ __x, [](auto... __entries) {
+ return reinterpret_cast<_R>(_Up{__entries...});
+ });
+ }
+}
+
+// }}}
+// __lo/__hi64[z]{{{
+template <typename _Tp,
+ typename _R
+ = __vector_type8_t<typename _VectorTraits<_Tp>::value_type>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _R
+__lo64(_Tp __x)
+{
+ _R __r{};
+ __builtin_memcpy(&__r, &__x, 8);
+ return __r;
+}
+
+template <typename _Tp,
+ typename _R
+ = __vector_type8_t<typename _VectorTraits<_Tp>::value_type>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _R
+__hi64(_Tp __x)
+{
+ static_assert(sizeof(_Tp) == 16, "use __hi64z if you meant it");
+ _R __r{};
+ __builtin_memcpy(&__r, reinterpret_cast<const char*>(&__x) + 8, 8);
+ return __r;
+}
+
+template <typename _Tp,
+ typename _R
+ = __vector_type8_t<typename _VectorTraits<_Tp>::value_type>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _R
+__hi64z([[maybe_unused]] _Tp __x)
+{
+ _R __r{};
+ if constexpr (sizeof(_Tp) == 16)
+ __builtin_memcpy(&__r, reinterpret_cast<const char*>(&__x) + 8, 8);
+ return __r;
+}
+
+// }}}
+// __lo/__hi128{{{
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__lo128(_Tp __x)
+{
+ return __extract<0, sizeof(_Tp) / 16>(__x);
+}
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__hi128(_Tp __x)
+{
+ static_assert(sizeof(__x) == 32);
+ return __extract<1, 2>(__x);
+}
+
+// }}}
+// __lo/__hi256{{{
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__lo256(_Tp __x)
+{
+ static_assert(sizeof(__x) == 64);
+ return __extract<0, 2>(__x);
+}
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__hi256(_Tp __x)
+{
+ static_assert(sizeof(__x) == 64);
+ return __extract<1, 2>(__x);
+}
+
+// }}}
+// __auto_bitcast{{{
+template <typename _Tp> struct _AutoCast
+{
+ static_assert(__is_vector_type_v<_Tp>);
+ const _Tp __x;
+ template <typename _Up, typename _UVT = _VectorTraits<_Up>>
+ _GLIBCXX_SIMD_INTRINSIC constexpr operator _Up() const
+ {
+ return __intrin_bitcast<typename _UVT::type>(__x);
+ }
+};
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr _AutoCast<_Tp>
+__auto_bitcast(const _Tp& __x)
+{
+ return {__x};
+}
+template <typename _Tp, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC constexpr _AutoCast<
+ typename _SimdWrapper<_Tp, _Np>::_BuiltinType>
+__auto_bitcast(const _SimdWrapper<_Tp, _Np>& __x)
+{
+ return {__x._M_data};
+}
+
+// }}}
+// ^^^ ---- builtin vector types [[gnu::vector_size(N)]] and operations ---- ^^^
+
+#if _GLIBCXX_SIMD_HAVE_SSE_ABI
+// __bool_storage_member_type{{{
+#if _GLIBCXX_SIMD_HAVE_AVX512F && _GLIBCXX_SIMD_X86INTRIN
+template <size_t _Size> struct __bool_storage_member_type
+{
+ static_assert((_Size & (_Size - 1)) != 0,
+ "This trait may only be used for non-power-of-2 sizes. "
+ "Power-of-2 sizes must be specialized.");
+ using type =
+ typename __bool_storage_member_type<__next_power_of_2(_Size)>::type;
+};
+template <> struct __bool_storage_member_type<1>
+{
+ using type = bool;
+};
+template <> struct __bool_storage_member_type<2>
+{
+ using type = __mmask8;
+};
+template <> struct __bool_storage_member_type<4>
+{
+ using type = __mmask8;
+};
+template <> struct __bool_storage_member_type<8>
+{
+ using type = __mmask8;
+};
+template <> struct __bool_storage_member_type<16>
+{
+ using type = __mmask16;
+};
+template <> struct __bool_storage_member_type<32>
+{
+ using type = __mmask32;
+};
+template <> struct __bool_storage_member_type<64>
+{
+ using type = __mmask64;
+};
+#endif // _GLIBCXX_SIMD_HAVE_AVX512F
+
+// }}}
+// __intrinsic_type (x86){{{
+// the following excludes bool via __is_vectorizable
+#if _GLIBCXX_SIMD_HAVE_SSE
+template <typename _Tp, size_t _Bytes>
+struct __intrinsic_type<
+ _Tp, _Bytes, std::enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 64>>
+{
+ static_assert(!std::is_same_v<_Tp, long double>,
+ "no __intrinsic_type support for long double on x86");
+ static constexpr std::size_t _VBytes
+ = _Bytes <= 16 ? 16 : _Bytes <= 32 ? 32 : 64;
+ using type [[__gnu__::__vector_size__(_VBytes)]]
+ = std::conditional_t<std::is_integral_v<_Tp>, long long int, _Tp>;
+};
+#endif // _GLIBCXX_SIMD_HAVE_SSE
+
+// }}}
+#endif // _GLIBCXX_SIMD_HAVE_SSE_ABI
+// __intrinsic_type (ARM){{{
+#if _GLIBCXX_SIMD_HAVE_NEON
+#define _GLIBCXX_SIMD_NEON_INTRIN(_Tp) \
+ template <> \
+ struct __intrinsic_type<__remove_cvref_t<decltype(_Tp()[0])>, sizeof(_Tp), \
+ void> \
+ { \
+ using type = _Tp; \
+ }
+_GLIBCXX_SIMD_NEON_INTRIN(int8x8_t);
+_GLIBCXX_SIMD_NEON_INTRIN(int8x16_t);
+_GLIBCXX_SIMD_NEON_INTRIN(int16x4_t);
+_GLIBCXX_SIMD_NEON_INTRIN(int16x8_t);
+_GLIBCXX_SIMD_NEON_INTRIN(int32x2_t);
+_GLIBCXX_SIMD_NEON_INTRIN(int32x4_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint8x8_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint8x16_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint16x4_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint16x8_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint32x2_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint32x4_t);
+#if defined __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
+_GLIBCXX_SIMD_NEON_INTRIN(float16x4_t);
+_GLIBCXX_SIMD_NEON_INTRIN(float16x8_t);
+#endif
+_GLIBCXX_SIMD_NEON_INTRIN(float32x2_t);
+_GLIBCXX_SIMD_NEON_INTRIN(float32x4_t);
+#if defined __aarch64__
+_GLIBCXX_SIMD_NEON_INTRIN(int64x1_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint64x1_t);
+_GLIBCXX_SIMD_NEON_INTRIN(float64x1_t);
+_GLIBCXX_SIMD_NEON_INTRIN(float64x2_t);
+#endif
+_GLIBCXX_SIMD_NEON_INTRIN(int64x2_t);
+_GLIBCXX_SIMD_NEON_INTRIN(uint64x2_t);
+#undef _GLIBCXX_SIMD_NEON_INTRIN
+
+template <typename _Tp, size_t _Bytes>
+struct __intrinsic_type<_Tp, _Bytes,
+ enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 16>>
+{
+ static constexpr int _VBytes = _Bytes <= 8 ? 8 : 16;
+ using _Tmp = conditional_t<
+ sizeof(_Tp) == 1, __remove_cvref_t<decltype(int8x16_t()[0])>,
+ conditional_t<
+ sizeof(_Tp) == 2, short,
+ conditional_t<
+ sizeof(_Tp) == 4, int,
+ conditional_t<sizeof(_Tp) == 8,
+ __remove_cvref_t<decltype(int64x2_t()[0])>, void>>>>;
+ using _Up = conditional_t<
+ is_floating_point_v<_Tp>, _Tp,
+ conditional_t<is_unsigned_v<_Tp>, make_unsigned_t<_Tmp>, _Tmp>>;
+ using type = typename __intrinsic_type<_Up, _VBytes>::type;
+};
+#endif // _GLIBCXX_SIMD_HAVE_NEON
+
+// }}}
+// __intrinsic_type (PPC){{{
+#ifdef __ALTIVEC__
+template <typename _Tp> struct __intrinsic_type_impl;
+#define _GLIBCXX_SIMD_PPC_INTRIN(_Tp) \
+ template <> struct __intrinsic_type_impl<_Tp> \
+ { \
+ using type = __vector _Tp; \
+ }
+_GLIBCXX_SIMD_PPC_INTRIN(float);
+_GLIBCXX_SIMD_PPC_INTRIN(double);
+_GLIBCXX_SIMD_PPC_INTRIN(signed char);
+_GLIBCXX_SIMD_PPC_INTRIN(unsigned char);
+_GLIBCXX_SIMD_PPC_INTRIN(signed short);
+_GLIBCXX_SIMD_PPC_INTRIN(unsigned short);
+_GLIBCXX_SIMD_PPC_INTRIN(signed int);
+_GLIBCXX_SIMD_PPC_INTRIN(unsigned int);
+_GLIBCXX_SIMD_PPC_INTRIN(signed long long);
+_GLIBCXX_SIMD_PPC_INTRIN(unsigned long long);
+#undef _GLIBCXX_SIMD_PPC_INTRIN
+
+template <typename _Tp, size_t _Bytes>
+struct __intrinsic_type<
+ _Tp, _Bytes, std::enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 16>>
+{
+ static_assert(!std::is_same_v<_Tp, long double>,
+ "no __intrinsic_type support for long double on PPC");
+#ifndef __VSX__
+ static_assert(!std::is_same_v<_Tp, double>,
+ "no __intrinsic_type support for double on PPC w/o VSX");
+#endif
+#ifndef __POWER8_VECTOR__
+ static_assert(!(std::is_integral_v<_Tp> && sizeof(_Tp) > 4),
+ "no __intrinsic_type support for integers larger than 4 Bytes "
+ "on PPC w/o POWER8 vectors");
+#endif
+ using type = typename __intrinsic_type_impl<conditional_t<
+ is_floating_point_v<_Tp>, _Tp, __int_for_sizeof_t<_Tp>>>::type;
+};
+#endif // __ALTIVEC__
+
+// }}}
+// _SimdWrapper<bool>{{{1
+template <size_t _Width>
+struct _SimdWrapper<
+ bool, _Width, std::void_t<typename __bool_storage_member_type<_Width>::type>>
+{
+ using _BuiltinType = typename __bool_storage_member_type<_Width>::type;
+ using value_type = bool;
+ static constexpr size_t _S_width = sizeof(_BuiltinType) * CHAR_BIT;
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper<bool, _S_width>
+ __as_full_vector() const
+ {
+ return _M_data;
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper() = default;
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(_BuiltinType __k)
+ : _M_data(__k){}
+
+ _GLIBCXX_SIMD_INTRINSIC operator const _BuiltinType &() const
+ {
+ return _M_data;
+ }
+ _GLIBCXX_SIMD_INTRINSIC operator _BuiltinType&() { return _M_data; }
+
+ _GLIBCXX_SIMD_INTRINSIC _BuiltinType __intrin() const { return _M_data; }
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr value_type operator[](size_t __i) const
+ {
+ return _M_data & (_BuiltinType(1) << __i);
+ }
+ template <size_t __i>
+ _GLIBCXX_SIMD_INTRINSIC constexpr value_type
+ operator[](_SizeConstant<__i>) const
+ {
+ return _M_data & (_BuiltinType(1) << __i);
+ }
+ _GLIBCXX_SIMD_INTRINSIC constexpr void __set(size_t __i, value_type __x)
+ {
+ if (__x)
+ _M_data |= (_BuiltinType(1) << __i);
+ else
+ _M_data &= ~(_BuiltinType(1) << __i);
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC
+ constexpr bool _M_is_constprop() const
+ {
+ return __builtin_constant_p(_M_data);
+ }
+
+ _BuiltinType _M_data;
+};
+
+// _SimdWrapperBase{{{1
+template <bool> struct _SimdWrapperBase;
+
+template <> struct _SimdWrapperBase<true> // no padding or no SNaNs
+{
+};
+
+#ifdef __SUPPORT_SNAN__
+template <>
+struct _SimdWrapperBase<false> // with padding that needs to never become SNaN
+{
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapperBase() : _M_data() {}
+};
+#endif // __SUPPORT_SNAN__
+
+// }}}
+// _SimdWrapper{{{
+template <typename _Tp, size_t _Width>
+struct _SimdWrapper<
+ _Tp, _Width,
+ std::void_t<__vector_type_t<_Tp, _Width>, __intrinsic_type_t<_Tp, _Width>>>
+ : _SimdWrapperBase<
+#ifdef __SUPPORT_SNAN__
+ !std::numeric_limits<_Tp>::has_signaling_NaN
+ || sizeof(_Tp) * _Width == sizeof(__vector_type_t<_Tp, _Width>)
+#else
+ true
+#endif
+ >
+{
+ static_assert(__is_vectorizable_v<_Tp>);
+ static_assert(_Width >= 2); // 1 doesn't make sense, use _Tp directly then
+ using _BuiltinType = __vector_type_t<_Tp, _Width>;
+ using value_type = _Tp;
+ static constexpr size_t _S_width = sizeof(_BuiltinType) / sizeof(value_type);
+ static inline constexpr int __size = _Width;
+
+ _BuiltinType _M_data;
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper<_Tp, _S_width>
+ __as_full_vector() const
+ {
+ return _M_data;
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(
+ std::initializer_list<_Tp> __init)
+ : _M_data(__generate_from_n_evaluations<_Width, _BuiltinType>(
+ [&](auto __i) { return __init.begin()[__i.value]; }))
+ {}
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper() = default;
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(const _SimdWrapper&) = default;
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(_SimdWrapper&&) = default;
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper& operator=(const _SimdWrapper&)
+ = default;
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper& operator=(_SimdWrapper&&)
+ = default;
+
+ template <typename _V, typename = std::enable_if_t<std::disjunction_v<
+ is_same<_V, __vector_type_t<_Tp, _Width>>,
+ is_same<_V, __intrinsic_type_t<_Tp, _Width>>>>>
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(_V __x)
+ : _M_data(__vector_bitcast<_Tp, _Width>(
+ __x)) // __vector_bitcast can convert e.g. __m128 to __vector(2) float
+ {}
+
+ template <typename... _As,
+ typename
+ = enable_if_t<((std::is_same_v<simd_abi::scalar, _As> && ...)
+ && sizeof...(_As) <= _Width)>>
+ _GLIBCXX_SIMD_INTRINSIC constexpr operator _SimdTuple<_Tp, _As...>() const
+ {
+ const auto& dd = _M_data; // workaround for GCC7 ICE
+ return __generate_from_n_evaluations<sizeof...(_As),
+ _SimdTuple<_Tp, _As...>>([&](
+ auto __i) constexpr { return dd[int(__i)]; });
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr operator const _BuiltinType &() const
+ {
+ return _M_data;
+ }
+ _GLIBCXX_SIMD_INTRINSIC constexpr operator _BuiltinType&() { return _M_data; }
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr _Tp operator[](size_t __i) const
+ {
+ return _M_data[__i];
+ }
+ template <size_t __i>
+ _GLIBCXX_SIMD_INTRINSIC constexpr _Tp operator[](_SizeConstant<__i>) const
+ {
+ return _M_data[__i];
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr void __set(size_t __i, _Tp __x)
+ {
+ _M_data[__i] = __x;
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC
+ constexpr bool _M_is_constprop() const
+ {
+ return __builtin_constant_p(_M_data);
+ }
+};
+
+// }}}
+
+// __vectorized_sizeof {{{
+template <typename _Tp>
+constexpr size_t
+__vectorized_sizeof()
+{
+ if constexpr (!__is_vectorizable_v<_Tp>)
+ return 0;
+
+ if constexpr (sizeof(_Tp) <= 8)
+ {
+ // X86:
+ if constexpr (__have_avx512bw)
+ return 64;
+ if constexpr (__have_avx512f && sizeof(_Tp) >= 4)
+ return 64;
+ if constexpr (__have_avx2)
+ return 32;
+ if constexpr (__have_avx && std::is_floating_point_v<_Tp>)
+ return 32;
+ if constexpr (__have_sse2)
+ return 16;
+ if constexpr (__have_sse && std::is_same_v<_Tp, float>)
+ return 16;
+ if constexpr (__have_mmx && sizeof(_Tp) <= 4 && std::is_integral_v<_Tp>)
+ return 8;
+
+ // PowerPC:
+ if constexpr (__have_power8vec || (__have_power_vmx && (sizeof(_Tp) < 8))
+ || (__have_power_vsx && std::is_floating_point_v<_Tp>) )
+ return 16;
+
+ // ARM:
+ if constexpr (__have_neon_a64
+ || (__have_neon_a32 && !is_same_v<_Tp, double>) )
+ return 16;
+ if constexpr (__have_neon
+ && sizeof(_Tp) < 8
+ // Only allow fp if the user allows non-IEC559 fp (e.g. via
+ // -ffast-math). ARMv7 NEON fp is not conforming to IEC559.
+ && (__GCC_IEC_559 == 0 || !std::is_floating_point_v<_Tp>) )
+ return 16;
+ }
+
+ return sizeof(_Tp);
+}
+
+// }}}
+namespace simd_abi {
+// most of simd_abi is defined in simd_detail.h
+template <typename _Tp>
+inline constexpr int max_fixed_size
+ = (__have_avx512bw && sizeof(_Tp) == 1) ? 64 : 32;
+// compatible {{{
+#if defined __x86_64__ || defined __aarch64__
+template <typename _Tp>
+using compatible
+ = std::conditional_t<(sizeof(_Tp) <= 8), _VecBuiltin<16>, scalar>;
+#elif defined __ARM_NEON
+// FIXME: not sure, probably needs to be scalar (or dependent on the hard-float
+// ABI?)
+template <typename _Tp>
+using compatible
+ = std::conditional_t<(sizeof(_Tp) < 8), _VecBuiltin<16>, scalar>;
+#else
+template <typename> using compatible = scalar;
+#endif
+
+// }}}
+// native {{{
+template <typename _Tp>
+constexpr auto
+__determine_native_abi()
+{
+ constexpr size_t __bytes = __vectorized_sizeof<_Tp>();
+ if constexpr (__bytes == sizeof(_Tp))
+ return static_cast<scalar*>(nullptr);
+ else if constexpr (__have_avx512vl || (__have_avx512f && __bytes == 64))
+ return static_cast<_VecBltnBtmsk<__bytes>*>(nullptr);
+ else
+ return static_cast<_VecBuiltin<__bytes>*>(nullptr);
+}
+
+template <typename _Tp, typename = enable_if_t<__is_vectorizable_v<_Tp>>>
+using native = std::remove_pointer_t<decltype(__determine_native_abi<_Tp>())>;
+
+// }}}
+// __default_abi {{{
+#if defined _GLIBCXX_SIMD_DEFAULT_ABI
+template <typename _Tp> using __default_abi = _GLIBCXX_SIMD_DEFAULT_ABI<_Tp>;
+#else
+template <typename _Tp> using __default_abi = compatible<_Tp>;
+#endif
+
+// }}}
+} // namespace simd_abi
+
+// traits {{{1
+// is_abi_tag {{{2
+template <typename _Tp, typename = std::void_t<>> struct is_abi_tag : false_type
+{
+};
+template <typename _Tp>
+struct is_abi_tag<_Tp, std::void_t<typename _Tp::_IsValidAbiTag>>
+ : public _Tp::_IsValidAbiTag
+{
+};
+template <typename _Tp>
+inline constexpr bool is_abi_tag_v = is_abi_tag<_Tp>::value;
+
+// is_simd(_mask) {{{2
+template <typename _Tp> struct is_simd : public false_type
+{
+};
+template <typename _Tp> inline constexpr bool is_simd_v = is_simd<_Tp>::value;
+
+template <typename _Tp> struct is_simd_mask : public false_type
+{
+};
+template <typename _Tp>
+inline constexpr bool is_simd_mask_v = is_simd_mask<_Tp>::value;
+
+// simd_size {{{2
+template <typename _Tp, typename _Abi, typename = void> struct __simd_size_impl
+{
+};
+template <typename _Tp, typename _Abi>
+struct __simd_size_impl<
+ _Tp, _Abi,
+ enable_if_t<std::conjunction_v<__is_vectorizable<_Tp>,
+ std::experimental::is_abi_tag<_Abi>>>>
+ : _SizeConstant<_Abi::template size<_Tp>>
+{
+};
+
+template <typename _Tp, typename _Abi = simd_abi::__default_abi<_Tp>>
+struct simd_size : __simd_size_impl<_Tp, _Abi>
+{
+};
+template <typename _Tp, typename _Abi = simd_abi::__default_abi<_Tp>>
+inline constexpr size_t simd_size_v = simd_size<_Tp, _Abi>::value;
+
+// simd_abi::deduce {{{2
+template <typename _Tp, std::size_t _Np, typename = void> struct __deduce_impl;
+namespace simd_abi {
+/**
+ * \tparam _Tp The requested `value_type` for the elements.
+ * \tparam _Np The requested number of elements.
+ * \tparam _Abis This parameter is ignored, since this implementation cannot
+ * make any use of it. Either a good native ABI is matched and used as `type`
+ * alias, or the `fixed_size<_Np>` ABI is used, which internally is built from
+ * the best matching native ABIs.
+ */
+template <typename _Tp, std::size_t _Np, typename...>
+struct deduce : std::experimental::__deduce_impl<_Tp, _Np>
+{
+};
+
+template <typename _Tp, size_t _Np, typename... _Abis>
+using deduce_t = typename deduce<_Tp, _Np, _Abis...>::type;
+} // namespace simd_abi
+
+// }}}2
+// rebind_simd {{{2
+template <typename _Tp, typename _V, typename = void> struct rebind_simd;
+template <typename _Tp, typename _Up, typename _Abi>
+struct rebind_simd<
+ _Tp, simd<_Up, _Abi>,
+ void_t<simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>>
+{
+ using type = simd<_Tp, simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>;
+};
+template <typename _Tp, typename _Up, typename _Abi>
+struct rebind_simd<
+ _Tp, simd_mask<_Up, _Abi>,
+ void_t<simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>>
+{
+ using type
+ = simd_mask<_Tp, simd_abi::deduce_t<_Tp, simd_size_v<_Up, _Abi>, _Abi>>;
+};
+template <typename _Tp, typename _V>
+using rebind_simd_t = typename rebind_simd<_Tp, _V>::type;
+
+// resize_simd {{{2
+template <int _Np, typename _V, typename = void> struct resize_simd;
+template <int _Np, typename _Tp, typename _Abi>
+struct resize_simd<_Np, simd<_Tp, _Abi>,
+ void_t<simd_abi::deduce_t<_Tp, _Np, _Abi>>>
+{
+ using type = simd<_Tp, simd_abi::deduce_t<_Tp, _Np, _Abi>>;
+};
+template <int _Np, typename _Tp, typename _Abi>
+struct resize_simd<_Np, simd_mask<_Tp, _Abi>,
+ void_t<simd_abi::deduce_t<_Tp, _Np, _Abi>>>
+{
+ using type = simd_mask<_Tp, simd_abi::deduce_t<_Tp, _Np, _Abi>>;
+};
+template <int _Np, typename _V>
+using resize_simd_t = typename resize_simd<_Np, _V>::type;
+
+// }}}2
+// memory_alignment {{{2
+template <typename _Tp, typename _Up = typename _Tp::value_type>
+struct memory_alignment
+ : public _SizeConstant<__next_power_of_2(sizeof(_Up) * _Tp::size())>
+{
+};
+template <typename _Tp, typename _Up = typename _Tp::value_type>
+inline constexpr size_t memory_alignment_v = memory_alignment<_Tp, _Up>::value;
+
+// class template simd [simd] {{{1
+template <typename _Tp, typename _Abi = simd_abi::__default_abi<_Tp>>
+class simd;
+template <typename _Tp, typename _Abi>
+struct is_simd<simd<_Tp, _Abi>> : public true_type
+{
+};
+template <typename _Tp> using native_simd = simd<_Tp, simd_abi::native<_Tp>>;
+template <typename _Tp, int _Np>
+using fixed_size_simd = simd<_Tp, simd_abi::fixed_size<_Np>>;
+template <typename _Tp, size_t _Np>
+using __deduced_simd = simd<_Tp, simd_abi::deduce_t<_Tp, _Np>>;
+
+// class template simd_mask [simd_mask] {{{1
+template <typename _Tp, typename _Abi = simd_abi::__default_abi<_Tp>>
+class simd_mask;
+template <typename _Tp, typename _Abi>
+struct is_simd_mask<simd_mask<_Tp, _Abi>> : public true_type
+{
+};
+template <typename _Tp>
+using native_simd_mask = simd_mask<_Tp, simd_abi::native<_Tp>>;
+template <typename _Tp, int _Np>
+using fixed_size_simd_mask = simd_mask<_Tp, simd_abi::fixed_size<_Np>>;
+template <typename _Tp, size_t _Np>
+using __deduced_simd_mask = simd_mask<_Tp, simd_abi::deduce_t<_Tp, _Np>>;
+
+// casts [simd.casts] {{{1
+// static_simd_cast {{{2
+template <typename _Tp, typename _Up, typename _Ap, bool = is_simd_v<_Tp>,
+ typename = void>
+struct __static_simd_cast_return_type;
+
+template <typename _Tp, typename _A0, typename _Up, typename _Ap>
+struct __static_simd_cast_return_type<simd_mask<_Tp, _A0>, _Up, _Ap, false,
+ void>
+ : __static_simd_cast_return_type<simd<_Tp, _A0>, _Up, _Ap>
+{
+};
+
+template <typename _Tp, typename _Up, typename _Ap>
+struct __static_simd_cast_return_type<
+ _Tp, _Up, _Ap, true, enable_if_t<_Tp::size() == simd_size_v<_Up, _Ap>>>
+{
+ using type = _Tp;
+};
+
+template <typename _Tp, typename _Ap>
+struct __static_simd_cast_return_type<_Tp, _Tp, _Ap, false,
+#ifdef _GLIBCXX_SIMD_FIX_P2TS_ISSUE66
+ enable_if_t<__is_vectorizable_v<_Tp>>
+#else
+ void
+#endif
+ >
+{
+ using type = simd<_Tp, _Ap>;
+};
+
+template <typename _Tp, typename = void> struct __safe_make_signed
+{
+ using type = _Tp;
+};
+template <typename _Tp>
+struct __safe_make_signed<_Tp, enable_if_t<std::is_integral_v<_Tp>>>
+{
+ // the extra make_unsigned_t is because of PR85951
+ using type = std::make_signed_t<std::make_unsigned_t<_Tp>>;
+};
+template <typename _Tp>
+using safe_make_signed_t = typename __safe_make_signed<_Tp>::type;
+
+template <typename _Tp, typename _Up, typename _Ap>
+struct __static_simd_cast_return_type<_Tp, _Up, _Ap, false,
+#ifdef _GLIBCXX_SIMD_FIX_P2TS_ISSUE66
+ enable_if_t<__is_vectorizable_v<_Tp>>
+#else
+ void
+#endif
+ >
+{
+ using type = std::conditional_t<
+ (std::is_integral_v<_Up> && std::is_integral_v<_Tp> &&
+#ifndef _GLIBCXX_SIMD_FIX_P2TS_ISSUE65
+ std::is_signed_v<_Up> != std::is_signed_v<_Tp> &&
+#endif
+ std::is_same_v<safe_make_signed_t<_Up>, safe_make_signed_t<_Tp>>),
+ simd<_Tp, _Ap>, fixed_size_simd<_Tp, simd_size_v<_Up, _Ap>>>;
+};
+
+template <typename _Tp, typename _Up, typename _Ap,
+ typename _R
+ = typename __static_simd_cast_return_type<_Tp, _Up, _Ap>::type>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _R
+static_simd_cast(const simd<_Up, _Ap>& __x)
+{
+ if constexpr (std::is_same<_R, simd<_Up, _Ap>>::value)
+ {
+ return __x;
+ }
+ else
+ {
+ _SimdConverter<_Up, _Ap, typename _R::value_type, typename _R::abi_type>
+ __c;
+ return _R(__private_init, __c(__data(__x)));
+ }
+}
+
+namespace __proposed {
+template <typename _Tp, typename _Up, typename _Ap,
+ typename _R
+ = typename __static_simd_cast_return_type<_Tp, _Up, _Ap>::type>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR typename _R::mask_type
+static_simd_cast(const simd_mask<_Up, _Ap>& __x)
+{
+ using _RM = typename _R::mask_type;
+ return {__private_init, _RM::abi_type::_MaskImpl::template __convert<
+ typename _RM::simd_type::value_type>(__x)};
+}
+} // namespace __proposed
+
+// simd_cast {{{2
+template <typename _Tp, typename _Up, typename _Ap,
+ typename _To = __value_type_or_identity_t<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR auto
+simd_cast(const simd<_ValuePreserving<_Up, _To>, _Ap>& __x)
+ -> decltype(static_simd_cast<_Tp>(__x))
+{
+ return static_simd_cast<_Tp>(__x);
+}
+
+namespace __proposed {
+template <typename _Tp, typename _Up, typename _Ap,
+ typename _To = __value_type_or_identity_t<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR auto
+simd_cast(const simd_mask<_ValuePreserving<_Up, _To>, _Ap>& __x)
+ -> decltype(static_simd_cast<_Tp>(__x))
+{
+ return static_simd_cast<_Tp>(__x);
+}
+} // namespace __proposed
+
+// }}}2
+// resizing_simd_cast {{{
+namespace __proposed {
+/* Proposed spec:
+
+template <class T, class U, class Abi>
+T resizing_simd_cast(const simd<U, Abi>& x)
+
+p1 Constraints:
+ - is_simd_v<T> is true and
+ - T::value_type is the same type as U
+
+p2 Returns:
+ A simd object with the i^th element initialized to x[i] for all i in the
+ range of [0, min(T::size(), simd_size_v<U, Abi>)). If T::size() is larger
+ than simd_size_v<U, Abi>, the remaining elements are value-initialized.
+
+template <class T, class U, class Abi>
+T resizing_simd_cast(const simd_mask<U, Abi>& x)
+
+p1 Constraints: is_simd_mask_v<T> is true
+
+p2 Returns:
+ A simd_mask object with the i^th element initialized to x[i] for all i in
+ the range of [0, min(T::size(), simd_size_v<U, Abi>)). If T::size() is larger
+ than simd_size_v<U, Abi>, the remaining elements are initialized to false.
+
+ */
+
+template <typename _Tp, typename _Up, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR enable_if_t<
+ conjunction_v<is_simd<_Tp>, is_same<typename _Tp::value_type, _Up>>, _Tp>
+resizing_simd_cast(const simd<_Up, _Ap>& __x)
+{
+ if constexpr (is_same_v<typename _Tp::abi_type, _Ap>)
+ return __x;
+ else if constexpr (simd_size_v<_Up, _Ap> == 1)
+ {
+ _Tp __r{};
+ __r[0] = __x[0];
+ return __r;
+ }
+ else if constexpr (_Tp::size() == 1)
+ return __x[0];
+ else if constexpr (sizeof(_Tp) == sizeof(__x) && !__is_fixed_size_abi_v<_Ap>)
+ return {__private_init,
+ __vector_bitcast<typename _Tp::value_type, _Tp::size()>(
+ _Ap::__masked(__data(__x))._M_data)};
+ else
+ {
+ _Tp __r{};
+ __builtin_memcpy(&__data(__r), &__data(__x),
+ sizeof(_Up)
+ * std::min(_Tp::size(), simd_size_v<_Up, _Ap>));
+ return __r;
+ }
+}
+
+template <typename _Tp, typename _Up, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC
+ _GLIBCXX_SIMD_CONSTEXPR enable_if_t<is_simd_mask_v<_Tp>, _Tp>
+ resizing_simd_cast(const simd_mask<_Up, _Ap>& __x)
+{
+ return {__private_init, _Tp::abi_type::_MaskImpl::template __convert<
+ typename _Tp::simd_type::value_type>(__x)};
+}
+} // namespace __proposed
+
+// }}}
+// to_fixed_size {{{2
+template <typename _Tp, int _Np>
+_GLIBCXX_SIMD_INTRINSIC fixed_size_simd<_Tp, _Np>
+to_fixed_size(const fixed_size_simd<_Tp, _Np>& __x)
+{
+ return __x;
+}
+
+template <typename _Tp, int _Np>
+_GLIBCXX_SIMD_INTRINSIC fixed_size_simd_mask<_Tp, _Np>
+to_fixed_size(const fixed_size_simd_mask<_Tp, _Np>& __x)
+{
+ return __x;
+}
+
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC auto
+to_fixed_size(const simd<_Tp, _Ap>& __x)
+{
+ return simd<_Tp, simd_abi::fixed_size<simd_size_v<_Tp, _Ap>>>([&__x](
+ auto __i) constexpr { return __x[__i]; });
+}
+
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC auto
+to_fixed_size(const simd_mask<_Tp, _Ap>& __x)
+{
+ constexpr int _Np = simd_mask<_Tp, _Ap>::size();
+ fixed_size_simd_mask<_Tp, _Np> __r;
+ __execute_n_times<_Np>([&](auto __i) constexpr { __r[__i] = __x[__i]; });
+ return __r;
+}
+
+// to_native {{{2
+template <typename _Tp, int _Np>
+_GLIBCXX_SIMD_INTRINSIC
+ enable_if_t<(_Np == native_simd<_Tp>::size()), native_simd<_Tp>>
+ to_native(const fixed_size_simd<_Tp, _Np>& __x)
+{
+ alignas(memory_alignment_v<native_simd<_Tp>>) _Tp __mem[_Np];
+ __x.copy_to(__mem, vector_aligned);
+ return {__mem, vector_aligned};
+}
+
+template <typename _Tp, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC
+ enable_if_t<(_Np == native_simd_mask<_Tp>::size()), native_simd_mask<_Tp>>
+ to_native(const fixed_size_simd_mask<_Tp, _Np>& __x)
+{
+ return native_simd_mask<_Tp>([&](auto __i) constexpr { return __x[__i]; });
+}
+
+// to_compatible {{{2
+template <typename _Tp, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC enable_if_t<(_Np == simd<_Tp>::size()), simd<_Tp>>
+to_compatible(const simd<_Tp, simd_abi::fixed_size<_Np>>& __x)
+{
+ alignas(memory_alignment_v<simd<_Tp>>) _Tp __mem[_Np];
+ __x.copy_to(__mem, vector_aligned);
+ return {__mem, vector_aligned};
+}
+
+template <typename _Tp, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC
+ enable_if_t<(_Np == simd_mask<_Tp>::size()), simd_mask<_Tp>>
+ to_compatible(const simd_mask<_Tp, simd_abi::fixed_size<_Np>>& __x)
+{
+ return simd_mask<_Tp>([&](auto __i) constexpr { return __x[__i]; });
+}
+
+// masked assignment [simd_mask.where] {{{1
+
+// where_expression {{{1
+template <typename _M, typename _Tp> class const_where_expression //{{{2
+{
+ using _V = _Tp;
+ static_assert(std::is_same_v<_V, __remove_cvref_t<_Tp>>);
+ struct Wrapper
+ {
+ using value_type = _V;
+ };
+
+protected:
+ using _Impl = typename _V::_Impl;
+
+ using value_type = typename std::conditional_t<std::is_arithmetic<_V>::value,
+ Wrapper, _V>::value_type;
+ _GLIBCXX_SIMD_INTRINSIC friend const _M&
+ __get_mask(const const_where_expression& __x)
+ {
+ return __x._M_k;
+ }
+ _GLIBCXX_SIMD_INTRINSIC friend const _Tp&
+ __get_lvalue(const const_where_expression& __x)
+ {
+ return __x._M_value;
+ }
+ const _M& _M_k;
+ _Tp& _M_value;
+
+public:
+ const_where_expression(const const_where_expression&) = delete;
+ const_where_expression& operator=(const const_where_expression&) = delete;
+
+ _GLIBCXX_SIMD_INTRINSIC const_where_expression(const _M& __kk, const _Tp& __dd)
+ : _M_k(__kk), _M_value(const_cast<_Tp&>(__dd))
+ {}
+
+ _GLIBCXX_SIMD_INTRINSIC _V operator-() const&&
+ {
+ return {__private_init,
+ _Impl::template __masked_unary<std::negate>(__data(_M_k),
+ __data(_M_value))};
+ }
+
+ template <typename _Up, typename _Flags>
+ [[nodiscard]] _GLIBCXX_SIMD_INTRINSIC _V
+ copy_from(const _LoadStorePtr<_Up, value_type>* __mem, _Flags __f) const&&
+ {
+ return {__private_init,
+ _Impl::__masked_load(__data(_M_value), __data(_M_k), __mem, __f)};
+ }
+
+ template <typename _Up, typename _Flags>
+ _GLIBCXX_SIMD_INTRINSIC void copy_to(_LoadStorePtr<_Up, value_type>* __mem,
+ _Flags __f) const&&
+ {
+ _Impl::__masked_store(__data(_M_value), __mem, __f, __data(_M_k));
+ }
+};
+
+template <typename _Tp> class const_where_expression<bool, _Tp> //{{{2
+{
+ using _M = bool;
+ using _V = _Tp;
+ static_assert(std::is_same_v<_V, __remove_cvref_t<_Tp>>);
+ struct _Wrapper
+ {
+ using value_type = _V;
+ };
+
+protected:
+ using value_type = typename std::conditional_t<std::is_arithmetic<_V>::value,
+ _Wrapper, _V>::value_type;
+ _GLIBCXX_SIMD_INTRINSIC friend const _M&
+ __get_mask(const const_where_expression& __x)
+ {
+ return __x._M_k;
+ }
+ _GLIBCXX_SIMD_INTRINSIC friend const _Tp&
+ __get_lvalue(const const_where_expression& __x)
+ {
+ return __x._M_value;
+ }
+ const bool _M_k;
+ _Tp& _M_value;
+
+public:
+ const_where_expression(const const_where_expression&) = delete;
+ const_where_expression& operator=(const const_where_expression&) = delete;
+
+ _GLIBCXX_SIMD_INTRINSIC const_where_expression(const bool __kk, const _Tp& __dd)
+ : _M_k(__kk), _M_value(const_cast<_Tp&>(__dd))
+ {}
+
+ _GLIBCXX_SIMD_INTRINSIC _V operator-() const&&
+ {
+ return _M_k ? -_M_value : _M_value;
+ }
+
+ template <typename _Up, typename _Flags>
+ [[nodiscard]] _GLIBCXX_SIMD_INTRINSIC _V
+ copy_from(const _LoadStorePtr<_Up, value_type>* __mem, _Flags) const&&
+ {
+ return _M_k ? static_cast<_V>(__mem[0]) : _M_value;
+ }
+
+ template <typename _Up, typename _Flags>
+ _GLIBCXX_SIMD_INTRINSIC void copy_to(_LoadStorePtr<_Up, value_type>* __mem,
+ _Flags) const&&
+ {
+ if (_M_k)
+ {
+ __mem[0] = _M_value;
+ }
+ }
+};
+
+// where_expression {{{2
+template <typename _M, typename _Tp>
+class where_expression : public const_where_expression<_M, _Tp>
+{
+ using _Impl = typename const_where_expression<_M, _Tp>::_Impl;
+
+ static_assert(!std::is_const<_Tp>::value,
+ "where_expression may only be instantiated with a non-const "
+ "_Tp parameter");
+ using typename const_where_expression<_M, _Tp>::value_type;
+ using const_where_expression<_M, _Tp>::_M_k;
+ using const_where_expression<_M, _Tp>::_M_value;
+ static_assert(
+ std::is_same<typename _M::abi_type, typename _Tp::abi_type>::value, "");
+ static_assert(_M::size() == _Tp::size(), "");
+
+ _GLIBCXX_SIMD_INTRINSIC friend _Tp& __get_lvalue(where_expression& __x)
+ {
+ return __x._M_value;
+ }
+
+public:
+ where_expression(const where_expression&) = delete;
+ where_expression& operator=(const where_expression&) = delete;
+
+ _GLIBCXX_SIMD_INTRINSIC where_expression(const _M& __kk, _Tp& __dd)
+ : const_where_expression<_M, _Tp>(__kk, __dd)
+ {}
+
+ template <typename _Up> _GLIBCXX_SIMD_INTRINSIC void operator=(_Up&& __x) &&
+ {
+ _Impl::__masked_assign(__data(_M_k), __data(_M_value),
+ __to_value_type_or_member_type<_Tp>(
+ static_cast<_Up&&>(__x)));
+ }
+
+#define _GLIBCXX_SIMD_OP_(__op, __name) \
+ template <typename _Up> \
+ _GLIBCXX_SIMD_INTRINSIC void operator __op##=(_Up&& __x)&& \
+ { \
+ _Impl::template __masked_cassign( \
+ __data(_M_k), __data(_M_value), \
+ __to_value_type_or_member_type<_Tp>(static_cast<_Up&&>(__x)), \
+ [](auto __impl, auto __lhs, auto __rhs) constexpr { \
+ return __impl.__name(__lhs, __rhs); \
+ }); \
+ } \
+ static_assert(true)
+ _GLIBCXX_SIMD_OP_(+, __plus);
+ _GLIBCXX_SIMD_OP_(-, __minus);
+ _GLIBCXX_SIMD_OP_(*, __multiplies);
+ _GLIBCXX_SIMD_OP_(/, __divides);
+ _GLIBCXX_SIMD_OP_(%, __modulus);
+ _GLIBCXX_SIMD_OP_(&, __bit_and);
+ _GLIBCXX_SIMD_OP_(|, __bit_or);
+ _GLIBCXX_SIMD_OP_(^, __bit_xor);
+ _GLIBCXX_SIMD_OP_(<<, __shift_left);
+ _GLIBCXX_SIMD_OP_(>>, __shift_right);
+#undef _GLIBCXX_SIMD_OP_
+
+ _GLIBCXX_SIMD_INTRINSIC void operator++() &&
+ {
+ __data(_M_value)
+ = _Impl::template __masked_unary<__increment>(__data(_M_k),
+ __data(_M_value));
+ }
+ _GLIBCXX_SIMD_INTRINSIC void operator++(int) &&
+ {
+ __data(_M_value)
+ = _Impl::template __masked_unary<__increment>(__data(_M_k),
+ __data(_M_value));
+ }
+ _GLIBCXX_SIMD_INTRINSIC void operator--() &&
+ {
+ __data(_M_value)
+ = _Impl::template __masked_unary<__decrement>(__data(_M_k),
+ __data(_M_value));
+ }
+ _GLIBCXX_SIMD_INTRINSIC void operator--(int) &&
+ {
+ __data(_M_value)
+ = _Impl::template __masked_unary<__decrement>(__data(_M_k),
+ __data(_M_value));
+ }
+
+ // intentionally hides const_where_expression::copy_from
+ template <typename _Up, typename _Flags>
+ _GLIBCXX_SIMD_INTRINSIC void
+ copy_from(const _LoadStorePtr<_Up, value_type>* __mem, _Flags __f) &&
+ {
+ __data(_M_value)
+ = _Impl::__masked_load(__data(_M_value), __data(_M_k), __mem, __f);
+ }
+};
+
+// where_expression<bool> {{{2
+template <typename _Tp>
+class where_expression<bool, _Tp> : public const_where_expression<bool, _Tp>
+{
+ using _M = bool;
+ using typename const_where_expression<_M, _Tp>::value_type;
+ using const_where_expression<_M, _Tp>::_M_k;
+ using const_where_expression<_M, _Tp>::_M_value;
+
+public:
+ where_expression(const where_expression&) = delete;
+ where_expression& operator=(const where_expression&) = delete;
+
+ _GLIBCXX_SIMD_INTRINSIC where_expression(const _M& __kk, _Tp& __dd)
+ : const_where_expression<_M, _Tp>(__kk, __dd)
+ {}
+
+#define _GLIBCXX_SIMD_OP_(__op) \
+ template <typename _Up> \
+ _GLIBCXX_SIMD_INTRINSIC void operator __op(_Up&& __x)&& \
+ { \
+ if (_M_k) \
+ { \
+ _M_value __op static_cast<_Up&&>(__x); \
+ } \
+ } \
+ static_assert(true)
+ _GLIBCXX_SIMD_OP_(=);
+ _GLIBCXX_SIMD_OP_(+=);
+ _GLIBCXX_SIMD_OP_(-=);
+ _GLIBCXX_SIMD_OP_(*=);
+ _GLIBCXX_SIMD_OP_(/=);
+ _GLIBCXX_SIMD_OP_(%=);
+ _GLIBCXX_SIMD_OP_(&=);
+ _GLIBCXX_SIMD_OP_(|=);
+ _GLIBCXX_SIMD_OP_(^=);
+ _GLIBCXX_SIMD_OP_(<<=);
+ _GLIBCXX_SIMD_OP_(>>=);
+#undef _GLIBCXX_SIMD_OP_
+ _GLIBCXX_SIMD_INTRINSIC void operator++() &&
+ {
+ if (_M_k)
+ {
+ ++_M_value;
+ }
+ }
+ _GLIBCXX_SIMD_INTRINSIC void operator++(int) &&
+ {
+ if (_M_k)
+ {
+ ++_M_value;
+ }
+ }
+ _GLIBCXX_SIMD_INTRINSIC void operator--() &&
+ {
+ if (_M_k)
+ {
+ --_M_value;
+ }
+ }
+ _GLIBCXX_SIMD_INTRINSIC void operator--(int) &&
+ {
+ if (_M_k)
+ {
+ --_M_value;
+ }
+ }
+
+ // intentionally hides const_where_expression::copy_from
+ template <typename _Up, typename _Flags>
+ _GLIBCXX_SIMD_INTRINSIC void
+ copy_from(const _LoadStorePtr<_Up, value_type>* __mem, _Flags) &&
+ {
+ if (_M_k)
+ {
+ _M_value = __mem[0];
+ }
+ }
+};
+
+// where_expression<_M, tuple<...>> {{{2
+
+// where {{{1
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC where_expression<simd_mask<_Tp, _Ap>, simd<_Tp, _Ap>>
+where(const typename simd<_Tp, _Ap>::mask_type& __k, simd<_Tp, _Ap>& __value)
+{
+ return {__k, __value};
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC
+ const_where_expression<simd_mask<_Tp, _Ap>, simd<_Tp, _Ap>>
+ where(const typename simd<_Tp, _Ap>::mask_type& __k,
+ const simd<_Tp, _Ap>& __value)
+{
+ return {__k, __value};
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC
+ where_expression<simd_mask<_Tp, _Ap>, simd_mask<_Tp, _Ap>>
+ where(const std::remove_const_t<simd_mask<_Tp, _Ap>>& __k,
+ simd_mask<_Tp, _Ap>& __value)
+{
+ return {__k, __value};
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC
+ const_where_expression<simd_mask<_Tp, _Ap>, simd_mask<_Tp, _Ap>>
+ where(const std::remove_const_t<simd_mask<_Tp, _Ap>>& __k,
+ const simd_mask<_Tp, _Ap>& __value)
+{
+ return {__k, __value};
+}
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC where_expression<bool, _Tp>
+where(_ExactBool __k, _Tp& __value)
+{
+ return {__k, __value};
+}
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC const_where_expression<bool, _Tp>
+where(_ExactBool __k, const _Tp& __value)
+{
+ return {__k, __value};
+}
+template <typename _Tp, typename _Ap>
+void
+where(bool __k, simd<_Tp, _Ap>& __value)
+ = delete;
+template <typename _Tp, typename _Ap>
+void
+where(bool __k, const simd<_Tp, _Ap>& __value)
+ = delete;
+
+// proposed mask iterations {{{1
+namespace __proposed {
+template <size_t _Np> class where_range
+{
+ const std::bitset<_Np> __bits;
+
+public:
+ where_range(std::bitset<_Np> __b) : __bits(__b) {}
+
+ class iterator
+ {
+ size_t __mask;
+ size_t __bit;
+
+ _GLIBCXX_SIMD_INTRINSIC void __next_bit()
+ {
+ __bit = __mask == 0 ? 0 : __builtin_ctzl(__mask); // ctz(0) is undefined
+ }
+ _GLIBCXX_SIMD_INTRINSIC void __reset_lsb()
+ {
+ // clear the lowest set bit: 01100100 & (01100100 - 1) = 01100000
+ __mask &= (__mask - 1);
+ // __asm__("btr %1,%0" : "+r"(__mask) : "r"(__bit));
+ }
+
+ public:
+ iterator(decltype(__mask) __m) : __mask(__m) { __next_bit(); }
+ iterator(const iterator&) = default;
+ iterator(iterator&&) = default;
+
+ _GLIBCXX_SIMD_ALWAYS_INLINE size_t operator->() const { return __bit; }
+ _GLIBCXX_SIMD_ALWAYS_INLINE size_t operator*() const { return __bit; }
+
+ _GLIBCXX_SIMD_ALWAYS_INLINE iterator& operator++()
+ {
+ __reset_lsb();
+ __next_bit();
+ return *this;
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE iterator operator++(int)
+ {
+ iterator __tmp = *this;
+ __reset_lsb();
+ __next_bit();
+ return __tmp;
+ }
+
+ _GLIBCXX_SIMD_ALWAYS_INLINE bool operator==(const iterator& __rhs) const
+ {
+ return __mask == __rhs.__mask;
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE bool operator!=(const iterator& __rhs) const
+ {
+ return __mask != __rhs.__mask;
+ }
+ };
+
+ iterator begin() const { return __bits.to_ullong(); }
+ iterator end() const { return 0; }
+};
+
+template <typename _Tp, typename _Ap>
+where_range<simd_size_v<_Tp, _Ap>>
+where(const simd_mask<_Tp, _Ap>& __k)
+{
+ return __k.__to_bitset();
+}
+
+} // namespace __proposed
+
+// }}}1
+// reductions [simd.reductions] {{{1
+template <typename _Tp, typename _Abi, typename _BinaryOperation = std::plus<>>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
+reduce(const simd<_Tp, _Abi>& __v,
+ _BinaryOperation __binary_op = _BinaryOperation())
+{
+ return _Abi::_SimdImpl::__reduce(__v, __binary_op);
+}
+
+template <typename _M, typename _V, typename _BinaryOperation = std::plus<>>
+_GLIBCXX_SIMD_INTRINSIC typename _V::value_type
+reduce(const const_where_expression<_M, _V>& __x,
+ typename _V::value_type __identity_element, _BinaryOperation __binary_op)
+{
+ if (__builtin_expect(none_of(__get_mask(__x)), false))
+ return __identity_element;
+
+ _V __tmp = __identity_element;
+ _V::_Impl::__masked_assign(__data(__get_mask(__x)), __data(__tmp),
+ __data(__get_lvalue(__x)));
+ return reduce(__tmp, __binary_op);
+}
+
+template <typename _M, typename _V>
+_GLIBCXX_SIMD_INTRINSIC typename _V::value_type
+reduce(const const_where_expression<_M, _V>& __x, std::plus<> __binary_op = {})
+{
+ return reduce(__x, 0, __binary_op);
+}
+
+template <typename _M, typename _V>
+_GLIBCXX_SIMD_INTRINSIC typename _V::value_type
+reduce(const const_where_expression<_M, _V>& __x, std::multiplies<> __binary_op)
+{
+ return reduce(__x, 1, __binary_op);
+}
+
+template <typename _M, typename _V>
+_GLIBCXX_SIMD_INTRINSIC typename _V::value_type
+reduce(const const_where_expression<_M, _V>& __x, std::bit_and<> __binary_op)
+{
+ return reduce(__x, ~typename _V::value_type(), __binary_op);
+}
+
+template <typename _M, typename _V>
+_GLIBCXX_SIMD_INTRINSIC typename _V::value_type
+reduce(const const_where_expression<_M, _V>& __x, std::bit_or<> __binary_op)
+{
+ return reduce(__x, 0, __binary_op);
+}
+
+template <typename _M, typename _V>
+_GLIBCXX_SIMD_INTRINSIC typename _V::value_type
+reduce(const const_where_expression<_M, _V>& __x, std::bit_xor<> __binary_op)
+{
+ return reduce(__x, 0, __binary_op);
+}
+
+// }}}1
+// algorithms [simd.alg] {{{
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
+min(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
+{
+ return {__private_init, _Ap::_SimdImpl::__min(__data(__a), __data(__b))};
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
+max(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
+{
+ return {__private_init, _Ap::_SimdImpl::__max(__data(__a), __data(__b))};
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC
+ _GLIBCXX_SIMD_CONSTEXPR std::pair<simd<_Tp, _Ap>, simd<_Tp, _Ap>>
+ minmax(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
+{
+ const auto __pair_of_members
+ = _Ap::_SimdImpl::__minmax(__data(__a), __data(__b));
+ return {simd<_Tp, _Ap>(__private_init, __pair_of_members.first),
+ simd<_Tp, _Ap>(__private_init, __pair_of_members.second)};
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
+clamp(const simd<_Tp, _Ap>& __v, const simd<_Tp, _Ap>& __lo,
+ const simd<_Tp, _Ap>& __hi)
+{
+ using _Impl = typename _Ap::_SimdImpl;
+ return {__private_init,
+ _Impl::__min(__data(__hi), _Impl::__max(__data(__lo), __data(__v)))};
+}
+
+// }}}
+
+namespace _P0918 {
+// shuffle {{{1
+template <int _Stride, int _Offset = 0> struct strided
+{
+ static constexpr int _S_stride = _Stride;
+ static constexpr int _S_offset = _Offset;
+ template <typename _Tp, typename _Ap>
+ using __shuffle_return_type = simd<
+ _Tp, simd_abi::deduce_t<
+ _Tp, __div_roundup(simd_size_v<_Tp, _Ap> - _Offset, _Stride), _Ap>>;
+ // alternative, always use fixed_size:
+ // fixed_size_simd<_Tp, __div_roundup(simd_size_v<_Tp, _Ap> - _Offset,
+ // _Stride)>;
+ template <typename _Tp> static constexpr auto __src_index(_Tp __dst_index)
+ {
+ return _Offset + __dst_index * _Stride;
+ }
+};
+
+// SFINAE for the return type ensures _P is a type that provides the alias
+// template member __shuffle_return_type and the static member function
+// __src_index
+template <typename _P, typename _Tp, typename _Ap,
+ typename _R = typename _P::template __shuffle_return_type<_Tp, _Ap>,
+ typename
+ = decltype(_P::__src_index(std::experimental::_SizeConstant<0>()))>
+_GLIBCXX_SIMD_INTRINSIC _R
+shuffle(const simd<_Tp, _Ap>& __x)
+{
+ return _R([&__x](auto __i) constexpr { return __x[_P::__src_index(__i)]; });
+}
+
+// }}}1
+} // namespace _P0918
+
+namespace __proposed {
+using namespace _P0918;
+} // namespace __proposed
+
+template <size_t... _Sizes, typename _Tp, typename _Ap,
+ typename = enable_if_t<((_Sizes + ...) == simd<_Tp, _Ap>::size())>>
+inline std::tuple<simd<_Tp, simd_abi::deduce_t<_Tp, _Sizes>>...>
+split(const simd<_Tp, _Ap>&);
+
+// __extract_part {{{
+template <int _Index, int _Total, int _Combine = 1, typename _Tp, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC
+ _GLIBCXX_CONST _SimdWrapper<_Tp, _Np / _Total * _Combine>
+ __extract_part(const _SimdWrapper<_Tp, _Np> __x);
+
+template <int _Index, int _Parts, int _Combine = 1, typename _Tp, typename _A0,
+ typename... _As>
+_GLIBCXX_SIMD_INTRINSIC auto
+__extract_part(const _SimdTuple<_Tp, _A0, _As...>& __x);
+
+// }}}
+// _SizeList {{{
+template <size_t _V0, size_t... _Values> struct _SizeList
+{
+ template <size_t _I> static constexpr size_t __at(_SizeConstant<_I> = {})
+ {
+ if constexpr (_I == 0)
+ {
+ return _V0;
+ }
+ else
+ {
+ return _SizeList<_Values...>::template __at<_I - 1>();
+ }
+ }
+
+ template <size_t _I> static constexpr auto __before(_SizeConstant<_I> = {})
+ {
+ if constexpr (_I == 0)
+ {
+ return _SizeConstant<0>();
+ }
+ else
+ {
+ return _SizeConstant<
+ _V0 + _SizeList<_Values...>::template __before<_I - 1>()>();
+ }
+ }
+
+ template <size_t _Np>
+ static constexpr auto __pop_front(_SizeConstant<_Np> = {})
+ {
+ if constexpr (_Np == 0)
+ {
+ return _SizeList();
+ }
+ else
+ {
+ return _SizeList<_Values...>::template __pop_front<_Np - 1>();
+ }
+ }
+};
+// }}}
+// __extract_center {{{
+template <typename _Tp, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC _SimdWrapper<_Tp, _Np / 2>
+__extract_center(_SimdWrapper<_Tp, _Np> __x)
+{
+ static_assert(_Np >= 4);
+ static_assert(_Np % 4 == 0); // x0 - x1 - x2 - x3 -> return {x1, x2}
+#if _GLIBCXX_SIMD_X86INTRIN // {{{
+ if constexpr (__have_avx512f && sizeof(_Tp) * _Np == 64)
+ {
+ const auto __intrin = __to_intrin(__x);
+ if constexpr (std::is_integral_v<_Tp>)
+ return __vector_bitcast<_Tp>(_mm512_castsi512_si256(
+ _mm512_shuffle_i32x4(__intrin, __intrin,
+ 1 + 2 * 0x4 + 2 * 0x10 + 3 * 0x40)));
+ else if constexpr (sizeof(_Tp) == 4)
+ return __vector_bitcast<_Tp>(_mm512_castps512_ps256(
+ _mm512_shuffle_f32x4(__intrin, __intrin,
+ 1 + 2 * 0x4 + 2 * 0x10 + 3 * 0x40)));
+ else if constexpr (sizeof(_Tp) == 8)
+ return __vector_bitcast<_Tp>(_mm512_castpd512_pd256(
+ _mm512_shuffle_f64x2(__intrin, __intrin,
+ 1 + 2 * 0x4 + 2 * 0x10 + 3 * 0x40)));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (sizeof(_Tp) * _Np == 32 && std::is_floating_point_v<_Tp>)
+ return __vector_bitcast<_Tp>(
+ _mm_shuffle_pd(__lo128(__vector_bitcast<double>(__x)),
+ __hi128(__vector_bitcast<double>(__x)), 1));
+ else if constexpr (sizeof(__x) == 32 && sizeof(_Tp) * _Np <= 32)
+ return __vector_bitcast<_Tp>(
+ _mm_alignr_epi8(__hi128(__vector_bitcast<_LLong>(__x)),
+ __lo128(__vector_bitcast<_LLong>(__x)),
+ sizeof(_Tp) * _Np / 4));
+ else
+#endif // _GLIBCXX_SIMD_X86INTRIN }}}
+ {
+ __vector_type_t<_Tp, _Np / 2> __r;
+ __builtin_memcpy(&__r,
+ reinterpret_cast<const char*>(&__x)
+ + sizeof(_Tp) * _Np / 4,
+ sizeof(_Tp) * _Np / 2);
+ return __r;
+ }
+}
+
+template <typename _Tp, typename _A0, typename... _As>
+_GLIBCXX_SIMD_INTRINSIC
+ _SimdWrapper<_Tp, _SimdTuple<_Tp, _A0, _As...>::size() / 2>
+ __extract_center(const _SimdTuple<_Tp, _A0, _As...>& __x)
+{
+ if constexpr (sizeof...(_As) == 0)
+ return __extract_center(__x.first);
+ else
+ return __extract_part<1, 4, 2>(__x);
+}
+
+// }}}
+// __split_wrapper {{{
+template <size_t... _Sizes, typename _Tp, typename... _As>
+auto
+__split_wrapper(_SizeList<_Sizes...>, const _SimdTuple<_Tp, _As...>& __x)
+{
+ return std::experimental::split<_Sizes...>(
+ fixed_size_simd<_Tp, _SimdTuple<_Tp, _As...>::size()>(__private_init, __x));
+}
+
+// }}}
+
+// split<simd>(simd) {{{
+template <typename _V, typename _Ap,
+ size_t _Parts = simd_size_v<typename _V::value_type, _Ap> / _V::size()>
+inline enable_if_t<
+ (is_simd<_V>::value
+ && simd_size_v<typename _V::value_type, _Ap> == _Parts * _V::size()),
+ std::array<_V, _Parts>>
+split(const simd<typename _V::value_type, _Ap>& __x)
+{
+ using _Tp = typename _V::value_type;
+ if constexpr (_Parts == 1)
+ {
+ return {simd_cast<_V>(__x)};
+ }
+ else if (__x._M_is_constprop())
+ {
+ return __generate_from_n_evaluations<_Parts, std::array<_V, _Parts>>([&](
+ auto __i) constexpr {
+ return _V([&](auto __j) constexpr {
+ return __x[__i * _V::size() + __j];
+ });
+ });
+ }
+ else if constexpr (
+ __is_fixed_size_abi_v<_Ap>
+ && (std::is_same_v<typename _V::abi_type, simd_abi::scalar>
+ || (__is_fixed_size_abi_v<typename _V::abi_type>
+ && sizeof(_V) == sizeof(_Tp) * _V::size() // _V doesn't have padding
+ )))
+ {
+ // fixed_size -> fixed_size (w/o padding) or scalar
+#ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
+ const __may_alias<_Tp>* const __element_ptr
+ = reinterpret_cast<const __may_alias<_Tp>*>(&__data(__x));
+ return __generate_from_n_evaluations<_Parts, std::array<_V, _Parts>>([&](
+ auto __i) constexpr {
+ return _V(__element_ptr + __i * _V::size(), vector_aligned);
+ });
+#else
+ const auto& __xx = __data(__x);
+ return __generate_from_n_evaluations<_Parts, std::array<_V, _Parts>>([&](
+ auto __i) constexpr {
+ [[maybe_unused]] constexpr size_t __offset
+ = decltype(__i)::value * _V::size();
+ return _V([&](auto __j) constexpr {
+ constexpr _SizeConstant<__j + __offset> __k;
+ return __xx[__k];
+ });
+ });
+#endif
+ }
+ else if constexpr (std::is_same_v<typename _V::abi_type, simd_abi::scalar>)
+ {
+ // normally memcpy should work here as well
+ return __generate_from_n_evaluations<_Parts, std::array<_V, _Parts>>([&](
+ auto __i) constexpr { return __x[__i]; });
+ }
+ else
+ {
+ return __generate_from_n_evaluations<_Parts, std::array<_V, _Parts>>([&](
+ auto __i) constexpr {
+ if constexpr (__is_fixed_size_abi_v<typename _V::abi_type>)
+ {
+ return _V([&](auto __j) constexpr {
+ return __x[__i * _V::size() + __j];
+ });
+ }
+ else
+ {
+ return _V(__private_init,
+ __extract_part<decltype(__i)::value, _Parts>(__data(__x)));
+ }
+ });
+ }
+}
+
+// }}}
+// split<simd_mask>(simd_mask) {{{
+template <typename _V, typename _Ap,
+ size_t _Parts
+ = simd_size_v<typename _V::simd_type::value_type, _Ap> / _V::size()>
+enable_if_t<
+ (is_simd_mask_v<
+ _V> && simd_size_v<typename _V::simd_type::value_type, _Ap> == _Parts * _V::size()),
+ std::array<_V, _Parts>>
+split(const simd_mask<typename _V::simd_type::value_type, _Ap>& __x)
+{
+ if constexpr (std::is_same_v<_Ap, typename _V::abi_type>)
+ {
+ return {__x};
+ }
+ else if constexpr (_Parts == 1)
+ {
+ return {__proposed::static_simd_cast<_V>(__x)};
+ }
+ else if constexpr (_Parts == 2 && __is_sse_abi<typename _V::abi_type>()
+ && __is_avx_abi<_Ap>())
+ {
+ return {_V(__private_init, __lo128(__data(__x))),
+ _V(__private_init, __hi128(__data(__x)))};
+ }
+ else if constexpr (_V::size() <= CHAR_BIT * sizeof(_ULLong))
+ {
+ const std::bitset __bits = __x.__to_bitset();
+ return __generate_from_n_evaluations<_Parts, std::array<_V, _Parts>>([&](
+ auto __i) constexpr {
+ constexpr size_t __offset = __i * _V::size();
+ return _V(__bitset_init, (__bits >> __offset).to_ullong());
+ });
+ }
+ else
+ {
+ return __generate_from_n_evaluations<_Parts, std::array<_V, _Parts>>([&](
+ auto __i) constexpr {
+ constexpr size_t __offset = __i * _V::size();
+ return _V(
+ __private_init, [&](auto __j) constexpr {
+ return __x[__j + __offset];
+ });
+ });
+ }
+}
+
+// }}}
+// split<_Sizes...>(simd) {{{
+template <size_t... _Sizes, typename _Tp, typename _Ap, typename>
+_GLIBCXX_SIMD_ALWAYS_INLINE
+ std::tuple<simd<_Tp, simd_abi::deduce_t<_Tp, _Sizes>>...>
+ split(const simd<_Tp, _Ap>& __x)
+{
+ using _SL = _SizeList<_Sizes...>;
+ using _Tuple = std::tuple<__deduced_simd<_Tp, _Sizes>...>;
+ constexpr size_t _Np = simd_size_v<_Tp, _Ap>;
+ constexpr size_t _N0 = _SL::template __at<0>();
+ using _V = __deduced_simd<_Tp, _N0>;
+
+ if (__x._M_is_constprop())
+ return __generate_from_n_evaluations<sizeof...(_Sizes), _Tuple>([&](
+ auto __i) constexpr {
+ using _Vi = __deduced_simd<_Tp, _SL::__at(__i)>;
+ constexpr size_t __offset = _SL::__before(__i);
+ return _Vi([&](auto __j) constexpr { return __x[__offset + __j]; });
+ });
+ else if constexpr (_Np == _N0)
+ {
+ static_assert(sizeof...(_Sizes) == 1);
+ return {simd_cast<_V>(__x)};
+ }
+ else if constexpr // split from fixed_size, such that __x::first.size == _N0
+ (__is_fixed_size_abi_v<
+ _Ap> && __fixed_size_storage_t<_Tp, _Np>::_S_first_size == _N0)
+ {
+ static_assert(!__is_fixed_size_abi_v<typename _V::abi_type>,
+ "How can <_Tp, _Np> be a single _SimdTuple entry but a "
+ "fixed_size_simd "
+ "when deduced?");
+ // extract first and recurse (__split_wrapper is needed to deduce a new
+ // _Sizes pack)
+ return std::tuple_cat(
+ std::make_tuple(_V(__private_init, __data(__x).first)),
+ __split_wrapper(_SL::template __pop_front<1>(), __data(__x).second));
+ }
+ else if constexpr ((!std::is_same_v<simd_abi::scalar,
+ simd_abi::deduce_t<_Tp, _Sizes>> && ...)
+ && (!__is_fixed_size_abi_v<
+ simd_abi::deduce_t<_Tp, _Sizes>> && ...))
+ {
+ if constexpr (((_Sizes * 2 == _Np) && ...))
+ return {{__private_init, __extract_part<0, 2>(__data(__x))},
+ {__private_init, __extract_part<1, 2>(__data(__x))}};
+ else if constexpr (std::is_same_v<_SizeList<_Sizes...>,
+ _SizeList<_Np / 3, _Np / 3, _Np / 3>>)
+ return {{__private_init, __extract_part<0, 3>(__data(__x))},
+ {__private_init, __extract_part<1, 3>(__data(__x))},
+ {__private_init, __extract_part<2, 3>(__data(__x))}};
+ else if constexpr (std::is_same_v<_SizeList<_Sizes...>,
+ _SizeList<2 * _Np / 3, _Np / 3>>)
+ return {{__private_init, __extract_part<0, 3, 2>(__data(__x))},
+ {__private_init, __extract_part<2, 3>(__data(__x))}};
+ else if constexpr (std::is_same_v<_SizeList<_Sizes...>,
+ _SizeList<_Np / 3, 2 * _Np / 3>>)
+ return {{__private_init, __extract_part<0, 3>(__data(__x))},
+ {__private_init, __extract_part<1, 3, 2>(__data(__x))}};
+ else if constexpr (std::is_same_v<_SizeList<_Sizes...>,
+ _SizeList<_Np / 2, _Np / 4, _Np / 4>>)
+ return {{__private_init, __extract_part<0, 2>(__data(__x))},
+ {__private_init, __extract_part<2, 4>(__data(__x))},
+ {__private_init, __extract_part<3, 4>(__data(__x))}};
+ else if constexpr (std::is_same_v<_SizeList<_Sizes...>,
+ _SizeList<_Np / 4, _Np / 4, _Np / 2>>)
+ return {{__private_init, __extract_part<0, 4>(__data(__x))},
+ {__private_init, __extract_part<1, 4>(__data(__x))},
+ {__private_init, __extract_part<1, 2>(__data(__x))}};
+ else if constexpr (std::is_same_v<_SizeList<_Sizes...>,
+ _SizeList<_Np / 4, _Np / 2, _Np / 4>>)
+ return {{__private_init, __extract_part<0, 4>(__data(__x))},
+ {__private_init, __extract_center(__data(__x))},
+ {__private_init, __extract_part<3, 4>(__data(__x))}};
+ else if constexpr (((_Sizes * 4 == _Np) && ...))
+ return {{__private_init, __extract_part<0, 4>(__data(__x))},
+ {__private_init, __extract_part<1, 4>(__data(__x))},
+ {__private_init, __extract_part<2, 4>(__data(__x))},
+ {__private_init, __extract_part<3, 4>(__data(__x))}};
+ // else fall through
+ }
+#ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
+ const __may_alias<_Tp>* const __element_ptr
+ = reinterpret_cast<const __may_alias<_Tp>*>(&__x);
+ return __generate_from_n_evaluations<sizeof...(_Sizes), _Tuple>([&](
+ auto __i) constexpr {
+ using _Vi = __deduced_simd<_Tp, _SL::__at(__i)>;
+ constexpr size_t __offset = _SL::__before(__i);
+ constexpr size_t __base_align = alignof(simd<_Tp, _Ap>);
+ constexpr size_t __a
+ = __base_align - ((__offset * sizeof(_Tp)) % __base_align);
+ constexpr size_t __b = ((__a - 1) & __a) ^ __a;
+ constexpr size_t __alignment = __b == 0 ? __a : __b;
+ return _Vi(__element_ptr + __offset, overaligned<__alignment>);
+ });
+#else
+ return __generate_from_n_evaluations<sizeof...(_Sizes), _Tuple>([&](
+ auto __i) constexpr {
+ using _Vi = __deduced_simd<_Tp, _SL::__at(__i)>;
+ const auto& __xx = __data(__x);
+ using _Offset = decltype(_SL::__before(__i));
+ return _Vi([&](auto __j) constexpr {
+ constexpr _SizeConstant<_Offset::value + __j> __k;
+ return __xx[__k];
+ });
+ });
+#endif
+}
+
+// }}}
+
+// __subscript_in_pack {{{
+template <size_t _I, typename _Tp, typename _Ap, typename... _As>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__subscript_in_pack(const simd<_Tp, _Ap>& __x, const simd<_Tp, _As>&... __xs)
+{
+ if constexpr (_I < simd_size_v<_Tp, _Ap>)
+ return __x[_I];
+ else
+ return __subscript_in_pack<_I - simd_size_v<_Tp, _Ap>>(__xs...);
+}
+
+// }}}
+// __store_pack_of_simd {{{
+template <typename _Tp, typename _A0, typename... _As>
+_GLIBCXX_SIMD_INTRINSIC void
+__store_pack_of_simd(char* __mem, const simd<_Tp, _A0>& __x0,
+ const simd<_Tp, _As>&... __xs)
+{
+ constexpr size_t __n_bytes = sizeof(_Tp) * simd_size_v<_Tp, _A0>;
+ __builtin_memcpy(__mem, &__data(__x0), __n_bytes);
+ if constexpr (sizeof...(__xs) > 0)
+ __store_pack_of_simd(__mem + __n_bytes, __xs...);
+}
+
+// }}}
+// concat(simd...) {{{
+template <typename _Tp, typename... _As>
+inline _GLIBCXX_SIMD_CONSTEXPR
+simd<_Tp, simd_abi::deduce_t<_Tp, (simd_size_v<_Tp, _As> + ...)>>
+concat(const simd<_Tp, _As>&... __xs)
+{
+ using _Rp = __deduced_simd<_Tp, (simd_size_v<_Tp, _As> + ...)>;
+ if constexpr(sizeof...(__xs) == 1)
+ return simd_cast<_Rp>(__xs...);
+ else if ((... && __xs._M_is_constprop()))
+ return simd<_Tp,
+ simd_abi::deduce_t<_Tp, (simd_size_v<_Tp, _As> + ...)>>([&](
+ auto __i) constexpr { return __subscript_in_pack<__i>(__xs...); });
+ else
+ {
+ _Rp __r{};
+ __store_pack_of_simd(reinterpret_cast<char*>(&__data(__r)), __xs...);
+ return __r;
+ }
+}
+
+// }}}
+// concat(array<simd>) {{{
+template <typename _Tp, typename _Abi, size_t _Np>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR
+__deduced_simd<_Tp, simd_size_v<_Tp, _Abi> * _Np>
+concat(const std::array<simd<_Tp, _Abi>, _Np>& __x)
+{
+ return __call_with_subscripts<_Np>(__x, [](const auto&... __xs) {
+ return concat(__xs...);
+ });
+}
+
+// }}}
+
+// _SmartReference {{{
+template <typename _Up, typename _Accessor = _Up,
+ typename _ValueType = typename _Up::value_type>
+class _SmartReference
+{
+ friend _Accessor;
+ int _M_index;
+ _Up& _M_obj;
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr _ValueType __read() const noexcept
+ {
+ if constexpr (std::is_arithmetic_v<_Up>)
+ {
+ _GLIBCXX_DEBUG_ASSERT(_M_index == 0);
+ return _M_obj;
+ }
+ else
+ {
+ return _M_obj[_M_index];
+ }
+ }
+
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr void __write(_Tp&& __x) const
+ {
+ _Accessor::__set(_M_obj, _M_index, static_cast<_Tp&&>(__x));
+ }
+
+public:
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference(_Up& __o, int __i) noexcept
+ : _M_index(__i), _M_obj(__o)
+ {}
+
+ using value_type = _ValueType;
+
+ _GLIBCXX_SIMD_INTRINSIC _SmartReference(const _SmartReference&) = delete;
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr operator value_type() const noexcept
+ {
+ return __read();
+ }
+
+ template <typename _Tp,
+ typename = _ValuePreservingOrInt<__remove_cvref_t<_Tp>, value_type>>
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference operator=(_Tp&& __x) &&
+ {
+ __write(static_cast<_Tp&&>(__x));
+ return {_M_obj, _M_index};
+ }
+
+ // TODO: improve with operator.()
+
+#define _GLIBCXX_SIMD_OP_(__op) \
+ template <typename _Tp, \
+ typename _TT \
+ = decltype(std::declval<value_type>() __op std::declval<_Tp>()), \
+ typename = _ValuePreservingOrInt<__remove_cvref_t<_Tp>, _TT>, \
+ typename = _ValuePreservingOrInt<_TT, value_type>> \
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference operator __op##=( \
+ _Tp&& __x)&& \
+ { \
+ const value_type& __lhs = __read(); \
+ __write(__lhs __op __x); \
+ return {_M_obj, _M_index}; \
+ }
+ _GLIBCXX_SIMD_ALL_ARITHMETICS(_GLIBCXX_SIMD_OP_);
+ _GLIBCXX_SIMD_ALL_SHIFTS(_GLIBCXX_SIMD_OP_);
+ _GLIBCXX_SIMD_ALL_BINARY(_GLIBCXX_SIMD_OP_);
+#undef _GLIBCXX_SIMD_OP_
+
+ template <typename _Tp = void,
+ typename = decltype(
+ ++std::declval<std::conditional_t<true, value_type, _Tp>&>())>
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference operator++() &&
+ {
+ value_type __x = __read();
+ __write(++__x);
+ return {_M_obj, _M_index};
+ }
+
+ template <typename _Tp = void,
+ typename = decltype(
+ std::declval<std::conditional_t<true, value_type, _Tp>&>()++)>
+ _GLIBCXX_SIMD_INTRINSIC constexpr value_type operator++(int) &&
+ {
+ const value_type __r = __read();
+ value_type __x = __r;
+ __write(++__x);
+ return __r;
+ }
+
+ template <typename _Tp = void,
+ typename = decltype(
+ --std::declval<std::conditional_t<true, value_type, _Tp>&>())>
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SmartReference operator--() &&
+ {
+ value_type __x = __read();
+ __write(--__x);
+ return {_M_obj, _M_index};
+ }
+
+ template <typename _Tp = void,
+ typename = decltype(
+ std::declval<std::conditional_t<true, value_type, _Tp>&>()--)>
+ _GLIBCXX_SIMD_INTRINSIC constexpr value_type operator--(int) &&
+ {
+ const value_type __r = __read();
+ value_type __x = __r;
+ __write(--__x);
+ return __r;
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC friend void
+ swap(_SmartReference&& __a, _SmartReference&& __b) noexcept(
+ conjunction<
+ std::is_nothrow_constructible<value_type, _SmartReference&&>,
+ std::is_nothrow_assignable<_SmartReference&&, value_type&&>>::value)
+ {
+ value_type __tmp = static_cast<_SmartReference&&>(__a);
+ static_cast<_SmartReference&&>(__a) = static_cast<value_type>(__b);
+ static_cast<_SmartReference&&>(__b) = std::move(__tmp);
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC friend void
+ swap(value_type& __a, _SmartReference&& __b) noexcept(
+ conjunction<
+ std::is_nothrow_constructible<value_type, value_type&&>,
+ std::is_nothrow_assignable<value_type&, value_type&&>,
+ std::is_nothrow_assignable<_SmartReference&&, value_type&&>>::value)
+ {
+ value_type __tmp(std::move(__a));
+ __a = static_cast<value_type>(__b);
+ static_cast<_SmartReference&&>(__b) = std::move(__tmp);
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC friend void
+ swap(_SmartReference&& __a, value_type& __b) noexcept(
+ conjunction<
+ std::is_nothrow_constructible<value_type, _SmartReference&&>,
+ std::is_nothrow_assignable<value_type&, value_type&&>,
+ std::is_nothrow_assignable<_SmartReference&&, value_type&&>>::value)
+ {
+ value_type __tmp(__a);
+ static_cast<_SmartReference&&>(__a) = std::move(__b);
+ __b = std::move(__tmp);
+ }
+};
+
+// }}}
+// __scalar_abi_wrapper {{{
+template <int _Bytes> struct __scalar_abi_wrapper
+{
+ template <typename _Tp, typename _Abi = simd_abi::scalar>
+ static constexpr bool _S_is_valid_v
+ = _Abi::template _IsValid<_Tp>::value && sizeof(_Tp) == _Bytes;
+};
+
+// }}}
+// __decay_abi metafunction {{{
+template <typename _Tp> struct __decay_abi
+{
+ using type = _Tp;
+};
+template <int _Bytes> struct __decay_abi<__scalar_abi_wrapper<_Bytes>>
+{
+ using type = simd_abi::scalar;
+};
+
+// }}}
+// __full_abi metafunction {{{1
+// Given an ABI tag A where A::_S_is_partial == true, define type such that
+// type::_S_is_partial == false and A::_S_full_size<T> == type::size<T> for all
+// valid T
+template <template <int> class _Abi, int _Bytes, typename _Tp> struct __full_abi
+{
+ static constexpr auto __choose()
+ {
+ using _High = _Abi<__next_power_of_2(_Bytes) / 2>;
+ if constexpr (_High::template _S_is_valid_v<
+ _Tp> || _Bytes <= sizeof(_Tp) * 2)
+ return _High();
+ else
+ return
+ typename __full_abi<_Abi, __next_power_of_2(_Bytes) / 2, _Tp>::type();
+ }
+ using type = decltype(__choose());
+};
+
+template <int _Bytes, typename _Tp>
+struct __full_abi<__scalar_abi_wrapper, _Bytes, _Tp>
+{
+ using type = simd_abi::scalar;
+};
+
+// _AbiList {{{1
+template <template <int> class...> struct _AbiList
+{
+ template <typename, int> static constexpr bool _S_has_valid_abi = false;
+ template <typename, int> using _FirstValidAbi = void;
+ template <typename, int> using _BestAbi = void;
+};
+
+template <template <int> class _A0, template <int> class... _Rest>
+struct _AbiList<_A0, _Rest...>
+{
+ template <typename _Tp, int _Np>
+ static constexpr bool _S_has_valid_abi
+ = _A0<sizeof(_Tp) * _Np>::template _S_is_valid_v<
+ _Tp> || _AbiList<_Rest...>::template _S_has_valid_abi<_Tp, _Np>;
+
+ template <typename _Tp, int _Np>
+ using _FirstValidAbi = std::conditional_t<
+ _A0<sizeof(_Tp) * _Np>::template _S_is_valid_v<_Tp>,
+ typename __decay_abi<_A0<sizeof(_Tp) * _Np>>::type,
+ typename _AbiList<_Rest...>::template _FirstValidAbi<_Tp, _Np>>;
+
+ template <typename _Tp, int _Np> static constexpr auto __determine_best_abi()
+ {
+ constexpr int _Bytes = sizeof(_Tp) * _Np;
+ if constexpr (_A0<_Bytes>::template _S_is_valid_v<_Tp>)
+ return typename __decay_abi<_A0<_Bytes>>::type{};
+ else
+ {
+ using _B = typename __full_abi<_A0, _Bytes, _Tp>::type;
+ if constexpr (_B::template _S_is_valid_v<
+ _Tp> && _B::template size<_Tp> <= _Np)
+ return _B{};
+ else
+ return typename _AbiList<_Rest...>::template _BestAbi<_Tp, _Np>{};
+ }
+ }
+
+ template <typename _Tp, int _Np>
+ using _BestAbi = decltype(__determine_best_abi<_Tp, _Np>());
+};
+
+// }}}1
+
+// The following lists all native ABIs, which makes them accessible to
+// simd_abi::deduce and select_best_vector_type_t (for fixed_size). Order
+// matters: whatever comes first has higher priority.
+using _AllNativeAbis = _AbiList<simd_abi::_VecBltnBtmsk, simd_abi::_VecBuiltin,
+ __scalar_abi_wrapper>;
+
+// valid _SimdTraits specialization {{{1
+template <typename _Tp, typename _Abi>
+struct _SimdTraits<_Tp, _Abi,
+ std::void_t<typename _Abi::template _IsValid<_Tp>>>
+ : _Abi::template __traits<_Tp>
+{
+};
+
+// __deduce_impl specializations {{{1
+// try all native ABIs (including scalar) first
+template <typename _Tp, std::size_t _Np>
+struct __deduce_impl<
+ _Tp, _Np, enable_if_t<_AllNativeAbis::template _S_has_valid_abi<_Tp, _Np>>>
+{
+ using type = _AllNativeAbis::_FirstValidAbi<_Tp, _Np>;
+};
+
+// fall back to fixed_size only if scalar and native ABIs don't match
+template <typename _Tp, std::size_t _Np, typename = void>
+struct __deduce_fixed_size_fallback
+{
+};
+template <typename _Tp, std::size_t _Np>
+struct __deduce_fixed_size_fallback<
+ _Tp, _Np, enable_if_t<simd_abi::fixed_size<_Np>::template _S_is_valid_v<_Tp>>>
+{
+ using type = simd_abi::fixed_size<_Np>;
+};
+template <typename _Tp, std::size_t _Np, typename>
+struct __deduce_impl : public __deduce_fixed_size_fallback<_Tp, _Np>
+{
+};
+
+//}}}1
+
+// simd_mask {{{
+template <typename _Tp, typename _Abi>
+class simd_mask : public _SimdTraits<_Tp, _Abi>::_MaskBase
+{
+ // types, tags, and friends {{{
+ using _Traits = _SimdTraits<_Tp, _Abi>;
+ using _MemberType = typename _Traits::_MaskMember;
+ static constexpr _Tp* _S_type_tag = nullptr;
+ friend typename _Traits::_MaskBase;
+ friend class simd<_Tp, _Abi>; // to construct masks on return
+ friend typename _Traits::_SimdImpl; // to construct masks on return and
+ // inspect data on masked operations
+public:
+ using _Impl = typename _Traits::_MaskImpl;
+ friend _Impl;
+ // }}}
+ // member types {{{
+ using value_type = bool;
+ using reference = _SmartReference<_MemberType, _Impl, value_type>;
+ using simd_type = simd<_Tp, _Abi>;
+ using abi_type = _Abi;
+
+ // }}}
+ static constexpr size_t size() { return __size_or_zero_v<_Tp, _Abi>; }
+ // constructors & assignment {{{
+ simd_mask() = default;
+ simd_mask(const simd_mask&) = default;
+ simd_mask(simd_mask&&) = default;
+ simd_mask& operator=(const simd_mask&) = default;
+ simd_mask& operator=(simd_mask&&) = default;
+
+ // }}}
+
+ // access to internal representation (suggested extension) {{{
+ _GLIBCXX_SIMD_ALWAYS_INLINE explicit simd_mask(
+ typename _Traits::_MaskCastType __init)
+ : _M_data{__init}
+ {}
+ // conversion to the internal type is done in _MaskBase
+
+ // }}}
+ // bitset interface (extension to be proposed) {{{
+ // TS_FEEDBACK:
+ // Conversion of simd_mask to and from bitset makes it much easier to
+ // interface with other facilities. I suggest adding `static
+ // simd_mask::from_bitset` and `simd_mask::to_bitset`.
+ _GLIBCXX_SIMD_ALWAYS_INLINE static simd_mask
+ __from_bitset(std::bitset<size()> bs)
+ {
+ return {__bitset_init, bs};
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE std::bitset<size()> __to_bitset() const
+ {
+ return _Impl::__to_bits(_M_data)._M_to_bitset();
+ }
+
+ // }}}
+ // explicit broadcast constructor {{{
+ _GLIBCXX_SIMD_ALWAYS_INLINE explicit _GLIBCXX_SIMD_CONSTEXPR
+ simd_mask(value_type __x)
+ : _M_data(_Impl::template __broadcast<_Tp>(__x))
+ {}
+
+ // }}}
+ // implicit type conversion constructor {{{
+#ifdef _GLIBCXX_SIMD_ENABLE_IMPLICIT_MASK_CAST
+ // proposed improvement
+ template <typename _Up, typename _A2,
+ typename = enable_if_t<simd_size_v<_Up, _A2> == size()>>
+ _GLIBCXX_SIMD_ALWAYS_INLINE explicit(
+ sizeof(_MemberType) != sizeof(typename _SimdTraits<_Up, _A2>::_MaskMember))
+ simd_mask(const simd_mask<_Up, _A2>& __x)
+ : simd_mask(__proposed::static_simd_cast<simd_mask>(__x))
+ {}
+#else
+ // conforming to ISO/IEC 19570:2018
+ template <typename _Up, typename = enable_if_t<conjunction<
+ is_same<abi_type, simd_abi::fixed_size<size()>>,
+ is_same<_Up, _Up>>::value>>
+ _GLIBCXX_SIMD_ALWAYS_INLINE
+ simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
+ : _M_data(_Impl::__from_bitmask(__data(__x), _S_type_tag))
+ {}
+#endif
+ // }}}
+ // load constructor {{{
+ template <typename _Flags>
+ _GLIBCXX_SIMD_ALWAYS_INLINE simd_mask(const value_type* __mem, _Flags)
+ : _M_data(_Impl::template __load<_Tp, _Flags>(__mem))
+ {}
+ template <typename _Flags>
+ _GLIBCXX_SIMD_ALWAYS_INLINE simd_mask(const value_type* __mem, simd_mask __k,
+ _Flags __f)
+ : _M_data{}
+ {
+ _M_data = _Impl::__masked_load(_M_data, __k._M_data, __mem, __f);
+ }
+
+ // }}}
+ // loads [simd_mask.load] {{{
+ template <typename _Flags>
+ _GLIBCXX_SIMD_ALWAYS_INLINE void copy_from(const value_type* __mem, _Flags)
+ {
+ _M_data = _Impl::template __load<_Tp, _Flags>(__mem);
+ }
+
+ // }}}
+ // stores [simd_mask.store] {{{
+ template <typename _Flags>
+ _GLIBCXX_SIMD_ALWAYS_INLINE void copy_to(value_type* __mem, _Flags __f) const
+ {
+ _Impl::__store(_M_data, __mem, __f);
+ }
+
+ // }}}
+ // scalar access {{{
+ _GLIBCXX_SIMD_ALWAYS_INLINE reference operator[](size_t __i)
+ {
+ return {_M_data, int(__i)};
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE value_type
+ operator[]([[maybe_unused]] size_t __i) const
+ {
+ if constexpr (__is_scalar_abi<_Abi>())
+ {
+ _GLIBCXX_DEBUG_ASSERT(__i == 0);
+ return _M_data;
+ }
+ else
+ return static_cast<bool>(_M_data[__i]);
+ }
+
+ // }}}
+ // negation {{{
+ _GLIBCXX_SIMD_ALWAYS_INLINE simd_mask operator!() const
+ {
+ return {__private_init, _Impl::__bit_not(_M_data)};
+ }
+
+ // }}}
+ // simd_mask binary operators [simd_mask.binary] {{{
+#ifdef _GLIBCXX_SIMD_ENABLE_IMPLICIT_MASK_CAST
+ // simd_mask<int> && simd_mask<uint> needs disambiguation
+ template <typename _Up, typename _A2,
+ typename
+ = enable_if_t<is_convertible_v<simd_mask<_Up, _A2>, simd_mask>>>
+ _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
+ operator&&(const simd_mask& __x, const simd_mask<_Up, _A2>& __y)
+ {
+ return {__private_init,
+ _Impl::__logical_and(__x._M_data, simd_mask(__y)._M_data)};
+ }
+ template <typename _Up, typename _A2,
+ typename
+ = enable_if_t<is_convertible_v<simd_mask<_Up, _A2>, simd_mask>>>
+ _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
+ operator||(const simd_mask& __x, const simd_mask<_Up, _A2>& __y)
+ {
+ return {__private_init,
+ _Impl::__logical_or(__x._M_data, simd_mask(__y)._M_data)};
+ }
+#endif // _GLIBCXX_SIMD_ENABLE_IMPLICIT_MASK_CAST
+ _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask operator&&(const simd_mask& __x,
+ const simd_mask& __y)
+ {
+ return {__private_init, _Impl::__logical_and(__x._M_data, __y._M_data)};
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask operator||(const simd_mask& __x,
+ const simd_mask& __y)
+ {
+ return {__private_init, _Impl::__logical_or(__x._M_data, __y._M_data)};
+ }
+
+ _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask operator&(const simd_mask& __x,
+ const simd_mask& __y)
+ {
+ return {__private_init, _Impl::__bit_and(__x._M_data, __y._M_data)};
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask operator|(const simd_mask& __x,
+ const simd_mask& __y)
+ {
+ return {__private_init, _Impl::__bit_or(__x._M_data, __y._M_data)};
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask operator^(const simd_mask& __x,
+ const simd_mask& __y)
+ {
+ return {__private_init, _Impl::__bit_xor(__x._M_data, __y._M_data)};
+ }
+
+ _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask& operator&=(simd_mask& __x,
+ const simd_mask& __y)
+ {
+ __x._M_data = _Impl::__bit_and(__x._M_data, __y._M_data);
+ return __x;
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask& operator|=(simd_mask& __x,
+ const simd_mask& __y)
+ {
+ __x._M_data = _Impl::__bit_or(__x._M_data, __y._M_data);
+ return __x;
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask& operator^=(simd_mask& __x,
+ const simd_mask& __y)
+ {
+ __x._M_data = _Impl::__bit_xor(__x._M_data, __y._M_data);
+ return __x;
+ }
+
+ // }}}
+ // simd_mask compares [simd_mask.comparison] {{{
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
+ operator==(const simd_mask& __x, const simd_mask& __y)
+ {
+ return !operator!=(__x, __y);
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
+ operator!=(const simd_mask& __x, const simd_mask& __y)
+ {
+ return {__private_init, _Impl::__bit_xor(__x._M_data, __y._M_data)};
+ }
+
+ // }}}
+ // private_init ctor {{{
+ _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+ simd_mask(_PrivateInit, typename _Traits::_MaskMember __init)
+ : _M_data(__init)
+ {}
+
+ // }}}
+ // private_init generator ctor {{{
+ template <typename _Fp,
+ typename = decltype(bool(std::declval<_Fp>()(size_t())))>
+ _GLIBCXX_SIMD_INTRINSIC constexpr simd_mask(_PrivateInit, _Fp&& __gen)
+ : _M_data()
+ {
+ __execute_n_times<size()>(
+ [&](auto __i) constexpr { _Impl::__set(_M_data, __i, __gen(__i)); });
+ }
+
+ // }}}
+ // bitset_init ctor {{{
+ _GLIBCXX_SIMD_INTRINSIC simd_mask(_BitsetInit, std::bitset<size()> __init)
+ : _M_data(
+ _Impl::__from_bitmask(_SanitizedBitMask<size()>(__init), _S_type_tag))
+ {}
+
+ // }}}
+ // __cvt {{{
+ // TS_FEEDBACK:
+ // The conversion operator this implements should be a ctor on simd_mask.
+ // Once you call .__cvt() on a simd_mask it converts conveniently.
+ // A useful variation: add `explicit(sizeof(_Tp) != sizeof(_Up))`
+ struct _CvtProxy
+ {
+ template <typename _Up, typename _A2,
+ typename
+ = enable_if_t<simd_size_v<_Up, _A2> == simd_size_v<_Tp, _Abi>>>
+ operator simd_mask<_Up, _A2>() &&
+ {
+ using namespace std::experimental::__proposed;
+ return static_simd_cast<simd_mask<_Up, _A2>>(_M_data);
+ }
+
+ const simd_mask<_Tp, _Abi>& _M_data;
+ };
+ _GLIBCXX_SIMD_INTRINSIC _CvtProxy __cvt() const { return {*this}; }
+ // }}}
+ // operator?: overloads (suggested extension) {{{
+#ifdef __GXX_CONDITIONAL_IS_OVERLOADABLE__
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
+ operator?:(const simd_mask& __k, const simd_mask& __where_true,
+ const simd_mask& __where_false)
+ {
+ auto __ret = __where_false;
+ _Impl::__masked_assign(__k._M_data, __ret._M_data, __where_true._M_data);
+ return __ret;
+ }
+
+ template <typename _U1, typename _U2,
+ typename _Rp = simd<common_type_t<_U1, _U2>, _Abi>,
+ typename = enable_if_t<conjunction_v<
+ is_convertible<_U1, _Rp>, is_convertible<_U2, _Rp>,
+ is_convertible<simd_mask, typename _Rp::mask_type>>>>
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend _Rp
+ operator?:(const simd_mask& __k, const _U1& __where_true,
+ const _U2& __where_false)
+ {
+ _Rp __ret = __where_false;
+ _Rp::_Impl::__masked_assign(__data(
+ static_cast<typename _Rp::mask_type>(__k)),
+ __data(__ret),
+ __data(static_cast<_Rp>(__where_true)));
+ return __ret;
+ }
+
+#ifdef _GLIBCXX_SIMD_ENABLE_IMPLICIT_MASK_CAST
+ template <typename _Kp, typename _Ak, typename _Up, typename _Au,
+ typename = enable_if_t<
+ conjunction_v<is_convertible<simd_mask<_Kp, _Ak>, simd_mask>,
+ is_convertible<simd_mask<_Up, _Au>, simd_mask>>>>
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
+ operator?:(const simd_mask<_Kp, _Ak>& __k, const simd_mask& __where_true,
+ const simd_mask<_Up, _Au>& __where_false)
+ {
+ simd_mask __ret = __where_false;
+ _Impl::__masked_assign(simd_mask(__k)._M_data, __ret._M_data,
+ __where_true._M_data);
+ return __ret;
+ }
+#endif // _GLIBCXX_SIMD_ENABLE_IMPLICIT_MASK_CAST
+#endif // __GXX_CONDITIONAL_IS_OVERLOADABLE__
+ // }}}
+ // _M_is_constprop {{{
+ _GLIBCXX_SIMD_INTRINSIC
+ constexpr bool _M_is_constprop() const
+ {
+ if constexpr (__is_scalar_abi<_Abi>())
+ return __builtin_constant_p(_M_data);
+ else
+ return _M_data._M_is_constprop();
+ }
+
+ // }}}
+
+private:
+ friend const auto& __data<_Tp, abi_type>(const simd_mask&);
+ friend auto& __data<_Tp, abi_type>(simd_mask&);
+ alignas(_Traits::_S_mask_align) _MemberType _M_data;
+};
+
+// }}}
+
+// __data(simd_mask) {{{
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr const auto&
+__data(const simd_mask<_Tp, _Ap>& __x)
+{
+ return __x._M_data;
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto&
+__data(simd_mask<_Tp, _Ap>& __x)
+{
+ return __x._M_data;
+}
+// }}}
+
+// simd_mask reductions [simd_mask.reductions] {{{
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+all_of(const simd_mask<_Tp, _Abi>& __k) noexcept
+{
+ if (__builtin_is_constant_evaluated() || __k._M_is_constprop())
+ {
+ for (size_t __i = 0; __i < simd_size_v<_Tp, _Abi>; ++__i)
+ if (!__k[__i])
+ return false;
+ return true;
+ }
+ else
+ return _Abi::_MaskImpl::__all_of(__k);
+}
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+any_of(const simd_mask<_Tp, _Abi>& __k) noexcept
+{
+ if (__builtin_is_constant_evaluated() || __k._M_is_constprop())
+ {
+ for (size_t __i = 0; __i < simd_size_v<_Tp, _Abi>; ++__i)
+ if (__k[__i])
+ return true;
+ return false;
+ }
+ else
+ return _Abi::_MaskImpl::__any_of(__k);
+}
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+none_of(const simd_mask<_Tp, _Abi>& __k) noexcept
+{
+ if (__builtin_is_constant_evaluated() || __k._M_is_constprop())
+ {
+ for (size_t __i = 0; __i < simd_size_v<_Tp, _Abi>; ++__i)
+ if (__k[__i])
+ return false;
+ return true;
+ }
+ else
+ return _Abi::_MaskImpl::__none_of(__k);
+}
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+some_of(const simd_mask<_Tp, _Abi>& __k) noexcept
+{
+ if (__builtin_is_constant_evaluated() || __k._M_is_constprop())
+ {
+ for (size_t __i = 1; __i < simd_size_v<_Tp, _Abi>; ++__i)
+ if (__k[__i] != __k[__i - 1])
+ return true;
+ return false;
+ }
+ else
+ return _Abi::_MaskImpl::__some_of(__k);
+}
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR int
+popcount(const simd_mask<_Tp, _Abi>& __k) noexcept
+{
+ if (__builtin_is_constant_evaluated() || __k._M_is_constprop())
+ {
+ int __r = 0;
+ for (size_t __i = 0; __i < simd_size_v<_Tp, _Abi>; ++__i)
+ if (__k[__i])
+ ++__r;
+ return __r;
+ }
+ else
+ return _Abi::_MaskImpl::__popcount(__k);
+}
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR int
+find_first_set(const simd_mask<_Tp, _Abi>& __k)
+{
+ if (__builtin_is_constant_evaluated() || __k._M_is_constprop())
+ {
+ for (size_t __i = 0; __i < simd_size_v<_Tp, _Abi>; ++__i)
+ if (__k[__i])
+ return __i;
+ __builtin_unreachable(); // make none_of(__k) UB/ill-formed
+ }
+ else
+ return _Abi::_MaskImpl::__find_first_set(__k);
+}
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR int
+find_last_set(const simd_mask<_Tp, _Abi>& __k)
+{
+ if (__builtin_is_constant_evaluated() || __k._M_is_constprop())
+ {
+ for (size_t __i = simd_size_v<_Tp, _Abi>; __i > 0; --__i)
+ if (__k[__i - 1])
+ return __i - 1;
+ __builtin_unreachable(); // make none_of(__k) UB/ill-formed
+ }
+ else
+ return _Abi::_MaskImpl::__find_last_set(__k);
+}
+
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+all_of(_ExactBool __x) noexcept
+{
+ return __x;
+}
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+any_of(_ExactBool __x) noexcept
+{
+ return __x;
+}
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+none_of(_ExactBool __x) noexcept
+{
+ return !__x;
+}
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bool
+some_of(_ExactBool) noexcept
+{
+ return false;
+}
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR int
+popcount(_ExactBool __x) noexcept
+{
+ return __x;
+}
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR int
+find_first_set(_ExactBool)
+{
+ return 0;
+}
+_GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR int
+find_last_set(_ExactBool)
+{
+ return 0;
+}
+
+// }}}
+
+// _SimdIntOperators {{{1
+template <typename _V, typename _Impl, bool> class _SimdIntOperators
+{
+};
+
+template <typename _V, typename _Impl> class _SimdIntOperators<_V, _Impl, true>
+{
+ _GLIBCXX_SIMD_INTRINSIC const _V& __derived() const
+ {
+ return *static_cast<const _V*>(this);
+ }
+
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _GLIBCXX_SIMD_CONSTEXPR _V
+ __make_derived(_Tp&& __d)
+ {
+ return {__private_init, static_cast<_Tp&&>(__d)};
+ }
+
+public:
+ _GLIBCXX_SIMD_CONSTEXPR friend _V& operator%=(_V& __lhs, const _V& __x)
+ {
+ return __lhs = __lhs % __x;
+ }
+ _GLIBCXX_SIMD_CONSTEXPR friend _V& operator&=(_V& __lhs, const _V& __x)
+ {
+ return __lhs = __lhs & __x;
+ }
+ _GLIBCXX_SIMD_CONSTEXPR friend _V& operator|=(_V& __lhs, const _V& __x)
+ {
+ return __lhs = __lhs | __x;
+ }
+ _GLIBCXX_SIMD_CONSTEXPR friend _V& operator^=(_V& __lhs, const _V& __x)
+ {
+ return __lhs = __lhs ^ __x;
+ }
+ _GLIBCXX_SIMD_CONSTEXPR friend _V& operator<<=(_V& __lhs, const _V& __x)
+ {
+ return __lhs = __lhs << __x;
+ }
+ _GLIBCXX_SIMD_CONSTEXPR friend _V& operator>>=(_V& __lhs, const _V& __x)
+ {
+ return __lhs = __lhs >> __x;
+ }
+ _GLIBCXX_SIMD_CONSTEXPR friend _V& operator<<=(_V& __lhs, int __x)
+ {
+ return __lhs = __lhs << __x;
+ }
+ _GLIBCXX_SIMD_CONSTEXPR friend _V& operator>>=(_V& __lhs, int __x)
+ {
+ return __lhs = __lhs >> __x;
+ }
+
+ _GLIBCXX_SIMD_CONSTEXPR friend _V operator%(const _V& __x, const _V& __y)
+ {
+ return _SimdIntOperators::__make_derived(
+ _Impl::__modulus(__data(__x), __data(__y)));
+ }
+ _GLIBCXX_SIMD_CONSTEXPR friend _V operator&(const _V& __x, const _V& __y)
+ {
+ return _SimdIntOperators::__make_derived(
+ _Impl::__bit_and(__data(__x), __data(__y)));
+ }
+ _GLIBCXX_SIMD_CONSTEXPR friend _V operator|(const _V& __x, const _V& __y)
+ {
+ return _SimdIntOperators::__make_derived(
+ _Impl::__bit_or(__data(__x), __data(__y)));
+ }
+ _GLIBCXX_SIMD_CONSTEXPR friend _V operator^(const _V& __x, const _V& __y)
+ {
+ return _SimdIntOperators::__make_derived(
+ _Impl::__bit_xor(__data(__x), __data(__y)));
+ }
+ _GLIBCXX_SIMD_CONSTEXPR friend _V operator<<(const _V& __x, const _V& __y)
+ {
+ return _SimdIntOperators::__make_derived(
+ _Impl::__bit_shift_left(__data(__x), __data(__y)));
+ }
+ _GLIBCXX_SIMD_CONSTEXPR friend _V operator>>(const _V& __x, const _V& __y)
+ {
+ return _SimdIntOperators::__make_derived(
+ _Impl::__bit_shift_right(__data(__x), __data(__y)));
+ }
+ _GLIBCXX_SIMD_CONSTEXPR friend _V operator<<(const _V& __x, int __y)
+ {
+ return _SimdIntOperators::__make_derived(
+ _Impl::__bit_shift_left(__data(__x), __y));
+ }
+ _GLIBCXX_SIMD_CONSTEXPR friend _V operator>>(const _V& __x, int __y)
+ {
+ return _SimdIntOperators::__make_derived(
+ _Impl::__bit_shift_right(__data(__x), __y));
+ }
+
+ // unary operators (for integral _Tp)
+ _GLIBCXX_SIMD_CONSTEXPR _V operator~() const
+ {
+ return {__private_init, _Impl::__complement(__derived()._M_data)};
+ }
+};
+
+//}}}1
+
+// simd {{{
+template <typename _Tp, typename _Abi>
+class simd : public _SimdIntOperators<
+ simd<_Tp, _Abi>, typename _SimdTraits<_Tp, _Abi>::_SimdImpl,
+ conjunction<std::is_integral<_Tp>,
+ typename _SimdTraits<_Tp, _Abi>::_IsValid>::value>,
+ public _SimdTraits<_Tp, _Abi>::_SimdBase
+{
+ using _Traits = _SimdTraits<_Tp, _Abi>;
+ using _MemberType = typename _Traits::_SimdMember;
+ using _CastType = typename _Traits::_SimdCastType;
+ static constexpr _Tp* _S_type_tag = nullptr;
+ friend typename _Traits::_SimdBase;
+
+public:
+ using _Impl = typename _Traits::_SimdImpl;
+ friend _Impl;
+ friend _SimdIntOperators<simd, _Impl, true>;
+
+ using value_type = _Tp;
+ using reference = _SmartReference<_MemberType, _Impl, value_type>;
+ using mask_type = simd_mask<_Tp, _Abi>;
+ using abi_type = _Abi;
+
+ static constexpr size_t size() { return __size_or_zero_v<_Tp, _Abi>; }
+ _GLIBCXX_SIMD_CONSTEXPR simd() = default;
+ _GLIBCXX_SIMD_CONSTEXPR simd(const simd&) = default;
+ _GLIBCXX_SIMD_CONSTEXPR simd(simd&&) noexcept = default;
+ _GLIBCXX_SIMD_CONSTEXPR simd& operator=(const simd&) = default;
+ _GLIBCXX_SIMD_CONSTEXPR simd& operator=(simd&&) noexcept = default;
+
+ // implicit broadcast constructor
+ template <typename _Up, typename = _ValuePreservingOrInt<_Up, value_type>>
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd(_Up&& __x)
+ : _M_data(
+ _Impl::__broadcast(static_cast<value_type>(static_cast<_Up&&>(__x))))
+ {}
+
+ // implicit type conversion constructor (convert from fixed_size to
+ // fixed_size)
+ template <typename _Up>
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR
+ simd(const simd<_Up, simd_abi::fixed_size<size()>>& __x,
+ enable_if_t<
+ conjunction<std::is_same<simd_abi::fixed_size<size()>, abi_type>,
+ std::negation<__is_narrowing_conversion<_Up, value_type>>,
+ __converts_to_higher_integer_rank<_Up, value_type>>::value,
+ void*> = nullptr)
+ : simd{static_cast<std::array<_Up, size()>>(__x).data(), vector_aligned}
+ {}
+
+ // explicit type conversion constructor
+#ifdef _GLIBCXX_SIMD_ENABLE_STATIC_CAST
+ template <typename _Up, typename _A2,
+ typename = decltype(
+ static_simd_cast<simd>(std::declval<const simd<_Up, _A2>&>()))>
+ _GLIBCXX_SIMD_ALWAYS_INLINE explicit _GLIBCXX_SIMD_CONSTEXPR
+ simd(const simd<_Up, _A2>& __x)
+ : simd(static_simd_cast<simd>(__x))
+ {}
+#endif // _GLIBCXX_SIMD_ENABLE_STATIC_CAST
+
+ // generator constructor
+ template <typename _Fp>
+ _GLIBCXX_SIMD_ALWAYS_INLINE explicit _GLIBCXX_SIMD_CONSTEXPR
+ simd(_Fp&& __gen, _ValuePreservingOrInt<decltype(std::declval<_Fp>()(
+ std::declval<_SizeConstant<0>&>())),
+ value_type>* = nullptr)
+ : _M_data(_Impl::__generator(static_cast<_Fp&&>(__gen), _S_type_tag))
+ {}
+
+ // load constructor
+ template <typename _Up, typename _Flags>
+ _GLIBCXX_SIMD_ALWAYS_INLINE simd(const _Up* __mem, _Flags __f)
+ : _M_data(_Impl::__load(__mem, __f, _S_type_tag))
+ {}
+
+ // loads [simd.load]
+ template <typename _Up, typename _Flags>
+ _GLIBCXX_SIMD_ALWAYS_INLINE void copy_from(const _Vectorizable<_Up>* __mem,
+ _Flags __f)
+ {
+ _M_data
+ = static_cast<decltype(_M_data)>(_Impl::__load(__mem, __f, _S_type_tag));
+ }
+
+ // stores [simd.store]
+ template <typename _Up, typename _Flags>
+ _GLIBCXX_SIMD_ALWAYS_INLINE void copy_to(_Vectorizable<_Up>* __mem,
+ _Flags __f) const
+ {
+ _Impl::__store(_M_data, __mem, __f, _S_type_tag);
+ }
+
+ // scalar access
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR reference
+ operator[](size_t __i)
+ {
+ return {_M_data, int(__i)};
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR value_type
+ operator[]([[maybe_unused]] size_t __i) const
+ {
+ if constexpr (__is_scalar_abi<_Abi>())
+ {
+ _GLIBCXX_DEBUG_ASSERT(__i == 0);
+ return _M_data;
+ }
+ else
+ {
+ return _M_data[__i];
+ }
+ }
+
+ // increment and decrement:
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd& operator++()
+ {
+ _Impl::__increment(_M_data);
+ return *this;
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd operator++(int)
+ {
+ simd __r = *this;
+ _Impl::__increment(_M_data);
+ return __r;
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd& operator--()
+ {
+ _Impl::__decrement(_M_data);
+ return *this;
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd operator--(int)
+ {
+ simd __r = *this;
+ _Impl::__decrement(_M_data);
+ return __r;
+ }
+
+ // unary operators (for any _Tp)
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR mask_type
+ operator!() const
+ {
+ return {__private_init, _Impl::__negate(_M_data)};
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd operator+() const
+ {
+ return *this;
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd operator-() const
+ {
+ return {__private_init, _Impl::__unary_minus(_M_data)};
+ }
+
+ // access to internal representation (suggested extension)
+ _GLIBCXX_SIMD_ALWAYS_INLINE explicit _GLIBCXX_SIMD_CONSTEXPR
+ simd(_CastType __init)
+ : _M_data(__init)
+ {}
+
+ // compound assignment [simd.cassign]
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd&
+ operator+=(simd& __lhs, const simd& __x)
+ {
+ return __lhs = __lhs + __x;
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd&
+ operator-=(simd& __lhs, const simd& __x)
+ {
+ return __lhs = __lhs - __x;
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd&
+ operator*=(simd& __lhs, const simd& __x)
+ {
+ return __lhs = __lhs * __x;
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd&
+ operator/=(simd& __lhs, const simd& __x)
+ {
+ return __lhs = __lhs / __x;
+ }
+
+ // binary operators [simd.binary]
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd
+ operator+(const simd& __x, const simd& __y)
+ {
+ return {__private_init, _Impl::__plus(__x._M_data, __y._M_data)};
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd
+ operator-(const simd& __x, const simd& __y)
+ {
+ return {__private_init, _Impl::__minus(__x._M_data, __y._M_data)};
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd
+ operator*(const simd& __x, const simd& __y)
+ {
+ return {__private_init, _Impl::__multiplies(__x._M_data, __y._M_data)};
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd
+ operator/(const simd& __x, const simd& __y)
+ {
+ return {__private_init, _Impl::__divides(__x._M_data, __y._M_data)};
+ }
+
+ // compares [simd.comparison]
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend mask_type
+ operator==(const simd& __x, const simd& __y)
+ {
+ return simd::__make_mask(_Impl::__equal_to(__x._M_data, __y._M_data));
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend mask_type
+ operator!=(const simd& __x, const simd& __y)
+ {
+ return simd::__make_mask(_Impl::__not_equal_to(__x._M_data, __y._M_data));
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend mask_type
+ operator<(const simd& __x, const simd& __y)
+ {
+ return simd::__make_mask(_Impl::__less(__x._M_data, __y._M_data));
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend mask_type
+ operator<=(const simd& __x, const simd& __y)
+ {
+ return simd::__make_mask(_Impl::__less_equal(__x._M_data, __y._M_data));
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend mask_type
+ operator>(const simd& __x, const simd& __y)
+ {
+ return simd::__make_mask(_Impl::__less(__y._M_data, __x._M_data));
+ }
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend mask_type
+ operator>=(const simd& __x, const simd& __y)
+ {
+ return simd::__make_mask(_Impl::__less_equal(__y._M_data, __x._M_data));
+ }
+
+ // operator?: overloads (suggested extension) {{{
+#ifdef __GXX_CONDITIONAL_IS_OVERLOADABLE__
+ _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd
+ operator?:(const mask_type& __k, const simd& __where_true,
+ const simd& __where_false)
+ {
+ auto __ret = __where_false;
+ _Impl::__masked_assign(__data(__k), __data(__ret), __data(__where_true));
+ return __ret;
+ }
+#endif // __GXX_CONDITIONAL_IS_OVERLOADABLE__
+ // }}}
+
+ // "private" because of the first argument's namespace
+ _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+ simd(_PrivateInit, const _MemberType& __init)
+ : _M_data(__init)
+ {}
+
+ // "private" because of the first argument's namespace
+ _GLIBCXX_SIMD_INTRINSIC simd(_BitsetInit, std::bitset<size()> __init)
+ : _M_data()
+ {
+ where(mask_type(__bitset_init, __init), *this) = ~*this;
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC
+ constexpr bool _M_is_constprop() const
+ {
+ if constexpr (__is_scalar_abi<_Abi>())
+ return __builtin_constant_p(_M_data);
+ else
+ return _M_data._M_is_constprop();
+ }
+
+private:
+ _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR static mask_type
+ __make_mask(typename mask_type::_MemberType __k)
+ {
+ return {__private_init, __k};
+ }
+
+ friend const auto& __data<value_type, abi_type>(const simd&);
+ friend auto& __data<value_type, abi_type>(simd&);
+ alignas(_Traits::_S_simd_align) _MemberType _M_data;
+};
+
+// }}}
+// __data {{{
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr const auto&
+__data(const simd<_Tp, _Ap>& __x)
+{
+ return __x._M_data;
+}
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto&
+__data(simd<_Tp, _Ap>& __x)
+{
+ return __x._M_data;
+}
+// }}}
+
+namespace __proposed {
+namespace float_bitwise_operators {
+// float_bitwise_operators {{{
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
+operator^(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
+{
+ return {__private_init, _Ap::_SimdImpl::__bit_xor(__data(__a), __data(__b))};
+}
+
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
+operator|(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
+{
+ return {__private_init, _Ap::_SimdImpl::__bit_or(__data(__a), __data(__b))};
+}
+
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR simd<_Tp, _Ap>
+operator&(const simd<_Tp, _Ap>& __a, const simd<_Tp, _Ap>& __b)
+{
+ return {__private_init, _Ap::_SimdImpl::__bit_and(__data(__a), __data(__b))};
+}
+// }}}
+} // namespace float_bitwise_operators
+} // namespace __proposed
+
+_GLIBCXX_SIMD_END_NAMESPACE
+
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_H
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/include/experimental/bits/simd_builtin.h b/libstdc++-v3/include/experimental/bits/simd_builtin.h
new file mode 100644
index 00000000000..4dbdce95797
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_builtin.h
@@ -0,0 +1,2854 @@
+// Simd Abi specific implementations -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_ABIS_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_ABIS_H_
+
+#if __cplusplus >= 201703L
+
+#include <array>
+#include <cmath>
+#include <cstdlib>
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+// _S_allbits{{{
+template <typename _V>
+static inline constexpr _V _S_allbits
+ = reinterpret_cast<_V>(~__vector_type_t<char, sizeof(_V) / sizeof(char)>());
+
+// }}}
+// _S_signmask, _S_absmask{{{
+template <typename _V, typename = _VectorTraits<_V>>
+static inline constexpr _V _S_signmask = __xor(_V() + 1, _V() - 1);
+template <typename _V, typename = _VectorTraits<_V>>
+static inline constexpr _V _S_absmask
+ = __andnot(_S_signmask<_V>, _S_allbits<_V>);
+
+//}}}
+// __vector_permute<Indices...>{{{
+// Index == -1 requests zeroing of the output element
+template <int... _Indices, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_Tp
+__vector_permute(_Tp __x)
+{
+ static_assert(sizeof...(_Indices) == _TVT::_S_width);
+ return __make_vector<typename _TVT::value_type>(
+ (_Indices == -1 ? 0 : __x[_Indices == -1 ? 0 : _Indices])...);
+}
+
+// }}}
+// __vector_shuffle<Indices...>{{{
+// Index == -1 requests zeroing of the output element
+template <int... _Indices, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_Tp
+__vector_shuffle(_Tp __x, _Tp __y)
+{
+ return _Tp{(_Indices == -1 ? 0
+ : _Indices < _TVT::_S_width
+ ? __x[_Indices]
+ : __y[_Indices - _TVT::_S_width])...};
+}
+
+// }}}
+// __make_wrapper{{{
+template <typename _Tp, typename... _Args>
+_GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper<_Tp, sizeof...(_Args)>
+__make_wrapper(const _Args&... __args)
+{
+ return __make_vector<_Tp>(__args...);
+}
+
+// }}}
+// __wrapper_bitcast{{{
+template <typename _Tp, size_t _ToN = 0, typename _Up, size_t _M,
+ size_t _Np = _ToN != 0 ? _ToN : sizeof(_Up) * _M / sizeof(_Tp)>
+_GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper<_Tp, _Np>
+__wrapper_bitcast(_SimdWrapper<_Up, _M> __x)
+{
+ static_assert(_Np > 1);
+ return __intrin_bitcast<__vector_type_t<_Tp, _Np>>(__x._M_data);
+}
+
+// }}}
+// __shift_elements_right{{{
+// if (__shift % 2ⁿ == 0) => the low n Bytes are correct
+template <unsigned __shift, typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _Tp
+__shift_elements_right(_Tp __v)
+{
+ [[maybe_unused]] const auto __iv = __to_intrin(__v);
+ static_assert(__shift <= sizeof(_Tp));
+ if constexpr (__shift == 0)
+ return __v;
+ else if constexpr (__shift == sizeof(_Tp))
+ return _Tp();
+#if _GLIBCXX_SIMD_X86INTRIN // {{{
+ else if constexpr (__have_sse && __shift == 8
+ && _TVT::template __is<float, 4>)
+ return _mm_movehl_ps(__iv, __iv);
+ else if constexpr (__have_sse2 && __shift == 8
+ && _TVT::template __is<double, 2>)
+ return _mm_unpackhi_pd(__iv, __iv);
+ else if constexpr (__have_sse2 && sizeof(_Tp) == 16)
+ return reinterpret_cast<typename _TVT::type>(
+ _mm_srli_si128(reinterpret_cast<__m128i>(__iv), __shift));
+ else if constexpr (__shift == 16 && sizeof(_Tp) == 32)
+ {
+ /*if constexpr (__have_avx && _TVT::template __is<double, 4>)
+ return _mm256_permute2f128_pd(__iv, __iv, 0x81);
+ else if constexpr (__have_avx && _TVT::template __is<float, 8>)
+ return _mm256_permute2f128_ps(__iv, __iv, 0x81);
+ else if constexpr (__have_avx)
+ return reinterpret_cast<typename _TVT::type>(
+ _mm256_permute2f128_si256(__iv, __iv, 0x81));
+ else*/
+ return __zero_extend(__hi128(__v));
+ }
+ else if constexpr (__have_avx2 && sizeof(_Tp) == 32 && __shift < 16)
+ {
+ const auto __vll = __vector_bitcast<_LLong>(__v);
+ return reinterpret_cast<typename _TVT::type>(
+ _mm256_alignr_epi8(_mm256_permute2x128_si256(__vll, __vll, 0x81), __vll,
+ __shift));
+ }
+ else if constexpr (__have_avx && sizeof(_Tp) == 32 && __shift < 16)
+ {
+ const auto __vll = __vector_bitcast<_LLong>(__v);
+ return reinterpret_cast<typename _TVT::type>(
+ __concat(_mm_alignr_epi8(__hi128(__vll), __lo128(__vll), __shift),
+ _mm_srli_si128(__hi128(__vll), __shift)));
+ }
+ else if constexpr (sizeof(_Tp) == 32 && __shift > 16)
+ return __zero_extend(__shift_elements_right<__shift - 16>(__hi128(__v)));
+ else if constexpr (sizeof(_Tp) == 64 && __shift == 32)
+ return __zero_extend(__hi256(__v));
+ else if constexpr (__have_avx512f && sizeof(_Tp) == 64)
+ {
+ if constexpr (__shift >= 48)
+ return __zero_extend(
+ __shift_elements_right<__shift - 48>(__extract<3, 4>(__v)));
+ else if constexpr (__shift >= 32)
+ return __zero_extend(
+ __shift_elements_right<__shift - 32>(__hi256(__v)));
+ else if constexpr (__shift % 8 == 0)
+ return reinterpret_cast<typename _TVT::type>(
+ _mm512_alignr_epi64(__m512i(), __intrin_bitcast<__m512i>(__v),
+ __shift / 8));
+ else if constexpr (__shift % 4 == 0)
+ return reinterpret_cast<typename _TVT::type>(
+ _mm512_alignr_epi32(__m512i(), __intrin_bitcast<__m512i>(__v),
+ __shift / 4));
+ else if constexpr (__have_avx512bw && __shift < 16)
+ {
+ const auto __vll = __vector_bitcast<_LLong>(__v);
+ return reinterpret_cast<typename _TVT::type>(
+ _mm512_alignr_epi8(_mm512_shuffle_i32x4(__vll, __vll, 0xf9), __vll,
+ __shift));
+ }
+ else if constexpr (__have_avx512bw && __shift < 32)
+ {
+ const auto __vll = __vector_bitcast<_LLong>(__v);
+ return reinterpret_cast<typename _TVT::type>(
+ _mm512_alignr_epi8(_mm512_shuffle_i32x4(__vll, __m512i(), 0xee),
+ _mm512_shuffle_i32x4(__vll, __vll, 0xf9),
+ __shift - 16));
+ }
+ else
+ __assert_unreachable<_Tp>();
+ }
+/*
+ } else if constexpr (__shift % 16 == 0 && sizeof(_Tp) == 64)
+ return __auto_bitcast(__extract<__shift / 16, 4>(__v));
+*/
+#endif // _GLIBCXX_SIMD_X86INTRIN }}}
+ else
+ {
+ constexpr int __chunksize
+ = __shift % 8 == 0 ? 8
+ : __shift % 4 == 0 ? 4 : __shift % 2 == 0 ? 2 : 1;
+ auto __w = __vector_bitcast<__int_with_sizeof_t<__chunksize>>(__v);
+ using _Up = decltype(__w);
+ return __intrin_bitcast<_Tp>(
+ __call_with_n_evaluations<(sizeof(_Tp) - __shift) / __chunksize>(
+ [](auto... __chunks) { return _Up{__chunks...}; },
+ [&](auto __i) { return __w[__shift / __chunksize + __i]; }));
+ }
+}
+
+// }}}
+// __extract_part(_SimdWrapper<_Tp, _Np>) {{{
+template <int _Index, int _Total, int _Combine, typename _Tp, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC
+ _GLIBCXX_CONST _SimdWrapper<_Tp, _Np / _Total * _Combine>
+ __extract_part(const _SimdWrapper<_Tp, _Np> __x)
+{
+ if constexpr (_Index % 2 == 0 && _Total % 2 == 0 && _Combine % 2 == 0)
+ return __extract_part<_Index / 2, _Total / 2, _Combine / 2>(__x);
+ else
+ {
+ constexpr size_t __values_per_part = _Np / _Total;
+ constexpr size_t __values_to_skip = _Index * __values_per_part;
+ constexpr size_t __return_size = __values_per_part * _Combine;
+ using _R = __vector_type_t<_Tp, __return_size>;
+ static_assert((_Index + _Combine) * __values_per_part * sizeof(_Tp)
+ <= sizeof(__x),
+ "out of bounds __extract_part");
+ // the following assertion would ensure no "padding" to be read
+ // static_assert(_Total >= _Index + _Combine, "_Total must be greater than
+ // _Index");
+
+ // static_assert(__return_size * _Total == _Np, "_Np must be divisible by
+ // _Total");
+ if (__x._M_is_constprop())
+ return __generate_from_n_evaluations<__return_size, _R>(
+ [&](auto __i) { return __x[__values_to_skip + __i]; });
+ if constexpr (_Index == 0 && _Total == 1)
+ return __x;
+ else if constexpr (_Index == 0)
+ return __intrin_bitcast<_R>(__as_vector(__x));
+#if _GLIBCXX_SIMD_X86INTRIN // {{{
+ else if constexpr (sizeof(__x) == 32 && __return_size * sizeof(_Tp) <= 16)
+ {
+ constexpr size_t __bytes_to_skip = __values_to_skip * sizeof(_Tp);
+ if constexpr (__bytes_to_skip == 16)
+ return __vector_bitcast<_Tp, __return_size>(
+ __hi128(__as_vector(__x)));
+ else
+ return __vector_bitcast<_Tp, __return_size>(
+ _mm_alignr_epi8(__hi128(__vector_bitcast<_LLong>(__x)),
+ __lo128(__vector_bitcast<_LLong>(__x)),
+ __bytes_to_skip));
+ }
+#endif // _GLIBCXX_SIMD_X86INTRIN }}}
+ else if constexpr (_Index > 0
+ && (__values_to_skip % __return_size != 0
+ || sizeof(_R) >= 8)
+ && (__values_to_skip + __return_size) * sizeof(_Tp)
+ <= 64
+ && sizeof(__x) >= 16)
+ return __intrin_bitcast<_R>(
+ __shift_elements_right<__values_to_skip * sizeof(_Tp)>(
+ __as_vector(__x)));
+ else
+ {
+ _R __r = {};
+ __builtin_memcpy(&__r,
+ reinterpret_cast<const char*>(&__x)
+ + sizeof(_Tp) * __values_to_skip,
+ __return_size * sizeof(_Tp));
+ return __r;
+ }
+ }
+}
+
+// }}}
+// __extract_part(_SimdWrapper<bool, _Np>) {{{
+template <int _Index, int _Total, int _Combine = 1, size_t _Np>
+_GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper<bool, _Np / _Total * _Combine>
+__extract_part(const _SimdWrapper<bool, _Np> __x)
+{
+ static_assert(_Combine == 1, "_Combine != 1 not implemented");
+ static_assert(__have_avx512f && _Np == _Np);
+ static_assert(_Total >= 2 && _Index + _Combine <= _Total && _Index >= 0);
+ return __x._M_data >> (_Index * _Np / _Total);
+}
+
+// }}}
+
+// __vector_convert {{{
+// implementation requires an index sequence
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+ static_cast<_Tp>(__c[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d,
+ index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+ static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+ index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+ static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+ static_cast<_Tp>(__e[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+ _From __f, index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+ static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+ static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+ _From __f, _From __g, index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+ static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+ static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+ static_cast<_Tp>(__g[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+ _From __f, _From __g, _From __h, index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+ static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+ static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+ static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+ _From __f, _From __g, _From __h, _From __i,
+ index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+ static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+ static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+ static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+ static_cast<_Tp>(__i[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+ _From __f, _From __g, _From __h, _From __i, _From __j,
+ index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+ static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+ static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+ static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+ static_cast<_Tp>(__i[_I])..., static_cast<_Tp>(__j[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+ _From __f, _From __g, _From __h, _From __i, _From __j,
+ _From __k, index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+ static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+ static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+ static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+ static_cast<_Tp>(__i[_I])..., static_cast<_Tp>(__j[_I])...,
+ static_cast<_Tp>(__k[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+ _From __f, _From __g, _From __h, _From __i, _From __j,
+ _From __k, _From __l, index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+ static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+ static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+ static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+ static_cast<_Tp>(__i[_I])..., static_cast<_Tp>(__j[_I])...,
+ static_cast<_Tp>(__k[_I])..., static_cast<_Tp>(__l[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+ _From __f, _From __g, _From __h, _From __i, _From __j,
+ _From __k, _From __l, _From __m, index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+ static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+ static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+ static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+ static_cast<_Tp>(__i[_I])..., static_cast<_Tp>(__j[_I])...,
+ static_cast<_Tp>(__k[_I])..., static_cast<_Tp>(__l[_I])...,
+ static_cast<_Tp>(__m[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+ _From __f, _From __g, _From __h, _From __i, _From __j,
+ _From __k, _From __l, _From __m, _From __n,
+ index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+ static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+ static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+ static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+ static_cast<_Tp>(__i[_I])..., static_cast<_Tp>(__j[_I])...,
+ static_cast<_Tp>(__k[_I])..., static_cast<_Tp>(__l[_I])...,
+ static_cast<_Tp>(__m[_I])..., static_cast<_Tp>(__n[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+ _From __f, _From __g, _From __h, _From __i, _From __j,
+ _From __k, _From __l, _From __m, _From __n, _From __o,
+ index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+ static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+ static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+ static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+ static_cast<_Tp>(__i[_I])..., static_cast<_Tp>(__j[_I])...,
+ static_cast<_Tp>(__k[_I])..., static_cast<_Tp>(__l[_I])...,
+ static_cast<_Tp>(__m[_I])..., static_cast<_Tp>(__n[_I])...,
+ static_cast<_Tp>(__o[_I])...};
+}
+
+template <typename _To, typename _From, size_t... _I>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From __a, _From __b, _From __c, _From __d, _From __e,
+ _From __f, _From __g, _From __h, _From __i, _From __j,
+ _From __k, _From __l, _From __m, _From __n, _From __o,
+ _From __p, index_sequence<_I...>)
+{
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _To{static_cast<_Tp>(__a[_I])..., static_cast<_Tp>(__b[_I])...,
+ static_cast<_Tp>(__c[_I])..., static_cast<_Tp>(__d[_I])...,
+ static_cast<_Tp>(__e[_I])..., static_cast<_Tp>(__f[_I])...,
+ static_cast<_Tp>(__g[_I])..., static_cast<_Tp>(__h[_I])...,
+ static_cast<_Tp>(__i[_I])..., static_cast<_Tp>(__j[_I])...,
+ static_cast<_Tp>(__k[_I])..., static_cast<_Tp>(__l[_I])...,
+ static_cast<_Tp>(__m[_I])..., static_cast<_Tp>(__n[_I])...,
+ static_cast<_Tp>(__o[_I])..., static_cast<_Tp>(__p[_I])...};
+}
+
+// Defer actual conversion to the overload that takes an index sequence. Note
+// that this function adds zeros or drops values off the end if you don't ensure
+// matching width.
+template <typename _To, typename... _From, typename _ToT = _VectorTraits<_To>,
+ typename _FromT = _VectorTraits<__first_of_pack_t<_From...>>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From... __xs)
+{
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR85048
+ if (!(... && __builtin_constant_p(__xs)))
+ {
+ if constexpr ((sizeof...(_From) & (sizeof...(_From) - 1))
+ == 0) // power-of-two number of arguments
+ return __convert_x86<_To>(__as_vector(__xs)...);
+ else
+ {
+ using _FF = __first_of_pack_t<_From...>;
+ return __vector_convert<_To>(__xs..., _FF{});
+ }
+ }
+ else
+#endif
+ return __vector_convert<_To>(
+ __xs...,
+ make_index_sequence<std::min(_ToT::_S_width, _FromT::_S_width)>());
+}
+
+// This overload takes a vectorizable type _To and produces a return type that
+// matches the width.
+template <typename _To, typename... _From,
+ typename = enable_if_t<__is_vectorizable_v<_To>>,
+ typename _FromT = _VectorTraits<__first_of_pack_t<_From...>>,
+ typename = int>
+_GLIBCXX_SIMD_INTRINSIC constexpr _To
+__vector_convert(_From... __xs)
+{
+ return __vector_convert<__vector_type_t<_To, _FromT::_S_width>>(__xs...);
+}
+
+// }}}
+// __convert function{{{
+template <typename _To, typename _From, typename... _More>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__convert(_From __v0, _More... __vs)
+{
+ if constexpr (__is_vectorizable_v<_From>)
+ {
+ static_assert((true && ... && is_same_v<_From, _More>) );
+ using _V = typename _VectorTraits<_To>::type;
+ using _Tp = typename _VectorTraits<_To>::value_type;
+ return _V{static_cast<_Tp>(__v0), static_cast<_Tp>(__vs)...};
+ }
+ else if constexpr (!__is_vector_type_v<_From>)
+ return __convert<_To>(__as_vector(__v0), __as_vector(__vs)...);
+ else
+ {
+ static_assert((true && ... && is_same_v<_From, _More>) );
+ if constexpr (__is_vectorizable_v<_To>)
+ return __convert<__vector_type_t<_To, (_VectorTraits<_From>::_S_width
+ * (1 + sizeof...(_More)))>>(
+ __v0, __vs...);
+ else if constexpr (!__is_vector_type_v<_To>)
+ return _To(__convert<typename _To::_BuiltinType>(__v0, __vs...));
+ else
+ {
+ static_assert(
+ sizeof...(_More) == 0
+ || _VectorTraits<_To>::_S_width
+ >= (1 + sizeof...(_More)) * _VectorTraits<_From>::_S_width,
+ "__convert(...) requires the input to fit into the output");
+ return __vector_convert<_To>(__v0, __vs...);
+ }
+ }
+}
+
+// }}}
+// __convert_all{{{
+// Converts __v into std::array<_To, N>, where N is _NParts if non-zero or
+// otherwise deduced from _To such that N * #elements(_To) <= #elements(__v).
+// Note: this function may return fewer than all converted elements
+template <typename _To,
+ size_t _NParts = 0, // allows converting fewer or more than all parts
+ // (more only with a partially filled last _To)
+ size_t _Offset = 0, // where to start, # of elements (not Bytes or
+ // Parts)
+ typename _From, typename _FromVT = _VectorTraits<_From>>
+_GLIBCXX_SIMD_INTRINSIC auto
+__convert_all(_From __v)
+{
+ if constexpr (std::is_arithmetic_v<_To> && _NParts != 1)
+ {
+ static_assert(_Offset < _FromVT::_S_width);
+ constexpr auto _Np
+ = _NParts == 0 ? _FromVT::_S_partial_width - _Offset : _NParts;
+ return __generate_from_n_evaluations<_Np, std::array<_To, _Np>>(
+ [&](auto __i) { return static_cast<_To>(__v[__i + _Offset]); });
+ }
+ else
+ {
+ static_assert(__is_vector_type_v<_To>);
+ using _ToVT = _VectorTraits<_To>;
+ if constexpr (__is_vector_type_v<_From>)
+ return __convert_all<_To, _NParts>(__as_wrapper(__v));
+ else if constexpr (_NParts == 1)
+ {
+ static_assert(_Offset % _ToVT::_S_width == 0);
+ return std::array<_To, 1>{__vector_convert<_To>(
+ __extract_part<_Offset / _ToVT::_S_width,
+ __div_roundup(_FromVT::_S_partial_width,
+ _ToVT::_S_width)>(__v))};
+ }
+#if _GLIBCXX_SIMD_X86INTRIN // {{{
+ else if constexpr (
+ !__have_sse4_1 && _Offset == 0
+ && is_integral_v<typename _FromVT::value_type>
+ && sizeof(typename _FromVT::value_type)
+ < sizeof(typename _ToVT::value_type)
+ && !(sizeof(typename _FromVT::value_type) == 4
+ && is_same_v<typename _ToVT::value_type, double>) )
+ {
+ using _ToT = typename _ToVT::value_type;
+ using _FromT = typename _FromVT::value_type;
+ constexpr size_t _Np
+ = _NParts != 0 ? _NParts
+ : (_FromVT::_S_partial_width / _ToVT::_S_width);
+ using _R = std::array<_To, _Np>;
+ // __adjust modifies its input to have _Np (use _SizeConstant) entries
+ // so that no unnecessary intermediate conversions are requested and,
+ // more importantly, no intermediate conversions are missing
+ [[maybe_unused]] auto __adjust
+ = [](auto __n,
+ auto __vv) -> _SimdWrapper<_FromT, decltype(__n)::value> {
+ return __vector_bitcast<_FromT, decltype(__n)::value>(__vv);
+ };
+ [[maybe_unused]] const auto __vi = __to_intrin(__v);
+ auto&& __make_array =
+ []<typename _ToConvert>(_ToConvert __x0,
+ [[maybe_unused]] _ToConvert __x1) {
+ if constexpr (_Np == 1)
+ return _R{__vector_bitcast<_ToT>(__x0)};
+ else
+ return _R{__vector_bitcast<_ToT>(__x0),
+ __vector_bitcast<_ToT>(__x1)};
+ };
+
+ if constexpr (_Np == 0)
+ return _R{};
+ else if constexpr (sizeof(_FromT) == 1 && sizeof(_ToT) == 2)
+ {
+ static_assert(std::is_integral_v<_FromT>);
+ static_assert(std::is_integral_v<_ToT>);
+ if constexpr (is_unsigned_v<_FromT>)
+ return __make_array(_mm_unpacklo_epi8(__vi, __m128i()),
+ _mm_unpackhi_epi8(__vi, __m128i()));
+ else
+ return __make_array(
+ _mm_srai_epi16(_mm_unpacklo_epi8(__vi, __vi), 8),
+ _mm_srai_epi16(_mm_unpackhi_epi8(__vi, __vi), 8));
+ }
+ else if constexpr (sizeof(_FromT) == 2 && sizeof(_ToT) == 4)
+ {
+ static_assert(std::is_integral_v<_FromT>);
+ if constexpr (is_floating_point_v<_ToT>)
+ {
+ const auto __ints
+ = __convert_all<__vector_type16_t<int>, _Np>(
+ __adjust(_SizeConstant<_Np * 4>(), __v));
+ return __generate_from_n_evaluations<_Np, _R>([&](auto __i) {
+ return __vector_convert<_To>(__ints[__i]);
+ });
+ }
+ else if constexpr (is_unsigned_v<_FromT>)
+ return __make_array(_mm_unpacklo_epi16(__vi, __m128i()),
+ _mm_unpackhi_epi16(__vi, __m128i()));
+ else
+ return __make_array(
+ _mm_srai_epi32(_mm_unpacklo_epi16(__vi, __vi), 16),
+ _mm_srai_epi32(_mm_unpackhi_epi16(__vi, __vi), 16));
+ }
+ else if constexpr (sizeof(_FromT) == 4 && sizeof(_ToT) == 8
+ && is_integral_v<_FromT> && is_integral_v<_ToT>)
+ {
+ if constexpr (is_unsigned_v<_FromT>)
+ return __make_array(_mm_unpacklo_epi32(__vi, __m128i()),
+ _mm_unpackhi_epi32(__vi, __m128i()));
+ else
+ return __make_array(
+ _mm_unpacklo_epi32(__vi, _mm_srai_epi32(__vi, 31)),
+ _mm_unpackhi_epi32(__vi, _mm_srai_epi32(__vi, 31)));
+ }
+ else if constexpr (sizeof(_FromT) == 4 && sizeof(_ToT) == 8
+ && is_integral_v<_FromT> && is_integral_v<_ToT>)
+ {
+ if constexpr (is_unsigned_v<_FromT>)
+ return __make_array(_mm_unpacklo_epi32(__vi, __m128i()),
+ _mm_unpackhi_epi32(__vi, __m128i()));
+ else
+ return __make_array(
+ _mm_unpacklo_epi32(__vi, _mm_srai_epi32(__vi, 31)),
+ _mm_unpackhi_epi32(__vi, _mm_srai_epi32(__vi, 31)));
+ }
+ else if constexpr (sizeof(_FromT) == 1 && sizeof(_ToT) >= 4
+ && is_signed_v<_FromT>)
+ {
+ const __m128i __vv[2] = {_mm_unpacklo_epi8(__vi, __vi),
+ _mm_unpackhi_epi8(__vi, __vi)};
+ const __vector_type16_t<int> __vvvv[4]
+ = {__vector_bitcast<int>(_mm_unpacklo_epi16(__vv[0], __vv[0])),
+ __vector_bitcast<int>(_mm_unpackhi_epi16(__vv[0], __vv[0])),
+ __vector_bitcast<int>(_mm_unpacklo_epi16(__vv[1], __vv[1])),
+ __vector_bitcast<int>(_mm_unpackhi_epi16(__vv[1], __vv[1]))};
+ if constexpr (sizeof(_ToT) == 4)
+ return __generate_from_n_evaluations<_Np, _R>([&](auto __i) {
+ return __vector_convert<_To>(__vvvv[__i] >> 24);
+ });
+ else if constexpr (is_integral_v<_ToT>)
+ return __generate_from_n_evaluations<_Np, _R>([&](auto __i) {
+ const auto __signbits = __to_intrin(__vvvv[__i / 2] >> 31);
+ const auto __sx32 = __to_intrin(__vvvv[__i / 2] >> 24);
+ return __vector_bitcast<_ToT>(
+ __i % 2 == 0 ? _mm_unpacklo_epi32(__sx32, __signbits)
+ : _mm_unpackhi_epi32(__sx32, __signbits));
+ });
+ else
+ return __generate_from_n_evaluations<_Np, _R>([&](auto __i) {
+ const auto __int4 = __vvvv[__i / 2] >> 24;
+ return __vector_convert<_To>(
+ __i % 2 == 0 ? __int4
+ : __vector_bitcast<int>(
+ _mm_unpackhi_epi64(__to_intrin(__int4),
+ __to_intrin(__int4))));
+ });
+ }
+ else if constexpr (sizeof(_FromT) == 1 && sizeof(_ToT) == 4)
+ {
+ const auto __shorts = __convert_all<__vector_type16_t<
+ conditional_t<is_signed_v<_FromT>, short, unsigned short>>>(
+ __adjust(_SizeConstant<(_Np + 1) / 2 * 8>(), __v));
+ return __generate_from_n_evaluations<_Np, _R>([&](auto __i) {
+ return __convert_all<_To>(__shorts[__i / 2])[__i % 2];
+ });
+ }
+ else if constexpr (sizeof(_FromT) == 2 && sizeof(_ToT) == 8
+ && is_signed_v<_FromT> && is_integral_v<_ToT>)
+ {
+ const __m128i __vv[2] = {_mm_unpacklo_epi16(__vi, __vi),
+ _mm_unpackhi_epi16(__vi, __vi)};
+ const __vector_type16_t<int> __vvvv[4]
+ = {__vector_bitcast<int>(
+ _mm_unpacklo_epi32(_mm_srai_epi32(__vv[0], 16),
+ _mm_srai_epi32(__vv[0], 31))),
+ __vector_bitcast<int>(
+ _mm_unpackhi_epi32(_mm_srai_epi32(__vv[0], 16),
+ _mm_srai_epi32(__vv[0], 31))),
+ __vector_bitcast<int>(
+ _mm_unpacklo_epi32(_mm_srai_epi32(__vv[1], 16),
+ _mm_srai_epi32(__vv[1], 31))),
+ __vector_bitcast<int>(
+ _mm_unpackhi_epi32(_mm_srai_epi32(__vv[1], 16),
+ _mm_srai_epi32(__vv[1], 31)))};
+ return __generate_from_n_evaluations<_Np, _R>(
+ [&](auto __i) { return __vector_bitcast<_ToT>(__vvvv[__i]); });
+ }
+ else if constexpr (sizeof(_FromT) <= 2 && sizeof(_ToT) == 8)
+ {
+ const auto __ints = __convert_all<__vector_type16_t<
+ conditional_t<is_signed_v<_FromT> || is_floating_point_v<_ToT>,
+ int, unsigned int>>>(
+ __adjust(_SizeConstant<(_Np + 1) / 2 * 4>(), __v));
+ return __generate_from_n_evaluations<_Np, _R>([&](auto __i) {
+ return __convert_all<_To>(__ints[__i / 2])[__i % 2];
+ });
+ }
+ else
+ __assert_unreachable<_To>();
+ }
+#endif // _GLIBCXX_SIMD_X86INTRIN }}}
+ else if constexpr ((_FromVT::_S_partial_width - _Offset)
+ > _ToVT::_S_width)
+ {
+ /*
+ static_assert(
+ (_FromVT::_S_partial_width & (_FromVT::_S_partial_width - 1)) == 0,
+ "__convert_all only supports power-of-2 number of elements.
+ Otherwise " "the return type cannot be std::array<_To, N>.");
+ */
+ constexpr size_t _NTotal
+ = (_FromVT::_S_partial_width - _Offset) / _ToVT::_S_width;
+ constexpr size_t _Np = _NParts == 0 ? _NTotal : _NParts;
+ static_assert(
+ _Np <= _NTotal
+ || (_Np == _NTotal + 1
+ && (_FromVT::_S_partial_width - _Offset) % _ToVT::_S_width
+ > 0));
+ using _R = std::array<_To, _Np>;
+ if constexpr (_Np == 1)
+ return _R{__vector_convert<_To>(
+ __as_vector(__extract_part<_Offset, _FromVT::_S_partial_width,
+ _ToVT::_S_width>(__v)))};
+ else
+ return __generate_from_n_evaluations<_Np, _R>([&](
+ auto __i) constexpr {
+ auto __part
+ = __extract_part<__i * _ToVT::_S_width + _Offset,
+ _FromVT::_S_partial_width, _ToVT::_S_width>(
+ __v);
+ return __vector_convert<_To>(__part);
+ });
+ }
+ else if constexpr (_Offset == 0)
+ return std::array<_To, 1>{__vector_convert<_To>(__as_vector(__v))};
+ else
+ return std::array<_To, 1>{__vector_convert<_To>(__as_vector(
+ __extract_part<_Offset, _FromVT::_S_partial_width,
+ _FromVT::_S_partial_width - _Offset>(__v)))};
+ }
+}
+
+// }}}
+
+// _GnuTraits {{{
+template <typename _Tp, typename _Mp, typename _Abi, size_t _Np>
+struct _GnuTraits
+{
+ using _IsValid = true_type;
+ using _SimdImpl = typename _Abi::_SimdImpl;
+ using _MaskImpl = typename _Abi::_MaskImpl;
+
+ // simd and simd_mask member types {{{
+ using _SimdMember = _SimdWrapper<_Tp, _Np>;
+ using _MaskMember = _SimdWrapper<_Mp, _Np>;
+ static constexpr size_t _S_simd_align = alignof(_SimdMember);
+ static constexpr size_t _S_mask_align = alignof(_MaskMember);
+
+ // }}}
+ // _SimdBase / base class for simd, providing extra conversions {{{
+ struct _SimdBase2
+ {
+ explicit operator __intrinsic_type_t<_Tp, _Np>() const
+ {
+ return __to_intrin(static_cast<const simd<_Tp, _Abi>*>(this)->_M_data);
+ }
+ explicit operator __vector_type_t<_Tp, _Np>() const
+ {
+ return static_cast<const simd<_Tp, _Abi>*>(this)->_M_data.__builtin();
+ }
+ };
+ struct _SimdBase1
+ {
+ explicit operator __intrinsic_type_t<_Tp, _Np>() const
+ {
+ return __data(*static_cast<const simd<_Tp, _Abi>*>(this));
+ }
+ };
+ using _SimdBase
+ = std::conditional_t<std::is_same<__intrinsic_type_t<_Tp, _Np>,
+ __vector_type_t<_Tp, _Np>>::value,
+ _SimdBase1, _SimdBase2>;
+
+ // }}}
+ // _MaskBase {{{
+ struct _MaskBase2
+ {
+ explicit operator __intrinsic_type_t<_Tp, _Np>() const
+ {
+ return static_cast<const simd_mask<_Tp, _Abi>*>(this)->_M_data.__intrin();
+ }
+ explicit operator __vector_type_t<_Tp, _Np>() const
+ {
+ return static_cast<const simd_mask<_Tp, _Abi>*>(this)->_M_data._M_data;
+ }
+ };
+ struct _MaskBase1
+ {
+ explicit operator __intrinsic_type_t<_Tp, _Np>() const
+ {
+ return __data(*static_cast<const simd_mask<_Tp, _Abi>*>(this));
+ }
+ };
+ using _MaskBase
+ = std::conditional_t<std::is_same<__intrinsic_type_t<_Tp, _Np>,
+ __vector_type_t<_Tp, _Np>>::value,
+ _MaskBase1, _MaskBase2>;
+
+ // }}}
+ // _MaskCastType {{{
+ // parameter type of one explicit simd_mask constructor
+ class _MaskCastType
+ {
+ using _Up = __intrinsic_type_t<_Tp, _Np>;
+ _Up _M_data;
+
+ public:
+ _MaskCastType(_Up __x) : _M_data(__x) {}
+ operator _MaskMember() const { return _M_data; }
+ };
+
+ // }}}
+ // _SimdCastType {{{
+ // parameter type of one explicit simd constructor
+ class _SimdCastType1
+ {
+ using _Ap = __intrinsic_type_t<_Tp, _Np>;
+ _SimdMember _M_data;
+
+ public:
+ _SimdCastType1(_Ap __a) : _M_data(__vector_bitcast<_Tp>(__a)) {}
+ operator _SimdMember() const { return _M_data; }
+ };
+
+ class _SimdCastType2
+ {
+ using _Ap = __intrinsic_type_t<_Tp, _Np>;
+ using _B = __vector_type_t<_Tp, _Np>;
+ _SimdMember _M_data;
+
+ public:
+ _SimdCastType2(_Ap __a) : _M_data(__vector_bitcast<_Tp>(__a)) {}
+ _SimdCastType2(_B __b) : _M_data(__b) {}
+ operator _SimdMember() const { return _M_data; }
+ };
+
+ using _SimdCastType
+ = std::conditional_t<std::is_same<__intrinsic_type_t<_Tp, _Np>,
+ __vector_type_t<_Tp, _Np>>::value,
+ _SimdCastType1, _SimdCastType2>;
+ //}}}
+};
+
+// }}}
+struct _CommonImplX86;
+struct _CommonImplNeon;
+struct _CommonImplBuiltin;
+template <typename _Abi> struct _SimdImplBuiltin;
+template <typename _Abi> struct _MaskImplBuiltin;
+template <typename _Abi> struct _SimdImplX86;
+template <typename _Abi> struct _MaskImplX86;
+template <typename _Abi> struct _SimdImplNeon;
+template <typename _Abi> struct _MaskImplNeon;
+// simd_abi::_VecBuiltin {{{
+template <int _UsedBytes> struct simd_abi::_VecBuiltin
+{
+ template <typename _Tp>
+ static constexpr size_t size = _UsedBytes / sizeof(_Tp);
+ template <typename _Tp>
+ static constexpr size_t _S_full_size
+ = sizeof(__vector_type_t<_Tp, size<_Tp>>) / sizeof(_Tp);
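+ // Worked example of the power-of-2 test below: _UsedBytes = 12 gives
+ // 12 & 11 == 8 (nonzero, so the ABI is partial), whereas _UsedBytes = 16
+ // gives 16 & 15 == 0 (not partial).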
+ static constexpr bool _S_is_partial = (_UsedBytes & (_UsedBytes - 1)) != 0;
+
+ // validity traits {{{
+ struct _IsValidAbiTag : __bool_constant<(_UsedBytes > 1)>
+ {
+ };
+
+ template <typename _Tp>
+ struct _IsValidSizeFor
+ : std::conjunction<
+ __bool_constant<(_UsedBytes / sizeof(_Tp) > 1
+ && _UsedBytes % sizeof(_Tp) == 0)>,
+ __bool_constant<(_UsedBytes <= __vectorized_sizeof<_Tp>())>>
+ {
+ };
+ template <typename _Tp>
+ struct _IsValid : std::conjunction<_IsValidAbiTag, __is_vectorizable<_Tp>,
+ _IsValidSizeFor<_Tp>>
+ {
+ };
+ template <typename _Tp>
+ static constexpr bool _S_is_valid_v = _IsValid<_Tp>::value;
+
+ // }}}
+ // _SimdImpl/_MaskImpl {{{
+#if _GLIBCXX_SIMD_X86INTRIN
+ using _CommonImpl = _CommonImplX86;
+ using _SimdImpl = _SimdImplX86<_VecBuiltin<_UsedBytes>>;
+ using _MaskImpl = _MaskImplX86<_VecBuiltin<_UsedBytes>>;
+#elif _GLIBCXX_SIMD_HAVE_NEON
+ using _CommonImpl = _CommonImplNeon;
+ using _SimdImpl = _SimdImplNeon<_VecBuiltin<_UsedBytes>>;
+ using _MaskImpl = _MaskImplNeon<_VecBuiltin<_UsedBytes>>;
+#else
+ using _CommonImpl = _CommonImplBuiltin;
+ using _SimdImpl = _SimdImplBuiltin<_VecBuiltin<_UsedBytes>>;
+ using _MaskImpl = _MaskImplBuiltin<_VecBuiltin<_UsedBytes>>;
+#endif
+
+ // }}}
+ // __traits {{{
+ template <typename _Tp>
+ using __traits = std::conditional_t<
+ _S_is_valid_v<_Tp>,
+ _GnuTraits<_Tp, _Tp, _VecBuiltin<_UsedBytes>, size<_Tp>>, _InvalidTraits>;
+ //}}}
+ // implicit masks {{{
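+ // For example, with _Tp = float and _UsedBytes = 12 (assuming
+ // _S_full_size<float> == 4), __implicit_mask yields the integer lanes
+ // {-1, -1, -1, 0} reinterpreted as a float vector: all bits set in the
+ // three used elements, none in the padding element.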
+ template <typename _Tp>
+ static constexpr _SimdWrapper<_Tp, size<_Tp>> __implicit_mask()
+ {
+ constexpr auto __size = _S_full_size<_Tp>;
+ using _ImplicitMask = __vector_type_t<__int_for_sizeof_t<_Tp>, __size>;
+ return reinterpret_cast<__vector_type_t<_Tp, __size>>(
+ !_S_is_partial ? ~_ImplicitMask()
+ : __generate_vector<_ImplicitMask>([](auto __i) constexpr {
+ return __i < _UsedBytes / sizeof(_Tp) ? -1 : 0;
+ }));
+ }
+
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ static constexpr _Tp __masked(_Tp __x)
+ {
+ using _Up = typename _TVT::value_type;
+ if constexpr (_S_is_partial)
+ return __and(__as_vector(__x), __implicit_mask<_Up>()._M_data);
+ else
+ return __x;
+ }
+
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ static constexpr auto __make_padding_nonzero(_Tp __x)
+ {
+ if constexpr (!_S_is_partial)
+ return __x;
+ else
+ {
+ using _Up = typename _TVT::value_type;
+ if constexpr (std::is_integral_v<_Up>)
+ return __or(__x, ~__implicit_mask<_Up>()._M_data);
+ else
+ {
+ constexpr auto __one
+ = __andnot(__implicit_mask<_Up>()._M_data,
+ __vector_broadcast<_S_full_size<_Up>>(_Up(1)));
+ return __or(__x, __one);
+ }
+ }
+ }
+ // }}}
+};
+
+// }}}
+// simd_abi::_VecBltnBtmsk {{{
+template <int _UsedBytes> struct simd_abi::_VecBltnBtmsk
+{
+ template <typename _Tp>
+ static constexpr size_t size = _UsedBytes / sizeof(_Tp);
+ template <typename _Tp>
+ static constexpr size_t _S_full_size
+ = sizeof(__vector_type_t<_Tp, size<_Tp>>) / sizeof(_Tp);
+ static constexpr bool _S_is_partial = (_UsedBytes & (_UsedBytes - 1)) != 0;
+
+ // validity traits {{{
+ struct _IsValidAbiTag : __bool_constant<(_UsedBytes > 1)>
+ {
+ };
+ template <typename _Tp>
+ struct _IsValidSizeFor
+ : __bool_constant<(_UsedBytes / sizeof(_Tp) > 1
+ && _UsedBytes % sizeof(_Tp) == 0 && _UsedBytes <= 64
+ && (_UsedBytes > 32 || __have_avx512vl))>
+ {
+ };
+ // Bitmasks require at least AVX512F. If sizeof(_Tp) < 4, AVX512BW is also
+ // required.
+ template <typename _Tp>
+ struct _IsValid
+ : conjunction<_IsValidAbiTag, __bool_constant<__have_avx512f>,
+ __bool_constant<__have_avx512bw || (sizeof(_Tp) >= 4)>,
+ __bool_constant<(__vectorized_sizeof<_Tp>() > sizeof(_Tp))>,
+ _IsValidSizeFor<_Tp>>
+ {
+ };
+ template <typename _Tp>
+ static constexpr bool _S_is_valid_v = _IsValid<_Tp>::value;
+
+ // }}}
+ // implicit mask {{{
+private:
+ template <typename _Tp> using _ImplicitMask = _SimdWrapper<bool, size<_Tp>>;
+
+public:
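+ // Worked example for __implicit_mask_n, assuming an 8-bit storage type:
+ // _Np = 3 yields (1ULL << 3) - 1 == 0b0000'0111, i.e. the low three bits
+ // are set. For _Np = 8 the condition of ?: is false and ~_Tp() sets all
+ // bits. With a 64-bit storage type the same branch keeps _Np = 64 from
+ // evaluating 1ULL << 64, which would be undefined.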
+ template <size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr __bool_storage_member_type_t<_Np>
+ __implicit_mask_n()
+ {
+ using _Tp = __bool_storage_member_type_t<_Np>;
+ return _Np < sizeof(_Tp) * CHAR_BIT ? _Tp((1ULL << _Np) - 1) : ~_Tp();
+ }
+
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _ImplicitMask<_Tp> __implicit_mask()
+ {
+ return __implicit_mask_n<size<_Tp>>();
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __masked(_SimdWrapper<_Tp, _Np> __x)
+ {
+ if constexpr (is_same_v<_Tp, bool>)
+ if constexpr (_S_is_partial || _Np < 8)
+ return _MaskImpl::__bit_and(__x, _SimdWrapper<_Tp, _Np>(
+ __bool_storage_member_type_t<_Np>(
+ (1ULL << _Np) - 1)));
+ else
+ return __x;
+ else
+ return __masked(__x._M_data);
+ }
+
+ template <typename _TV>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _TV __masked(_TV __x)
+ {
+ static_assert(
+ !__is_bitmask_v<_TV>,
+ "_VecBltnBtmsk::__masked cannot work on bitmasks, since it doesn't "
+ "know the number of elements. Use _SimdWrapper<bool, N> instead.");
+ if constexpr (_S_is_partial)
+ {
+ using _Tp = typename _VectorTraits<_TV>::value_type;
+ constexpr size_t _Np = size<_Tp>;
+ return __make_dependent_t<_TV, _CommonImpl>::_S_blend(
+ __implicit_mask<_Tp>(), _SimdWrapper<_Tp, _Np>(),
+ _SimdWrapper<_Tp, _Np>(__x));
+ }
+ else
+ return __x;
+ }
+
+ template <typename _TV, typename _TVT = _VectorTraits<_TV>>
+ static constexpr auto __make_padding_nonzero(_TV __x)
+ {
+ if constexpr (!_S_is_partial)
+ return __x;
+ else
+ {
+ using _Tp = typename _TVT::value_type;
+ constexpr size_t _Np = size<_Tp>;
+ if constexpr (is_integral_v<typename _TVT::value_type>)
+ return __x
+ | __generate_vector<_Tp, _S_full_size<_Tp>>(
+ [](auto __i) -> _Tp {
+ if (__i < _Np)
+ return 0;
+ else
+ return 1;
+ });
+ else
+ return __make_dependent_t<_TV, _CommonImpl>::_S_blend(
+ __implicit_mask<_Tp>(),
+ _SimdWrapper<_Tp, _Np>(
+ __vector_broadcast<_S_full_size<_Tp>>(_Tp(1))),
+ _SimdWrapper<_Tp, _Np>(__x))
+ ._M_data;
+ }
+ }
+
+ // }}}
+ // _SimdImpl/_MaskImpl {{{
+#if _GLIBCXX_SIMD_X86INTRIN
+ using _CommonImpl = _CommonImplX86;
+ using _SimdImpl = _SimdImplX86<_VecBltnBtmsk<_UsedBytes>>;
+ using _MaskImpl = _MaskImplX86<_VecBltnBtmsk<_UsedBytes>>;
+#else
+ template <int> struct _MissingImpl;
+ using _CommonImpl = _MissingImpl<_UsedBytes>;
+ using _SimdImpl = _MissingImpl<_UsedBytes>;
+ using _MaskImpl = _MissingImpl<_UsedBytes>;
+#endif
+
+ // }}}
+ // __traits {{{
+ template <typename _Tp>
+ using __traits = std::conditional_t<
+ _S_is_valid_v<_Tp>,
+ _GnuTraits<_Tp, bool, _VecBltnBtmsk<_UsedBytes>, size<_Tp>>,
+ _InvalidTraits>;
+ //}}}
+};
+
+//}}}
+// _CommonImplBuiltin {{{
+struct _CommonImplBuiltin
+{
+ // __converts_via_decomposition{{{
+ // This lists all cases where a __vector_convert needs to fall back to
+ // conversion of individual scalars (i.e. decompose the input vector into
+ // scalars, convert, compose output vector). In those cases, __masked_load &
+ // __masked_store prefer to use the __bit_iteration implementation.
+ template <typename _From, typename _To, size_t _ToSize>
+ static inline constexpr bool __converts_via_decomposition_v
+ = sizeof(_From) != sizeof(_To);
+
+ // }}}
+ // _S_load{{{
+ template <typename _Tp, size_t _Np, size_t _M = _Np * sizeof(_Tp),
+ typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static __vector_type_t<_Tp, _Np>
+ _S_load(const void* __p, _Fp)
+ {
+ static_assert(_Np > 1);
+ static_assert(_M % sizeof(_Tp) == 0);
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR90424
+ using _Up = conditional_t<
+ is_integral_v<_Tp>,
+ conditional_t<_M % 4 == 0, conditional_t<_M % 8 == 0, long long, int>,
+ conditional_t<_M % 2 == 0, short, signed char>>,
+ conditional_t<(_M < 8 || _Np % 2 == 1 || _Np == 2), _Tp, double>>;
+ using _V = __vector_type_t<_Up, _Np * sizeof(_Tp) / sizeof(_Up)>;
+#else // _GLIBCXX_SIMD_WORKAROUND_PR90424
+ using _V = __vector_type_t<_Tp, _Np>;
+#endif // _GLIBCXX_SIMD_WORKAROUND_PR90424
+ _V __r{};
+ static_assert(_M <= sizeof(_V));
+ if constexpr (std::is_same_v<_Fp, vector_aligned_tag>)
+ __p = __builtin_assume_aligned(__p, alignof(__vector_type_t<_Tp, _Np>));
+ else if constexpr (!std::is_same_v<_Fp, element_aligned_tag>)
+ __p = __builtin_assume_aligned(__p, _Fp::_S_alignment);
+
+ __builtin_memcpy(&__r, __p, _M);
+ return reinterpret_cast<__vector_type_t<_Tp, _Np>>(__r);
+ }
+
+ // }}}
+ // __store {{{
+ template <size_t _ReqBytes = 0, typename _Flags, typename _TV>
+ _GLIBCXX_SIMD_INTRINSIC static void __store(_TV __x, void* __addr, _Flags)
+ {
+ constexpr size_t _Bytes = _ReqBytes == 0 ? sizeof(__x) : _ReqBytes;
+ static_assert(sizeof(__x) >= _Bytes);
+
+ if constexpr (std::is_same_v<_Flags, vector_aligned_tag>)
+ __addr = __builtin_assume_aligned(__addr, alignof(_TV));
+ else if constexpr (!std::is_same_v<_Flags, element_aligned_tag>)
+ __addr = __builtin_assume_aligned(__addr, _Flags::_S_alignment);
+
+ if constexpr (__is_vector_type_v<_TV>)
+ {
+ using _Tp = typename _VectorTraits<_TV>::value_type;
+ constexpr size_t _Np = _Bytes / sizeof(_Tp);
+ static_assert(_Np * sizeof(_Tp) == _Bytes);
+
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR90424
+ using _Up = std::conditional_t<
+ (std::is_integral_v<_Tp> || _Bytes < 4),
+ std::conditional_t<(sizeof(__x) > sizeof(long long)), long long, _Tp>,
+ float>;
+ const auto __v = __vector_bitcast<_Up>(__x);
+#else // _GLIBCXX_SIMD_WORKAROUND_PR90424
+ const __vector_type_t<_Tp, _Np> __v = __x;
+#endif // _GLIBCXX_SIMD_WORKAROUND_PR90424
+
+ if constexpr ((_Bytes & (_Bytes - 1)) != 0)
+ {
+ constexpr size_t _MoreBytes = __next_power_of_2(_Bytes);
+ alignas(decltype(__v)) char __tmp[_MoreBytes];
+ __builtin_memcpy(__tmp, &__v, _MoreBytes);
+ __builtin_memcpy(__addr, __tmp, _Bytes);
+ }
+ else
+ __builtin_memcpy(__addr, &__v, _Bytes);
+ }
+ else
+ __builtin_memcpy(__addr, &__x, _Bytes);
+ }
+
+ template <typename _Flags, typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static void __store(_SimdWrapper<_Tp, _Np> __x,
+ void* __addr, _Flags)
+ {
+ __store<_Np * sizeof(_Tp)>(__x._M_data, __addr, _Flags());
+ }
+
+ // }}}
+ // __store_bool_array(_BitMask) {{{
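+ // Worked example of the multiply-and-mask trick used below (bit i of the
+ // mask becomes byte i of the result): for _Np == 2 and bits 0b10,
+ // 0b10 * 0x81 == 0x102 and 0x102 & 0x0101 == 0x0100. For four bits 0b1010,
+ // 0b1010 * 0x204081 == 0x142850a and 0x142850a & 0x01010101 == 0x01000100.
+ // On a little-endian target these store as {false, true} and
+ // {false, true, false, true}.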
+ template <size_t _Np, typename _Flags, bool _Sanitized>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr void
+ __store_bool_array(_BitMask<_Np, _Sanitized> __x, bool* __mem, _Flags)
+ {
+ if constexpr (_Np == 1)
+ __mem[0] = __x[0];
+ else if constexpr (_Np == 2)
+ {
+ short __bool2 = (__x._M_to_bits() * 0x81) & 0x0101;
+ __store<_Np>(__bool2, __mem, _Flags());
+ }
+ else if constexpr (_Np == 3)
+ {
+ int __bool3 = (__x._M_to_bits() * 0x4081) & 0x010101;
+ __store<_Np>(__bool3, __mem, _Flags());
+ }
+ else
+ {
+ __execute_n_times<__div_roundup(_Np, 4)>([&](auto __i) {
+ constexpr int __offset = __i * 4;
+ constexpr int __remaining = _Np - __offset;
+ if constexpr (__remaining > 4 && __remaining <= 7)
+ {
+ const _ULLong __bool7
+ = (__x.template _M_extract<__offset>()._M_to_bits()
+ * 0x40810204081ULL)
+ & 0x0101010101010101ULL;
+ __store<__remaining>(__bool7, __mem + __offset, _Flags());
+ }
+ else if constexpr (__remaining >= 4)
+ {
+ int __bits = __x.template _M_extract<__offset>()._M_to_bits();
+ if constexpr (__remaining > 7)
+ __bits &= 0xf;
+ const int __bool4 = (__bits * 0x204081) & 0x01010101;
+ __store<4>(__bool4, __mem + __offset, _Flags());
+ }
+ });
+ }
+ }
+
+ // }}}
+ // _S_blend{{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr auto
+ _S_blend(_SimdWrapper<_Tp, _Np> __k, _SimdWrapper<_Tp, _Np> __at0,
+ _SimdWrapper<_Tp, _Np> __at1)
+ {
+ return __vector_bitcast<__int_for_sizeof_t<_Tp>>(__k) ? __at1._M_data
+ : __at0._M_data;
+ }
+
+ // }}}
+};
+
+// }}}
+// _SimdImplBuiltin {{{1
+template <typename _Abi> struct _SimdImplBuiltin
+{
+ // member types {{{2
+ template <typename _Tp> static constexpr size_t _S_max_store_size = 16;
+ using abi_type = _Abi;
+ template <typename _Tp> using _TypeTag = _Tp*;
+ template <typename _Tp>
+ using _SimdMember = typename _Abi::template __traits<_Tp>::_SimdMember;
+ template <typename _Tp>
+ using _MaskMember = typename _Abi::template __traits<_Tp>::_MaskMember;
+ template <typename _Tp>
+ static constexpr size_t _S_size = _Abi::template size<_Tp>;
+ template <typename _Tp>
+ static constexpr size_t _S_full_size = _Abi::template _S_full_size<_Tp>;
+ using _CommonImpl = typename _Abi::_CommonImpl;
+ using _SuperImpl = typename _Abi::_SimdImpl;
+ using _MaskImpl = typename _Abi::_MaskImpl;
+
+ // __make_simd(_SimdWrapper/__intrinsic_type_t) {{{2
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static simd<_Tp, _Abi>
+ __make_simd(_SimdWrapper<_Tp, _Np> __x)
+ {
+ return {__private_init, __x};
+ }
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static simd<_Tp, _Abi>
+ __make_simd(__intrinsic_type_t<_Tp, _Np> __x)
+ {
+ return {__private_init, __vector_bitcast<_Tp>(__x)};
+ }
+
+ // __broadcast {{{2
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdMember<_Tp>
+ __broadcast(_Tp __x) noexcept
+ {
+ return __vector_broadcast<_S_full_size<_Tp>>(__x);
+ }
+
+ // __generator {{{2
+ template <typename _Fp, typename _Tp>
+ inline static constexpr _SimdMember<_Tp> __generator(_Fp&& __gen,
+ _TypeTag<_Tp>)
+ {
+ return __generate_vector<_Tp, _S_full_size<_Tp>>([&](auto __i) constexpr {
+ if constexpr (__i < _S_size<_Tp>)
+ return __gen(__i);
+ else
+ return 0;
+ });
+ }
+
+ // __load {{{2
+ template <typename _Tp, typename _Up, typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdMember<_Tp> __load(const _Up* __mem, _Fp,
+ _TypeTag<_Tp>) noexcept
+ {
+ constexpr size_t _Np = _S_size<_Tp>;
+ constexpr size_t __max_load_size
+ = (sizeof(_Up) >= 4 && __have_avx512f) || __have_avx512bw
+ ? 64
+ : (std::is_floating_point_v<_Up> && __have_avx) || __have_avx2 ? 32
+ : 16;
+ constexpr size_t __bytes_to_load = sizeof(_Up) * _Np;
+ if constexpr (sizeof(_Up) > 8)
+ return __generate_vector<_Tp, _SimdMember<_Tp>::_S_width>([&](
+ auto __i) constexpr {
+ return static_cast<_Tp>(__i < _Np ? __mem[__i] : 0);
+ });
+ else if constexpr (std::is_same_v<_Up, _Tp>)
+ return _CommonImpl::template _S_load<_Tp, _S_full_size<_Tp>,
+ _Np * sizeof(_Tp)>(__mem, _Fp());
+ else if constexpr (__bytes_to_load <= __max_load_size)
+ return __convert<_SimdMember<_Tp>>(
+ _CommonImpl::template _S_load<_Up, _Np>(__mem, _Fp()));
+ else if constexpr (__bytes_to_load % __max_load_size == 0)
+ {
+ constexpr size_t __n_loads = __bytes_to_load / __max_load_size;
+ constexpr size_t __elements_per_load = _Np / __n_loads;
+ return __call_with_n_evaluations<__n_loads>(
+ [](auto... __uncvted) {
+ return __convert<_SimdMember<_Tp>>(__uncvted...);
+ },
+ [&](auto __i) {
+ return _CommonImpl::template _S_load<_Up, __elements_per_load>(
+ __mem + __i * __elements_per_load, _Fp());
+ });
+ }
+ else if constexpr (__bytes_to_load % (__max_load_size / 2) == 0
+ && __max_load_size > 16)
+ { // e.g. int[] -> <char, 12> with AVX2
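+ // With AVX2 (and no AVX-512) __max_load_size is 32 and, assuming
+ // sizeof(int) == 4, __bytes_to_load is 12 * 4 == 48, so this branch
+ // performs 48 / 16 == 3 loads of 12 / 3 == 4 elements each.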
+ constexpr size_t __n_loads = __bytes_to_load / (__max_load_size / 2);
+ constexpr size_t __elements_per_load = _Np / __n_loads;
+ return __call_with_n_evaluations<__n_loads>(
+ [](auto... __uncvted) {
+ return __convert<_SimdMember<_Tp>>(__uncvted...);
+ },
+ [&](auto __i) {
+ return _CommonImpl::template _S_load<_Up, __elements_per_load>(
+ __mem + __i * __elements_per_load, _Fp());
+ });
+ }
+ else // e.g. int[] -> <char, 9>
+ return __call_with_subscripts(
+ __mem, make_index_sequence<_Np>(), [](auto... __args) {
+ return __vector_type_t<_Tp, _S_full_size<_Tp>>{
+ static_cast<_Tp>(__args)...};
+ });
+ }
+
+ // __masked_load {{{2
+ template <typename _Tp, size_t _Np, typename _Up, typename _Fp>
+ static inline _SimdWrapper<_Tp, _Np>
+ __masked_load(_SimdWrapper<_Tp, _Np> __merge, _MaskMember<_Tp> __k,
+ const _Up* __mem, _Fp) noexcept
+ {
+ _BitOps::__bit_iteration(_MaskImpl::__to_bits(__k), [&](auto __i) {
+ __merge.__set(__i, static_cast<_Tp>(__mem[__i]));
+ });
+ return __merge;
+ }
+
+ // __store {{{2
+ template <typename _Tp, typename _Up, typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static void __store(_SimdMember<_Tp> __v, _Up* __mem,
+ _Fp, _TypeTag<_Tp>) noexcept
+ {
+ // TODO: converting int -> "smaller int" can be optimized with AVX512
+ constexpr size_t _Np = _S_size<_Tp>;
+ constexpr size_t __max_store_size
+ = _SuperImpl::template _S_max_store_size<_Up>;
+ if constexpr (sizeof(_Up) > 8)
+ __execute_n_times<_Np>([&](auto __i) constexpr {
+ __mem[__i] = __v[__i];
+ });
+ else if constexpr (std::is_same_v<_Up, _Tp>)
+ _CommonImpl::__store(__v, __mem, _Fp());
+ else if constexpr (sizeof(_Up) * _Np <= __max_store_size)
+ _CommonImpl::__store(_SimdWrapper<_Up, _Np>(__convert<_Up>(__v)), __mem,
+ _Fp());
+ else
+ {
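+ // Worked example, assuming __max_store_size == 16: storing _Np == 6
+ // elements as 4-byte _Up gives __vsize == 4, __stores == 2 and
+ // __full_stores == 1; the final partial store writes
+ // (6 - 1 * 4) * 4 == 8 bytes.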
+ constexpr size_t __vsize = __max_store_size / sizeof(_Up);
+ // round up to convert the last partial vector as well:
+ constexpr size_t __stores = __div_roundup(_Np, __vsize);
+ constexpr size_t __full_stores = _Np / __vsize;
+ using _V = __vector_type_t<_Up, __vsize>;
+ const std::array<_V, __stores> __converted
+ = __convert_all<_V, __stores>(__v);
+ __execute_n_times<__full_stores>([&](auto __i) constexpr {
+ _CommonImpl::__store(__converted[__i], __mem + __i * __vsize, _Fp());
+ });
+ if constexpr (__full_stores < __stores)
+ _CommonImpl::template __store<(_Np - __full_stores * __vsize)
+ * sizeof(_Up)>(
+ __converted[__full_stores], __mem + __full_stores * __vsize, _Fp());
+ }
+ }
+
+ // __masked_store_nocvt {{{2
+ template <typename _Tp, std::size_t _Np, typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_store_nocvt(_SimdWrapper<_Tp, _Np> __v, _Tp* __mem, _Fp,
+ _SimdWrapper<_Tp, _Np> __k)
+ {
+ _BitOps::__bit_iteration(
+ _MaskImpl::__to_bits(__k), [&](auto __i) constexpr {
+ __mem[__i] = __v[__i];
+ });
+ }
+
+ // __masked_store {{{2
+ template <typename _TW, typename _TVT = _VectorTraits<_TW>,
+ typename _Tp = typename _TVT::value_type, typename _Up,
+ typename _Fp>
+ static inline void __masked_store(const _TW __v, _Up* __mem, _Fp,
+ const _MaskMember<_Tp> __k) noexcept
+ {
+ constexpr size_t _TV_size = _S_size<_Tp>;
+ [[maybe_unused]] const auto __vi = __to_intrin(__v);
+ constexpr size_t __max_store_size
+ = _SuperImpl::template _S_max_store_size<_Up>;
+ if constexpr (
+ std::is_same_v<
+ _Tp,
+ _Up> || (std::is_integral_v<_Tp> && std::is_integral_v<_Up> && sizeof(_Tp) == sizeof(_Up)))
+ {
+ // bitwise or no conversion, reinterpret:
+ const auto __kk = [&]() {
+ if constexpr (__is_bitmask_v<decltype(__k)>)
+ return _MaskMember<_Up>(__k._M_data);
+ else
+ return __wrapper_bitcast<_Up>(__k);
+ }();
+ _SuperImpl::__masked_store_nocvt(__wrapper_bitcast<_Up>(__v), __mem,
+ _Fp(), __kk);
+ }
+ else if constexpr (__vectorized_sizeof<_Up>() > sizeof(_Up)
+ && !_CommonImpl::template __converts_via_decomposition_v<
+ _Tp, _Up, __max_store_size>)
+ { // conversion via decomposition is better handled via the bit_iteration
+ // fallback below
+ constexpr size_t _UW_size
+ = std::min(_TV_size, __max_store_size / sizeof(_Up));
+ static_assert(_UW_size <= _TV_size);
+ using _UW = _SimdWrapper<_Up, _UW_size>;
+ using _UV = __vector_type_t<_Up, _UW_size>;
+ using _UAbi = simd_abi::deduce_t<_Up, _UW_size>;
+ if constexpr (_UW_size == _TV_size) // one convert+store
+ {
+ const _UW __converted = __convert<_UW>(__v);
+ _SuperImpl::__masked_store_nocvt(
+ __converted, __mem, _Fp(),
+ _UAbi::_MaskImpl::template __convert<_Up>(__k));
+ }
+ else
+ {
+ static_assert(_UW_size * sizeof(_Up) == __max_store_size);
+ constexpr size_t _NFullStores = _TV_size / _UW_size;
+ constexpr size_t _NAllStores = __div_roundup(_TV_size, _UW_size);
+ constexpr size_t _NParts = _S_full_size<_Tp> / _UW_size;
+ const std::array<_UV, _NAllStores> __converted
+ = __convert_all<_UV, _NAllStores>(__v);
+ __execute_n_times<_NFullStores>([&](auto __i) {
+ _SuperImpl::__masked_store_nocvt(
+ _UW(__converted[__i]), __mem + __i * _UW_size, _Fp(),
+ _UAbi::_MaskImpl::template __convert<_Up>(
+ __extract_part<__i, _NParts>(__k.__as_full_vector())));
+ });
+ if constexpr (_NAllStores > _NFullStores) // one partial at the end
+ _SuperImpl::__masked_store_nocvt(
+ _UW(__converted[_NFullStores]), __mem + _NFullStores * _UW_size,
+ _Fp(),
+ _UAbi::_MaskImpl::template __convert<_Up>(
+ __extract_part<_NFullStores, _NParts>(
+ __k.__as_full_vector())));
+ }
+ }
+ else
+ _BitOps::__bit_iteration(
+ _MaskImpl::__to_bits(__k), [&](auto __i) constexpr {
+ __mem[__i] = static_cast<_Up>(__v[__i]);
+ });
+ }
+
+ // __complement {{{2
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __complement(_SimdWrapper<_Tp, _Np> __x) noexcept
+ {
+ return ~__x._M_data;
+ }
+
+ // __unary_minus {{{2
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __unary_minus(_SimdWrapper<_Tp, _Np> __x) noexcept
+ {
+ // GCC doesn't use the psign instructions, but pxor & psub seem to be just
+ // as good a choice as pcmpeqd & psign. So meh.
+ return -__x._M_data;
+ }
+
+ // arithmetic operators {{{2
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __plus(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ return __x._M_data + __y._M_data;
+ }
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __minus(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ return __x._M_data - __y._M_data;
+ }
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __multiplies(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ return __x._M_data * __y._M_data;
+ }
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __divides(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ // Division by 0 is always UB, so for partial registers the padding
+ // elements of the divisor must be made nonzero before dividing.
+ if constexpr (!_Abi::_S_is_partial)
+ return __x._M_data / __y._M_data;
+ else
+ return __as_vector(__x) / _Abi::__make_padding_nonzero(__as_vector(__y));
+ }
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __modulus(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ if constexpr (!_Abi::_S_is_partial)
+ return __x._M_data % __y._M_data;
+ else
+ return __as_vector(__x) % _Abi::__make_padding_nonzero(__as_vector(__y));
+ }
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __bit_and(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ return __and(__x._M_data, __y._M_data);
+ }
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __bit_or(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ return __or(__x._M_data, __y._M_data);
+ }
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __bit_xor(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ return __xor(__x._M_data, __y._M_data);
+ }
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+ __bit_shift_left(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ return __x._M_data << __y._M_data;
+ }
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+ __bit_shift_right(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+#ifdef _GLIBCXX_SIMD_WORKAROUND_XXX_5
+ if constexpr (sizeof(_Tp) == 8)
+ return __generate_vector<__vector_type_t<_Tp, _Np>>([&](auto __i) {
+ return __x._M_data[__i.value] >> __y._M_data[__i.value];
+ });
+ else
+#endif
+ return __x._M_data >> __y._M_data;
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __bit_shift_left(_SimdWrapper<_Tp, _Np> __x, int __y)
+ {
+ // The behavior is undefined if the right operand is negative, or greater
+ // than or equal to the width of the promoted left operand.
+ if (__y < 0 || __y >= sizeof(std::declval<_Tp>() << __y) * CHAR_BIT)
+ __builtin_unreachable();
+ else if (__builtin_constant_p(__y) && __y >= sizeof(_Tp) * CHAR_BIT)
+ return {};
+ else
+ return __x._M_data << __y;
+ }
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __bit_shift_right(_SimdWrapper<_Tp, _Np> __x, int __y)
+ {
+ if (__y < 0 || __y >= sizeof(std::declval<_Tp>() >> __y) * CHAR_BIT)
+ __builtin_unreachable();
+ else if (__builtin_constant_p(__y) && __y >= sizeof(_Tp) * CHAR_BIT
+ && is_unsigned_v<_Tp>)
+ return {};
+ else
+ return __x._M_data >> __y;
+ }
+
+ // compares {{{2
+ // __equal_to {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+ __equal_to(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ return __vector_bitcast<_Tp>(__x._M_data == __y._M_data);
+ }
+
+ // __not_equal_to {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+ __not_equal_to(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ return __vector_bitcast<_Tp>(__x._M_data != __y._M_data);
+ }
+
+ // __less {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+ __less(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ return __vector_bitcast<_Tp>(__x._M_data < __y._M_data);
+ }
+
+ // __less_equal {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+ __less_equal(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ return __vector_bitcast<_Tp>(__x._M_data <= __y._M_data);
+ }
+
+ // negation {{{2
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+ __negate(_SimdWrapper<_Tp, _Np> __x) noexcept
+ {
+ return __vector_bitcast<_Tp>(!__x._M_data);
+ }
+
+ // __min, __max, __minmax {{{2
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_NORMAL_MATH
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __min(_SimdWrapper<_Tp, _Np> __a, _SimdWrapper<_Tp, _Np> __b)
+ {
+ return __a._M_data < __b._M_data ? __a._M_data : __b._M_data;
+ }
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_NORMAL_MATH
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __max(_SimdWrapper<_Tp, _Np> __a, _SimdWrapper<_Tp, _Np> __b)
+ {
+ return __a._M_data > __b._M_data ? __a._M_data : __b._M_data;
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_NORMAL_MATH
+ _GLIBCXX_SIMD_INTRINSIC static constexpr std::pair<_SimdWrapper<_Tp, _Np>,
+ _SimdWrapper<_Tp, _Np>>
+ __minmax(_SimdWrapper<_Tp, _Np> __a, _SimdWrapper<_Tp, _Np> __b)
+ {
+ return {__a._M_data < __b._M_data ? __a._M_data : __b._M_data,
+ __a._M_data < __b._M_data ? __b._M_data : __a._M_data};
+ }
+
+ // reductions {{{2
+ template <size_t _Np, size_t... _Is, size_t... _Zeros, typename _Tp,
+ typename _BinaryOperation>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp
+ __reduce_partial(std::index_sequence<_Is...>, std::index_sequence<_Zeros...>,
+ simd<_Tp, _Abi> __x, _BinaryOperation&& __binary_op)
+ {
+ using _V = __vector_type_t<_Tp, _Np / 2>;
+ static_assert(sizeof(_V) <= sizeof(__x));
+ // _S_width is the size of the smallest native SIMD register that can
+ // store _Np/2 elements:
+ using _FullSimd = __deduced_simd<_Tp, _VectorTraits<_V>::_S_width>;
+ using _HalfSimd = __deduced_simd<_Tp, _Np / 2>;
+ const auto __xx = __as_vector(__x);
+ return _HalfSimd::abi_type::_SimdImpl::__reduce(
+ static_cast<_HalfSimd>(__as_vector(__binary_op(
+ static_cast<_FullSimd>(__intrin_bitcast<_V>(__xx)),
+ static_cast<_FullSimd>(__intrin_bitcast<_V>(
+ __vector_permute<(_Np / 2 + _Is)..., (int(_Zeros * 0) - 1)...>(
+ __xx)))))),
+ __binary_op);
+ }
+
+ template <typename _Tp, typename _BinaryOperation>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
+ __reduce(simd<_Tp, _Abi> __x, _BinaryOperation&& __binary_op)
+ {
+ constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+ if constexpr (_Np == 1)
+ return __x[0];
+ else if constexpr (_Np == 2)
+ return __binary_op(simd<_Tp, simd_abi::scalar>(__x[0]),
+ simd<_Tp, simd_abi::scalar>(__x[1]))[0];
+ else if constexpr (_Abi::_S_is_partial) //{{{
+ {
+ [[maybe_unused]] constexpr auto __full_size
+ = _Abi::template _S_full_size<_Tp>;
+ if constexpr (_Np == 3)
+ return __binary_op(__binary_op(simd<_Tp, simd_abi::scalar>(__x[0]),
+ simd<_Tp, simd_abi::scalar>(__x[1])),
+ simd<_Tp, simd_abi::scalar>(__x[2]))[0];
+ else if constexpr (std::is_same_v<__remove_cvref_t<_BinaryOperation>,
+ std::plus<>>)
+ {
+ using _Ap = simd_abi::deduce_t<_Tp, __full_size>;
+ return _Ap::_SimdImpl::__reduce(
+ simd<_Tp, _Ap>(__private_init, _Abi::__masked(__as_vector(__x))),
+ __binary_op);
+ }
+ else if constexpr (std::is_same_v<__remove_cvref_t<_BinaryOperation>,
+ std::multiplies<>>)
+ {
+ using _Ap = simd_abi::deduce_t<_Tp, __full_size>;
+ using _TW = _SimdWrapper<_Tp, __full_size>;
+ constexpr auto __implicit_mask_full
+ = _Abi::template __implicit_mask<_Tp>().__as_full_vector();
+ constexpr _TW __one = __vector_broadcast<__full_size>(_Tp(1));
+ const _TW __x_full = __data(__x).__as_full_vector();
+ const _TW __x_padded_with_ones
+ = _Ap::_CommonImpl::_S_blend(__implicit_mask_full, __one,
+ __x_full);
+ return _Ap::_SimdImpl::__reduce(
+ simd<_Tp, _Ap>(__private_init, __x_padded_with_ones),
+ __binary_op);
+ }
+ else if constexpr (_Np & 1)
+ {
+ using _Ap = simd_abi::deduce_t<_Tp, _Np - 1>;
+ return __binary_op(
+ simd<_Tp, simd_abi::scalar>(_Ap::_SimdImpl::__reduce(
+ simd<_Tp, _Ap>(__intrin_bitcast<__vector_type_t<_Tp, _Np - 1>>(
+ __as_vector(__x))),
+ __binary_op)),
+ simd<_Tp, simd_abi::scalar>(__x[_Np - 1]))[0];
+ }
+ else
+ return __reduce_partial<_Np>(
+ std::make_index_sequence<_Np / 2>(),
+ std::make_index_sequence<__full_size - _Np / 2>(), __x,
+ __binary_op);
+ } //}}}
+ else if constexpr (sizeof(__x) == 16) //{{{
+ {
+ if constexpr (_Np == 16)
+ {
+ const auto __y = __data(__x);
+ __x = __binary_op(
+ __make_simd<_Tp, _Np>(__vector_permute<0, 0, 1, 1, 2, 2, 3, 3, 4,
+ 4, 5, 5, 6, 6, 7, 7>(__y)),
+ __make_simd<_Tp, _Np>(
+ __vector_permute<8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13, 14,
+ 14, 15, 15>(__y)));
+ }
+ if constexpr (_Np >= 8)
+ {
+ const auto __y = __vector_bitcast<short>(__data(__x));
+ __x
+ = __binary_op(__make_simd<_Tp, _Np>(__vector_bitcast<_Tp>(
+ __vector_permute<0, 0, 1, 1, 2, 2, 3, 3>(__y))),
+ __make_simd<_Tp, _Np>(__vector_bitcast<_Tp>(
+ __vector_permute<4, 4, 5, 5, 6, 6, 7, 7>(__y))));
+ }
+ if constexpr (_Np >= 4)
+ {
+ using _Up
+ = std::conditional_t<std::is_floating_point_v<_Tp>, float, int>;
+ const auto __y = __vector_bitcast<_Up>(__data(__x));
+ __x = __binary_op(__x, __make_simd<_Tp, _Np>(__vector_bitcast<_Tp>(
+ __vector_permute<3, 2, 1, 0>(__y))));
+ }
+ using _Up
+ = std::conditional_t<std::is_floating_point_v<_Tp>, double, _LLong>;
+ const auto __y = __vector_bitcast<_Up>(__data(__x));
+ __x = __binary_op(__x, __make_simd<_Tp, _Np>(__vector_bitcast<_Tp>(
+ __vector_permute<1, 1>(__y))));
+ return __x[0];
+ } //}}}
+ else
+ {
+ static_assert(sizeof(__x) > __min_vector_size<_Tp>);
+ static_assert((_Np & (_Np - 1)) == 0); // _Np must be a power of 2
+ using _Ap = simd_abi::deduce_t<_Tp, _Np / 2>;
+ using _V = std::experimental::simd<_Tp, _Ap>;
+ return _Ap::_SimdImpl::__reduce(
+ __binary_op(_V(__private_init, __extract<0, 2>(__as_vector(__x))),
+ _V(__private_init, __extract<1, 2>(__as_vector(__x)))),
+ static_cast<_BinaryOperation&&>(__binary_op));
+ }
+ }
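+
+ // Worked example for the 16-byte branch of __reduce, assuming _Tp = float,
+ // _Np = 4 and an associative, commutative __binary_op such as std::plus<>:
+ //   __x                                   = {a, b, c, d}
+ //   __x op permute<3, 2, 1, 0> (floats)   = {a+d, b+c, c+b, d+a}
+ //   __x op permute<1, 1> (as two doubles) = {(a+d)+(c+b), (b+c)+(d+a), ...}
+ // so __x[0] holds the full reduction after log2(_Np) steps.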
+
+ // math {{{2
+ // frexp, modf and copysign implemented in simd_math.h
+#define _GLIBCXX_SIMD_MATH_FALLBACK(__name) \
+ template <typename _Tp, typename... _More> \
+ static _Tp __##__name(const _Tp& __x, const _More&... __more) \
+ { \
+ return __generate_vector<_Tp>( \
+ [&](auto __i) { return std::__name(__x[__i], __more[__i]...); }); \
+ }
+
+#define _GLIBCXX_SIMD_MATH_FALLBACK_MASKRET(__name) \
+ template <typename _Tp, typename... _More> \
+ static \
+ typename _Tp::mask_type __##__name(const _Tp& __x, const _More&... __more) \
+ { \
+ return __generate_vector<_Tp>( \
+ [&](auto __i) { return std::__name(__x[__i], __more[__i]...); }); \
+ }
+
+#define _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET(_RetTp, __name) \
+ template <typename _Tp, typename... _More> \
+ static auto __##__name(const _Tp& __x, const _More&... __more) \
+ { \
+ return __fixed_size_storage_t<_RetTp, \
+ _VectorTraits<_Tp>::_S_partial_width>:: \
+ __generate([&](auto __meta) constexpr { \
+ return __meta.__generator( \
+ [&](auto __i) { \
+ return std::__name(__x[__meta._S_offset + __i], \
+ __more[__meta._S_offset + __i]...); \
+ }, \
+ static_cast<_RetTp*>(nullptr)); \
+ }); \
+ }
+
+ _GLIBCXX_SIMD_MATH_FALLBACK(acos)
+ _GLIBCXX_SIMD_MATH_FALLBACK(asin)
+ _GLIBCXX_SIMD_MATH_FALLBACK(atan)
+ _GLIBCXX_SIMD_MATH_FALLBACK(atan2)
+ _GLIBCXX_SIMD_MATH_FALLBACK(cos)
+ _GLIBCXX_SIMD_MATH_FALLBACK(sin)
+ _GLIBCXX_SIMD_MATH_FALLBACK(tan)
+ _GLIBCXX_SIMD_MATH_FALLBACK(acosh)
+ _GLIBCXX_SIMD_MATH_FALLBACK(asinh)
+ _GLIBCXX_SIMD_MATH_FALLBACK(atanh)
+ _GLIBCXX_SIMD_MATH_FALLBACK(cosh)
+ _GLIBCXX_SIMD_MATH_FALLBACK(sinh)
+ _GLIBCXX_SIMD_MATH_FALLBACK(tanh)
+ _GLIBCXX_SIMD_MATH_FALLBACK(exp)
+ _GLIBCXX_SIMD_MATH_FALLBACK(exp2)
+ _GLIBCXX_SIMD_MATH_FALLBACK(expm1)
+ _GLIBCXX_SIMD_MATH_FALLBACK(ldexp)
+ _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET(int, ilogb)
+ _GLIBCXX_SIMD_MATH_FALLBACK(log)
+ _GLIBCXX_SIMD_MATH_FALLBACK(log10)
+ _GLIBCXX_SIMD_MATH_FALLBACK(log1p)
+ _GLIBCXX_SIMD_MATH_FALLBACK(log2)
+ _GLIBCXX_SIMD_MATH_FALLBACK(logb)
+
+ // modf implemented in simd_math.h
+ _GLIBCXX_SIMD_MATH_FALLBACK(scalbn)
+ _GLIBCXX_SIMD_MATH_FALLBACK(scalbln)
+ _GLIBCXX_SIMD_MATH_FALLBACK(cbrt)
+ _GLIBCXX_SIMD_MATH_FALLBACK(fabs)
+ _GLIBCXX_SIMD_MATH_FALLBACK(pow)
+ _GLIBCXX_SIMD_MATH_FALLBACK(sqrt)
+ _GLIBCXX_SIMD_MATH_FALLBACK(erf)
+ _GLIBCXX_SIMD_MATH_FALLBACK(erfc)
+ _GLIBCXX_SIMD_MATH_FALLBACK(lgamma)
+ _GLIBCXX_SIMD_MATH_FALLBACK(tgamma)
+
+ _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET(long, lrint)
+ _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET(long long, llrint)
+
+ _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET(long, lround)
+ _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET(long long, llround)
+
+ _GLIBCXX_SIMD_MATH_FALLBACK(fmod)
+ _GLIBCXX_SIMD_MATH_FALLBACK(remainder)
+
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ static _Tp __remquo(const _Tp __x, const _Tp __y,
+ __fixed_size_storage_t<int, _TVT::_S_partial_width>* __z)
+ {
+ return __generate_vector<_Tp>([&](auto __i) {
+ int __tmp;
+ auto __r = std::remquo(__x[__i], __y[__i], &__tmp);
+ __z->__set(__i, __tmp);
+ return __r;
+ });
+ }
+
+ // copysign in simd_math.h
+ _GLIBCXX_SIMD_MATH_FALLBACK(nextafter)
+ _GLIBCXX_SIMD_MATH_FALLBACK(fdim)
+ _GLIBCXX_SIMD_MATH_FALLBACK(fmax)
+ _GLIBCXX_SIMD_MATH_FALLBACK(fmin)
+ _GLIBCXX_SIMD_MATH_FALLBACK(fma)
+
+ template <typename _Tp, size_t _Np>
+ static constexpr auto __isgreater(_SimdWrapper<_Tp, _Np> __x,
+ _SimdWrapper<_Tp, _Np> __y) noexcept
+ {
+ using _Ip = __int_for_sizeof_t<_Tp>;
+ const auto __xn = __vector_bitcast<_Ip>(__x);
+ const auto __yn = __vector_bitcast<_Ip>(__y);
+ const auto __xp = __xn < 0 ? -(__xn & numeric_limits<_Ip>::max()) : __xn;
+ const auto __yp = __yn < 0 ? -(__yn & numeric_limits<_Ip>::max()) : __yn;
+ return __and(__not(_SuperImpl::__isunordered(__x, __y)),
+ __vector_bitcast<_Tp>(__xp > __yp));
+ }
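+ // Worked example for the integer mapping used by __isgreater and the
+ // comparisons that follow, assuming _Tp = float with IEC 559 encoding:
+ // -1.0f has the bits 0xbf80'0000; masking with numeric_limits<int>::max()
+ // gives 0x3f80'0000 and negation yields -0x3f80'0000, which orders below
+ // 0x3f80'0000 (the bits of 1.0f). Both -0.0f and +0.0f map to 0, so they
+ // compare equal.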
+ template <typename _Tp, size_t _Np>
+ static constexpr auto __isgreaterequal(_SimdWrapper<_Tp, _Np> __x,
+ _SimdWrapper<_Tp, _Np> __y) noexcept
+ {
+ using _Ip = __int_for_sizeof_t<_Tp>;
+ const auto __xn = __vector_bitcast<_Ip>(__x);
+ const auto __yn = __vector_bitcast<_Ip>(__y);
+ const auto __xp = __xn < 0 ? -(__xn & numeric_limits<_Ip>::max()) : __xn;
+ const auto __yp = __yn < 0 ? -(__yn & numeric_limits<_Ip>::max()) : __yn;
+ return __and(__not(_SuperImpl::__isunordered(__x, __y)),
+ __vector_bitcast<_Tp>(__xp >= __yp));
+ }
+ template <typename _Tp, size_t _Np>
+ static constexpr auto __isless(_SimdWrapper<_Tp, _Np> __x,
+ _SimdWrapper<_Tp, _Np> __y) noexcept
+ {
+ using _Ip = __int_for_sizeof_t<_Tp>;
+ const auto __xn = __vector_bitcast<_Ip>(__x);
+ const auto __yn = __vector_bitcast<_Ip>(__y);
+ const auto __xp = __xn < 0 ? -(__xn & numeric_limits<_Ip>::max()) : __xn;
+ const auto __yp = __yn < 0 ? -(__yn & numeric_limits<_Ip>::max()) : __yn;
+ return __and(__not(_SuperImpl::__isunordered(__x, __y)),
+ __vector_bitcast<_Tp>(__xp < __yp));
+ }
+ template <typename _Tp, size_t _Np>
+ static constexpr auto __islessequal(_SimdWrapper<_Tp, _Np> __x,
+ _SimdWrapper<_Tp, _Np> __y) noexcept
+ {
+ using _Ip = __int_for_sizeof_t<_Tp>;
+ const auto __xn = __vector_bitcast<_Ip>(__x);
+ const auto __yn = __vector_bitcast<_Ip>(__y);
+ const auto __xp = __xn < 0 ? -(__xn & numeric_limits<_Ip>::max()) : __xn;
+ const auto __yp = __yn < 0 ? -(__yn & numeric_limits<_Ip>::max()) : __yn;
+ return __and(__not(_SuperImpl::__isunordered(__x, __y)),
+ __vector_bitcast<_Tp>(__xp <= __yp));
+ }
+ template <typename _Tp, size_t _Np>
+ static constexpr auto __islessgreater(_SimdWrapper<_Tp, _Np> __x,
+ _SimdWrapper<_Tp, _Np> __y) noexcept
+ {
+ return __and(__not(_SuperImpl::__isunordered(__x, __y)),
+ _SuperImpl::__not_equal_to(__x, __y));
+ }
+
+#undef _GLIBCXX_SIMD_MATH_FALLBACK
+#undef _GLIBCXX_SIMD_MATH_FALLBACK_MASKRET
+#undef _GLIBCXX_SIMD_MATH_FALLBACK_FIXEDRET
+ // __abs {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+ __abs(_SimdWrapper<_Tp, _Np> __x) noexcept
+ {
+ // if (__builtin_is_constant_evaluated())
+ // {
+ // return __x._M_data < 0 ? -__x._M_data : __x._M_data;
+ // }
+ if constexpr (std::is_floating_point_v<_Tp>)
+ // `v < 0 ? -v : v` cannot compile to the efficient implementation of
+ // masking the signbit off because it must consider v == -0
+
+ // ~(-0.) & v would be easy, but breaks with -fno-signed-zeros
+ return __and(_S_absmask<__vector_type_t<_Tp, _Np>>, __x._M_data);
+ else
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR91533
+ if constexpr (sizeof(__x) < 16 && std::is_signed_v<_Tp>)
+ {
+ if constexpr (sizeof(_Tp) == 4)
+ return __auto_bitcast(_mm_abs_epi32(__to_intrin(__x)));
+ else if constexpr (sizeof(_Tp) == 2)
+ return __auto_bitcast(_mm_abs_epi16(__to_intrin(__x)));
+ else
+ return __auto_bitcast(_mm_abs_epi8(__to_intrin(__x)));
+ }
+ else
+#endif //_GLIBCXX_SIMD_WORKAROUND_PR91533
+ return __x._M_data < 0 ? -__x._M_data : __x._M_data;
+ }
+
+ // __nearbyint {{{3
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __nearbyint(_Tp __x_) noexcept
+ {
+ using value_type = typename _TVT::value_type;
+ using _V = typename _TVT::type;
+ const _V __x = __x_;
+ const _V __absx = __and(__x, _S_absmask<_V>);
+ static_assert(CHAR_BIT * sizeof(1ull)
+ >= std::numeric_limits<value_type>::digits);
+ constexpr _V __shifter_abs
+ = _V() + (1ull << (std::numeric_limits<value_type>::digits - 1));
+ const _V __shifter = __or(__and(_S_signmask<_V>, __x), __shifter_abs);
+ _V __shifted = __x + __shifter;
+ // How can we stop -fassociative-math from breaking this pattern?
+ // asm("" : "+X"(__shifted));
+ __shifted -= __shifter;
+ return __absx < __shifter_abs ? __shifted : __x;
+ }
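+ // Worked example, assuming value_type = float (digits == 24) and the
+ // default round-to-nearest-even mode: for __x = 2.5f, __shifter is 0x1p23f
+ // and 2.5f + 0x1p23f rounds to 8388610.f (tie to even); subtracting
+ // __shifter again leaves 2.f, which matches std::nearbyint(2.5f).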
+
+ // __rint {{{3
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __rint(_Tp __x) noexcept
+ {
+ return _SuperImpl::__nearbyint(__x);
+ }
+
+ // __trunc {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+ __trunc(_SimdWrapper<_Tp, _Np> __x)
+ {
+ using _V = __vector_type_t<_Tp, _Np>;
+ const _V __absx = __and(__x._M_data, _S_absmask<_V>);
+ static_assert(CHAR_BIT * sizeof(1ull) >= std::numeric_limits<_Tp>::digits);
+ constexpr _Tp __shifter = 1ull << (std::numeric_limits<_Tp>::digits - 1);
+ _V __truncated = (__absx + __shifter) - __shifter;
+ __truncated -= __truncated > __absx ? _V() + 1 : _V();
+ return __absx < __shifter ? __or(__xor(__absx, __x._M_data), __truncated)
+ : __x._M_data;
+ }
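+ // Worked example, assuming _Tp = float and round-to-nearest: for -2.7f,
+ // __absx is 2.7f and (__absx + 0x1p23f) - 0x1p23f gives 3.f. Since
+ // 3.f > __absx, 1 is subtracted, leaving 2.f; OR-ing in the sign bit
+ // (__absx ^ __x) produces -2.f.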
+
+ // __round {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+ __round(_SimdWrapper<_Tp, _Np> __x)
+ {
+ using _V = __vector_type_t<_Tp, _Np>;
+ const _V __absx = __and(__x._M_data, _S_absmask<_V>);
+ static_assert(CHAR_BIT * sizeof(1ull) >= std::numeric_limits<_Tp>::digits);
+ constexpr _Tp __shifter = 1ull << (std::numeric_limits<_Tp>::digits - 1);
+ _V __truncated = (__absx + __shifter) - __shifter;
+ __truncated -= __truncated > __absx ? _V() + 1 : _V();
+ const _V __rounded
+ = __or(__xor(__absx, __x._M_data),
+ __truncated + (__absx - __truncated >= _Tp(.5) ? _V() + 1 : _V()));
+ return __absx < __shifter ? __rounded : __x._M_data;
+ }
+
+ // __floor {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+ __floor(_SimdWrapper<_Tp, _Np> __x)
+ {
+ const auto __y = _SuperImpl::__trunc(__x)._M_data;
+ const auto __negative_input
+ = __vector_bitcast<_Tp>(__x._M_data < __vector_broadcast<_Np, _Tp>(0));
+ const auto __mask
+ = __andnot(__vector_bitcast<_Tp>(__y == __x._M_data), __negative_input);
+ return __or(__andnot(__mask, __y),
+ __and(__mask, __y - __vector_broadcast<_Np, _Tp>(1)));
+ }
+
+ // __ceil {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+ __ceil(_SimdWrapper<_Tp, _Np> __x)
+ {
+ const auto __y = _SuperImpl::__trunc(__x)._M_data;
+ const auto __negative_input
+ = __vector_bitcast<_Tp>(__x._M_data < __vector_broadcast<_Np, _Tp>(0));
+ const auto __inv_mask
+ = __or(__vector_bitcast<_Tp>(__y == __x._M_data), __negative_input);
+ return __or(__and(__inv_mask, __y),
+ __andnot(__inv_mask, __y + __vector_broadcast<_Np, _Tp>(1)));
+ }
+
+ // __isnan {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+ __isnan(_SimdWrapper<_Tp, _Np> __x)
+ {
+#if __FINITE_MATH_ONLY__
+ [](auto&&) {}(__x);
+ return {}; // false
+#elif !defined __SUPPORT_SNAN__
+ return __vector_bitcast<_Tp>(~(__x._M_data == __x._M_data));
+#elif defined __STDC_IEC_559__
+ using _Up = make_unsigned_t<__int_for_sizeof_t<_Tp>>;
+ constexpr auto __max = __vector_bitcast<_Up>(
+ __vector_broadcast<_Np>(numeric_limits<_Tp>::infinity()));
+ auto __bits = __vector_bitcast<_Up>(__x);
+ __bits &= __vector_bitcast<_Up>(_S_absmask<__vector_type_t<_Tp, _Np>>);
+ return __vector_bitcast<_Tp>(__bits > __max);
+#else
+#error "Not implemented: how to support SNaN but non-IEC559 floating-point?"
+#endif
+ }
+
+ // __isfinite {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+ __isfinite(_SimdWrapper<_Tp, _Np> __x)
+ {
+#if __FINITE_MATH_ONLY__
+ [](auto&&) {}(__x);
+ return __vector_bitcast<_Tp>(__vector_broadcast<_Np>(_Tp()) == __vector_broadcast<_Np>(_Tp()));
+#else
+ // if all exponent bits are set, __x is either inf or NaN
+ using _I = __int_for_sizeof_t<_Tp>;
+ constexpr auto __inf = __vector_bitcast<_I>(
+ __vector_broadcast<_Np>(std::numeric_limits<_Tp>::infinity()));
+ return __vector_bitcast<_Tp>(__inf > (__vector_bitcast<_I>(__x) & __inf));
+#endif
+ }
+
+ // __isunordered {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+ __isunordered(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ return __or(__isnan(__x), __isnan(__y));
+ }
+
+ // __signbit {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+ __signbit(_SimdWrapper<_Tp, _Np> __x)
+ {
+ using _I = __int_for_sizeof_t<_Tp>;
+ return __vector_bitcast<_Tp>(__vector_bitcast<_I>(__x) < 0);
+ // Arithmetic right shift (SRA) would also work (instead of compare), but
+ // 64-bit SRA isn't available on x86 before AVX512. And in general,
+ // compares are more likely to be efficient than SRA.
+ }
+
+ // __isinf {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+ __isinf(_SimdWrapper<_Tp, _Np> __x)
+ {
+#if __FINITE_MATH_ONLY__
+ [](auto&&) {}(__x);
+ return {}; // false
+#else
+ return _SuperImpl::template __equal_to<_Tp, _Np>(
+ _SuperImpl::__abs(__x),
+ __vector_broadcast<_Np>(std::numeric_limits<_Tp>::infinity()));
+ // alternative:
+ // compare to inf using the corresponding integer type
+ /*
+ return
+ __vector_bitcast<_Tp>(__vector_bitcast<__int_for_sizeof_t<_Tp>>(__abs(__x)._M_data)
+ ==
+ __vector_bitcast<__int_for_sizeof_t<_Tp>>(__vector_broadcast<_Np>(
+ std::numeric_limits<_Tp>::infinity())));
+ */
+#endif
+ }
+
+ // __isnormal {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+ __isnormal(_SimdWrapper<_Tp, _Np> __x)
+ {
+ using _I = __int_for_sizeof_t<_Tp>;
+ const auto __absn = __vector_bitcast<_I>(_SuperImpl::__abs(__x));
+ const auto __minn = __vector_bitcast<_I>(
+ __vector_broadcast<_Np>(std::numeric_limits<_Tp>::min()));
+#if __FINITE_MATH_ONLY__
+ return __auto_bitcast(__absn >= __minn);
+#else
+ const auto __infn = __vector_bitcast<_I>(
+ __vector_broadcast<_Np>(std::numeric_limits<_Tp>::infinity()));
+ return __auto_bitcast(__absn >= __minn && __absn < __infn);
+#endif
+ }
+
+ // __fpclassify {{{3
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static __fixed_size_storage_t<int, _Np>
+ __fpclassify(_SimdWrapper<_Tp, _Np> __x)
+ {
+ using _I = __int_for_sizeof_t<_Tp>;
+ const auto __xi = __to_intrin(__abs(__x));
+ const auto __xn = __vector_bitcast<_I>(__xi);
+ constexpr size_t _NI = sizeof(__xn) / sizeof(_I);
+
+ constexpr auto __fp_normal = __vector_broadcast<_NI, _I>(FP_NORMAL);
+ constexpr auto __fp_nan = __vector_broadcast<_NI, _I>(FP_NAN);
+ constexpr auto __fp_infinite = __vector_broadcast<_NI, _I>(FP_INFINITE);
+ constexpr auto __fp_subnormal = __vector_broadcast<_NI, _I>(FP_SUBNORMAL);
+ constexpr auto __fp_zero = __vector_broadcast<_NI, _I>(FP_ZERO);
+
+ __vector_type_t<_I, _NI> __tmp;
+ if constexpr (sizeof(_Tp) == 4)
+ __tmp = __xn < 0x0080'0000
+ ? (__xn == 0 ? __fp_zero : __fp_subnormal)
+ : (__xn < 0x7f80'0000
+ ? __fp_normal
+ : (__xn == 0x7f80'0000 ? __fp_infinite : __fp_nan));
+ else if constexpr (sizeof(_Tp) == 8)
+ __tmp = __xn < 0x0010'0000'0000'0000LL
+ ? (__xn == 0 ? __fp_zero : __fp_subnormal)
+ : (__xn < 0x7ff0'0000'0000'0000LL
+ ? __fp_normal
+ : (__xn == 0x7ff0'0000'0000'0000LL ? __fp_infinite
+ : __fp_nan));
+ else
+ __assert_unreachable<_Tp>();
+
+ if constexpr (sizeof(_I) == sizeof(int))
+ {
+ using _FixedInt = __fixed_size_storage_t<int, _Np>;
+ const auto __as_int = __vector_bitcast<int, _Np>(__tmp);
+ if constexpr (_FixedInt::_S_tuple_size == 1)
+ return {__as_int};
+ else if constexpr (_FixedInt::_S_tuple_size == 2
+ && std::is_same_v<
+ typename _FixedInt::_SecondType::_FirstAbi,
+ simd_abi::scalar>)
+ return {__extract<0, 2>(__as_int), __as_int[_Np - 1]};
+ else if constexpr (_FixedInt::_S_tuple_size == 2)
+ return {__extract<0, 2>(__as_int),
+ __auto_bitcast(__extract<1, 2>(__as_int))};
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (_Np == 2 && sizeof(_I) == 8
+ && __fixed_size_storage_t<int, _Np>::_S_tuple_size == 2)
+ {
+ const auto __aslong = __vector_bitcast<_LLong>(__tmp);
+ return {int(__aslong[0]), {int(__aslong[1])}};
+ }
+#if _GLIBCXX_SIMD_X86INTRIN
+ else if constexpr (sizeof(_Tp) == 8 && sizeof(__tmp) == 32
+ && __fixed_size_storage_t<int, _Np>::_S_tuple_size == 1)
+ return {_mm_packs_epi32(__to_intrin(__lo128(__tmp)),
+ __to_intrin(__hi128(__tmp)))};
+ else if constexpr (sizeof(_Tp) == 8 && sizeof(__tmp) == 64
+ && __fixed_size_storage_t<int, _Np>::_S_tuple_size == 1)
+ return {_mm512_cvtepi64_epi32(__to_intrin(__tmp))};
+#endif // _GLIBCXX_SIMD_X86INTRIN
+ else if constexpr (__fixed_size_storage_t<int, _Np>::_S_tuple_size == 1)
+ return {__call_with_subscripts<_Np>(__vector_bitcast<_LLong>(__tmp),
+ [](auto... __l) {
+ return __make_wrapper<int>(__l...);
+ })};
+ else
+ __assert_unreachable<_Tp>();
+ }
+
+ // __increment & __decrement{{{2
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static void __increment(_SimdWrapper<_Tp, _Np>& __x)
+ {
+ __x = __x._M_data + 1;
+ }
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static void __decrement(_SimdWrapper<_Tp, _Np>& __x)
+ {
+ __x = __x._M_data - 1;
+ }
+
+ // smart_reference access {{{2
+ template <typename _Tp, size_t _Np, typename _Up>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static void
+ __set(_SimdWrapper<_Tp, _Np>& __v, int __i, _Up&& __x) noexcept
+ {
+ __v.__set(__i, static_cast<_Up&&>(__x));
+ }
+
+ // __masked_assign{{{2
+ template <typename _Tp, typename _K, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_assign(_SimdWrapper<_K, _Np> __k, _SimdWrapper<_Tp, _Np>& __lhs,
+ __id<_SimdWrapper<_Tp, _Np>> __rhs)
+ {
+ __lhs = _CommonImpl::_S_blend(__k, __lhs, __rhs);
+ }
+
+ template <typename _Tp, typename _K, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_assign(_SimdWrapper<_K, _Np> __k, _SimdWrapper<_Tp, _Np>& __lhs,
+ __id<_Tp> __rhs)
+ {
+ if (__builtin_constant_p(__rhs) && __rhs == 0 && std::is_same_v<_K, _Tp>)
+ {
+ if constexpr (!is_same_v<bool, _K>)
+ // the __andnot optimization only makes sense if __k._M_data is a
+ // vector register
+ __lhs._M_data = __andnot(__k._M_data, __lhs._M_data);
+ else
+ // for AVX512/__mmask, a _mm512_maskz_mov is best
+ __lhs = _CommonImpl::_S_blend(__k, __lhs, _SimdWrapper<_Tp, _Np>());
+ }
+ else
+ __lhs = _CommonImpl::_S_blend(__k, __lhs,
+ _SimdWrapper<_Tp, _Np>(
+ __vector_broadcast<_Np>(__rhs)));
+ }
+
+ // __masked_cassign {{{2
+ template <typename _Op, typename _Tp, typename _K, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_cassign(const _SimdWrapper<_K, _Np> __k,
+ _SimdWrapper<_Tp, _Np>& __lhs,
+ const __id<_SimdWrapper<_Tp, _Np>> __rhs, _Op __op)
+ {
+ __lhs = _CommonImpl::_S_blend(__k, __lhs, __op(_SuperImpl{}, __lhs, __rhs));
+ }
+
+ template <typename _Op, typename _Tp, typename _K, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_cassign(const _SimdWrapper<_K, _Np> __k,
+ _SimdWrapper<_Tp, _Np>& __lhs, const __id<_Tp> __rhs,
+ _Op __op)
+ {
+ __lhs = _CommonImpl::_S_blend(__k, __lhs,
+ __op(_SuperImpl{}, __lhs,
+ _SimdWrapper<_Tp, _Np>(
+ __vector_broadcast<_Np>(__rhs))));
+ }
+
+ // __masked_unary {{{2
+ template <template <typename> class _Op, typename _Tp, typename _K,
+ size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+ __masked_unary(const _SimdWrapper<_K, _Np> __k,
+ const _SimdWrapper<_Tp, _Np> __v)
+ {
+ auto __vv = __make_simd(__v);
+ _Op<decltype(__vv)> __op;
+ return _CommonImpl::_S_blend(__k, __v, __data(__op(__vv)));
+ }
+
+ //}}}2
+};
+
+// _MaskImplBuiltinMixin {{{1
+struct _MaskImplBuiltinMixin
+{
+ template <typename _Tp> using _TypeTag = _Tp*;
+
+ // __to_maskvector {{{
+ template <typename _Up, size_t _ToN = 1>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Up, _ToN>
+ __to_maskvector(bool __x)
+ {
+ using _I = __int_for_sizeof_t<_Up>;
+ return __vector_bitcast<_Up>(__x ? __vector_type_t<_I, _ToN>{~_I()}
+ : __vector_type_t<_I, _ToN>{});
+ }
+
+ template <typename _Up, size_t _UpN = 0, size_t _Np, bool _Sanitized,
+ size_t _ToN = _UpN == 0 ? _Np : _UpN>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Up, _ToN>
+ __to_maskvector(_BitMask<_Np, _Sanitized> __x)
+ {
+ using _I = __int_for_sizeof_t<_Up>;
+ return __vector_bitcast<_Up>(
+ __generate_vector<__vector_type_t<_I, _ToN>>([&](auto __i) constexpr {
+ if constexpr (__i < _Np)
+ return __x[__i] ? ~_I() : _I();
+ else
+ return _I();
+ }));
+ }
+
+ template <typename _Up, size_t _UpN = 0, typename _Tp, size_t _Np,
+ size_t _ToN = _UpN == 0 ? _Np : _UpN>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Up, _ToN>
+ __to_maskvector(_SimdWrapper<_Tp, _Np> __x)
+ {
+ using _TW = _SimdWrapper<_Tp, _Np>;
+ using _UW = _SimdWrapper<_Up, _ToN>;
+ if constexpr (sizeof(_Up) == sizeof(_Tp) && sizeof(_TW) == sizeof(_UW))
+ return __wrapper_bitcast<_Up, _ToN>(__x);
+ else if constexpr (is_same_v<_Tp, bool>) // bits -> vector
+ return __to_maskvector<_Up, _ToN>(_BitMask<_Np>(__x._M_data));
+ else
+ { // vector -> vector
+ /*
+ [[maybe_unused]] const auto __y = __vector_bitcast<_Up>(__x._M_data);
+ if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 4 && sizeof(__y) == 16)
+ return __vector_permute<1, 3, -1, -1>(__y);
+ else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 2
+ && sizeof(__y) == 16)
+ return __vector_permute<1, 3, 5, 7, -1, -1, -1, -1>(__y);
+ else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 2
+ && sizeof(__y) == 16)
+ return __vector_permute<3, 7, -1, -1, -1, -1, -1, -1>(__y);
+ else if constexpr (sizeof(_Tp) == 2 && sizeof(_Up) == 1
+ && sizeof(__y) == 16)
+ return __vector_permute<1, 3, 5, 7, 9, 11, 13, 15, -1, -1, -1, -1, -1,
+ -1, -1, -1>(__y);
+ else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 1
+ && sizeof(__y) == 16)
+ return __vector_permute<3, 7, 11, 15, -1, -1, -1, -1, -1, -1, -1, -1,
+ -1, -1, -1, -1>(__y);
+ else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 1
+ && sizeof(__y) == 16)
+ return __vector_permute<7, 15, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+ -1, -1, -1, -1>(__y);
+ else
+ */
+ {
+ using _I = __int_for_sizeof_t<_Up>;
+ const auto __y
+ = __vector_bitcast<__int_for_sizeof_t<_Tp>>(__x._M_data);
+ return __vector_bitcast<_Up>(
+ __generate_vector<__vector_type_t<_I, _ToN>>([&](
+ auto __i) constexpr {
+ if constexpr (__i < _Np)
+ return _I(__y[__i.value]);
+ else
+ return _I();
+ }));
+ }
+ }
+ }
+
+ // }}}
+ // __to_bits {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SanitizedBitMask<_Np>
+ __to_bits(_SimdWrapper<_Tp, _Np> __x)
+ {
+ static_assert(!is_same_v<_Tp, bool>);
+ static_assert(_Np <= CHAR_BIT * sizeof(_ULLong));
+ using _Up = make_unsigned_t<__int_for_sizeof_t<_Tp>>;
+ const auto __bools
+ = __vector_bitcast<_Up>(__x) >> (sizeof(_Up) * CHAR_BIT - 1);
+ _ULLong __r = 0;
+ __execute_n_times<_Np>(
+ [&](auto __i) { __r |= _ULLong(__bools[__i.value]) << __i; });
+ return __r;
+ }
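+ // Worked example, assuming _Tp = int and _Np = 4: the mask vector
+ // {-1, 0, -1, -1} becomes {1, 0, 1, 1} after the logical right shift by 31,
+ // and the loop ORs lane __i into bit __i, giving __r == 0b1101.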
+
+ // }}}
+};
+
+// _MaskImplBuiltin {{{1
+template <typename _Abi> struct _MaskImplBuiltin : _MaskImplBuiltinMixin
+{
+ using _MaskImplBuiltinMixin::__to_bits;
+ using _MaskImplBuiltinMixin::__to_maskvector;
+
+ // member types {{{
+ template <typename _Tp>
+ using _SimdMember = typename _Abi::template __traits<_Tp>::_SimdMember;
+ template <typename _Tp>
+ using _MaskMember = typename _Abi::template __traits<_Tp>::_MaskMember;
+ using _SuperImpl = typename _Abi::_MaskImpl;
+ using _CommonImpl = typename _Abi::_CommonImpl;
+ template <typename _Tp> static constexpr size_t size = simd_size_v<_Tp, _Abi>;
+
+ // }}}
+ // __broadcast {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+ __broadcast(bool __x)
+ {
+ return __x ? _Abi::template __implicit_mask<_Tp>() : _MaskMember<_Tp>();
+ }
+
+ // }}}
+ // __load {{{
+ template <typename _Tp, typename _Flags>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+ __load(const bool* __mem)
+ {
+ using _I = __int_for_sizeof_t<_Tp>;
+ if constexpr (sizeof(_Tp) == sizeof(bool))
+ {
+ const auto __bools
+ = _CommonImpl::template _S_load<_I, size<_Tp>>(__mem, _Flags());
+ // bool is {0, 1}, everything else is UB
+ return __vector_bitcast<_Tp>(__bools > 0);
+ }
+ else
+ return __vector_bitcast<_Tp>(__generate_vector<_I, size<_Tp>>([&](
+ auto __i) constexpr { return __mem[__i] ? ~_I() : _I(); }));
+ }
+
+ // }}}
+ // __convert {{{
+ template <typename _Tp, size_t _Np, bool _Sanitized>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr auto
+ __convert(_BitMask<_Np, _Sanitized> __x)
+ {
+ if constexpr (__is_builtin_bitmask_abi<_Abi>())
+ return _SimdWrapper<bool, simd_size_v<_Tp, _Abi>>(__x._M_to_bits());
+ else
+ return _SuperImpl::template __to_maskvector<_Tp, size<_Tp>>(
+ __x._M_sanitized());
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr auto
+ __convert(_SimdWrapper<bool, _Np> __x)
+ {
+ if constexpr (__is_builtin_bitmask_abi<_Abi>())
+ return _SimdWrapper<bool, simd_size_v<_Tp, _Abi>>(__x._M_data);
+ else
+ return _SuperImpl::template __to_maskvector<_Tp, size<_Tp>>(
+ _BitMask<_Np>(__x._M_data)._M_sanitized());
+ }
+
+ template <typename _Tp, typename _Up, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr auto
+ __convert(_SimdWrapper<_Up, _Np> __x)
+ {
+ if constexpr (__is_builtin_bitmask_abi<_Abi>())
+ return _SimdWrapper<bool, simd_size_v<_Tp, _Abi>>(
+ _SuperImpl::__to_bits(__x));
+ else
+ return _SuperImpl::template __to_maskvector<_Tp, size<_Tp>>(__x);
+ }
+
+ template <typename _Tp, typename _Up, typename _UAbi>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr auto
+ __convert(simd_mask<_Up, _UAbi> __x)
+ {
+ if constexpr (__is_builtin_bitmask_abi<_Abi>())
+ {
+ using _R = _SimdWrapper<bool, simd_size_v<_Tp, _Abi>>;
+ if constexpr (__is_builtin_bitmask_abi<_UAbi>()) // bits -> bits
+ return _R(__data(__x));
+ else if constexpr (__is_scalar_abi<_UAbi>()) // bool -> bits
+ return _R(__data(__x));
+ else if constexpr (__is_fixed_size_abi_v<_UAbi>) // bitset -> bits
+ return _R(__data(__x)._M_to_bits());
+ else // vector -> bits
+ return _R(_UAbi::_MaskImpl::__to_bits(__data(__x))._M_to_bits());
+ }
+ else
+ return _SuperImpl::template __to_maskvector<_Tp, size<_Tp>>(__data(__x));
+ }
+
+ // }}}
+ // __masked_load {{{2
+ template <typename _Tp, size_t _Np, typename _Fp>
+ static inline _SimdWrapper<_Tp, _Np>
+ __masked_load(_SimdWrapper<_Tp, _Np> __merge, _SimdWrapper<_Tp, _Np> __mask,
+ const bool* __mem, _Fp) noexcept
+ {
+ // AVX(2) has 32/64 bit maskload, but nothing at 8 bit granularity
+ auto __tmp = __wrapper_bitcast<__int_for_sizeof_t<_Tp>>(__merge);
+ _BitOps::__bit_iteration(_SuperImpl::__to_bits(__mask),
+ [&](auto __i) { __tmp.__set(__i, -__mem[__i]); });
+ __merge = __wrapper_bitcast<_Tp>(__tmp);
+ return __merge;
+ }
+
+ // __store {{{2
+ template <typename _Tp, size_t _Np, typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static void __store(_SimdWrapper<_Tp, _Np> __v,
+ bool* __mem, _Fp) noexcept
+ {
+ __execute_n_times<_Np>([&](auto __i) constexpr { __mem[__i] = __v[__i]; });
+ }
+
+ // __masked_store {{{2
+ template <typename _Tp, size_t _Np, typename _Fp>
+ static inline void __masked_store(const _SimdWrapper<_Tp, _Np> __v,
+ bool* __mem, _Fp,
+ const _SimdWrapper<_Tp, _Np> __k) noexcept
+ {
+ _BitOps::__bit_iteration(
+ _SuperImpl::__to_bits(__k), [&](auto __i) constexpr {
+ __mem[__i] = __v[__i];
+ });
+ }
+
+ // __from_bitmask{{{2
+ template <size_t _Np, typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+ __from_bitmask(_SanitizedBitMask<_Np> __bits, _TypeTag<_Tp>)
+ {
+ return _SuperImpl::template __to_maskvector<_Tp, size<_Tp>>(__bits);
+ }
+
+ // logical and bitwise operators {{{2
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __logical_and(const _SimdWrapper<_Tp, _Np>& __x,
+ const _SimdWrapper<_Tp, _Np>& __y)
+ {
+ return __and(__x._M_data, __y._M_data);
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __logical_or(const _SimdWrapper<_Tp, _Np>& __x,
+ const _SimdWrapper<_Tp, _Np>& __y)
+ {
+ return __or(__x._M_data, __y._M_data);
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __bit_not(const _SimdWrapper<_Tp, _Np>& __x)
+ {
+ if constexpr (_Abi::_S_is_partial)
+ return __andnot(__x._M_data, _Abi::template __implicit_mask<_Tp>());
+ else
+ return __not(__x._M_data);
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __bit_and(const _SimdWrapper<_Tp, _Np>& __x,
+ const _SimdWrapper<_Tp, _Np>& __y)
+ {
+ return __and(__x._M_data, __y._M_data);
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __bit_or(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
+ {
+ return __or(__x._M_data, __y._M_data);
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __bit_xor(const _SimdWrapper<_Tp, _Np>& __x,
+ const _SimdWrapper<_Tp, _Np>& __y)
+ {
+ return __xor(__x._M_data, __y._M_data);
+ }
+
+ // smart_reference access {{{2
+ template <typename _Tp, size_t _Np>
+ static constexpr void __set(_SimdWrapper<_Tp, _Np>& __k, int __i,
+ bool __x) noexcept
+ {
+ if constexpr (std::is_same_v<_Tp, bool>)
+ __k.__set(__i, __x);
+ else
+ {
+ using _Ip = __int_for_sizeof_t<_Tp>;
+ auto __ki = __vector_bitcast<_Ip>(__k._M_data);
+ if (__builtin_is_constant_evaluated())
+ {
+ __k = __vector_bitcast<_Tp>(
+ __generate_from_n_evaluations<_Np, decltype(__ki)>([&](auto __j) {
+ if (__i == __j)
+ return _Ip(-__x);
+ else
+ return __ki[+__j];
+ }));
+ }
+ else
+ {
+ __ki[__i] = _Ip(-__x);
+ __k = __vector_bitcast<_Tp>(__ki);
+ }
+ }
+ }
+
+ // __masked_assign{{{2
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_assign(_SimdWrapper<_Tp, _Np> __k, _SimdWrapper<_Tp, _Np>& __lhs,
+ __id<_SimdWrapper<_Tp, _Np>> __rhs)
+ {
+ __lhs = _CommonImpl::_S_blend(__k, __lhs, __rhs);
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_assign(_SimdWrapper<_Tp, _Np> __k, _SimdWrapper<_Tp, _Np>& __lhs,
+ bool __rhs)
+ {
+ if (__builtin_constant_p(__rhs))
+ {
+ if (__rhs == false)
+ {
+ __lhs = __andnot(__k._M_data, __lhs._M_data);
+ }
+ else
+ {
+ __lhs = __or(__k._M_data, __lhs._M_data);
+ }
+ return;
+ }
+ __lhs
+ = _CommonImpl::_S_blend(__k, __lhs, __data(simd_mask<_Tp, _Abi>(__rhs)));
+ }
+
+ //}}}2
+ // __all_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __all_of(simd_mask<_Tp, _Abi> __k)
+ {
+ return __call_with_subscripts(
+ __vector_bitcast<__int_for_sizeof_t<_Tp>>(__data(__k)),
+ make_index_sequence<size<_Tp>>(),
+ [](const auto... __ent) constexpr { return (... && !(__ent == 0)); });
+ }
+
+ // }}}
+ // __any_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __any_of(simd_mask<_Tp, _Abi> __k)
+ {
+ return __call_with_subscripts(
+ __vector_bitcast<__int_for_sizeof_t<_Tp>>(__data(__k)),
+ make_index_sequence<size<_Tp>>(),
+ [](const auto... __ent) constexpr { return (... || !(__ent == 0)); });
+ }
+
+ // }}}
+ // __none_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __none_of(simd_mask<_Tp, _Abi> __k)
+ {
+ return __call_with_subscripts(
+ __vector_bitcast<__int_for_sizeof_t<_Tp>>(__data(__k)),
+ make_index_sequence<size<_Tp>>(),
+ [](const auto... __ent) constexpr { return (... && (__ent == 0)); });
+ }
+
+ // }}}
+ // __some_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __some_of(simd_mask<_Tp, _Abi> __k)
+ {
+ const int __n_true = __popcount(__k);
+ return __n_true > 0 && __n_true < int(size<_Tp>);
+ }
+
+ // }}}
+ // __popcount {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static int __popcount(simd_mask<_Tp, _Abi> __k)
+ {
+ using _I = __int_for_sizeof_t<_Tp>;
+ if constexpr (std::is_default_constructible_v<simd<_I, _Abi>>)
+ return -reduce(
+ simd<_I, _Abi>(__private_init, __wrapper_bitcast<_I>(__data(__k))));
+ else
+ return -reduce(__bit_cast<rebind_simd_t<_I, simd<_Tp, _Abi>>>(
+ simd<_Tp, _Abi>(__private_init, __data(__k))));
+ }
+
+ // }}}
+ // __find_first_set {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static int __find_first_set(simd_mask<_Tp, _Abi> __k)
+ {
+ return _BitOps::__firstbit(_SuperImpl::__to_bits(__data(__k))._M_to_bits());
+ }
+
+ // }}}
+ // __find_last_set {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static int __find_last_set(simd_mask<_Tp, _Abi> __k)
+ {
+ return _BitOps::__lastbit(_SuperImpl::__to_bits(__data(__k))._M_to_bits());
+ }
+
+ // }}}
+};
+
+//}}}1
+_GLIBCXX_SIMD_END_NAMESPACE
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_ABIS_H_
+
+// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
diff --git a/libstdc++-v3/include/experimental/bits/simd_converter.h b/libstdc++-v3/include/experimental/bits/simd_converter.h
new file mode 100644
index 00000000000..256b64023d2
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_converter.h
@@ -0,0 +1,337 @@
+// Generic simd conversions -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_CONVERTER_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_CONVERTER_H_
+
+#if __cplusplus >= 201703L
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+// _SimdConverter scalar -> scalar {{{
+template <typename _From, typename _To>
+struct _SimdConverter<_From, simd_abi::scalar, _To, simd_abi::scalar,
+ std::enable_if_t<!std::is_same_v<_From, _To>>>
+{
+ _GLIBCXX_SIMD_INTRINSIC constexpr _To operator()(_From __a) const noexcept
+ {
+ return static_cast<_To>(__a);
+ }
+};
+
+// }}}
+// _SimdConverter "native" -> scalar {{{
+template <typename _From, typename _To, typename _Abi>
+struct _SimdConverter<_From, _Abi, _To, simd_abi::scalar,
+ std::enable_if_t<!std::is_same_v<_Abi, simd_abi::scalar>>>
+{
+ using _Arg = typename _Abi::template __traits<_From>::_SimdMember;
+ static constexpr size_t _S_n = _Arg::_S_width;
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr std::array<_To, _S_n>
+ __all(_Arg __a) const noexcept
+ {
+ return __call_with_subscripts(
+ __a, make_index_sequence<_S_n>(),
+ [&](auto... __values) constexpr -> std::array<_To, _S_n> {
+ return {static_cast<_To>(__values)...};
+ });
+ }
+};
+
+// }}}
+// _SimdConverter scalar -> "native" {{{
+template <typename _From, typename _To, typename _Abi>
+struct _SimdConverter<_From, simd_abi::scalar, _To, _Abi,
+ std::enable_if_t<!std::is_same_v<_Abi, simd_abi::scalar>>>
+{
+ using _Ret = typename _Abi::template __traits<_To>::_SimdMember;
+
+ template <typename... _More>
+ _GLIBCXX_SIMD_INTRINSIC constexpr _Ret
+ operator()(_From __a, _More... __more) const noexcept
+ {
+ static_assert(sizeof...(_More) + 1 == _Abi::template size<_To>);
+ static_assert(std::conjunction_v<std::is_same<_From, _More>...>);
+ return __make_vector<_To>(__a, __more...);
+ }
+};
+
+// }}}
+// _SimdConverter "native 1" -> "native 2" {{{
+template <typename _From, typename _To, typename _AFrom, typename _ATo>
+struct _SimdConverter<
+ _From, _AFrom, _To, _ATo,
+ std::enable_if_t<!std::disjunction_v<
+ __is_fixed_size_abi<_AFrom>, __is_fixed_size_abi<_ATo>,
+ std::is_same<_AFrom, simd_abi::scalar>,
+ std::is_same<_ATo, simd_abi::scalar>,
+ std::conjunction<std::is_same<_From, _To>, std::is_same<_AFrom, _ATo>>>>>
+{
+ using _Arg = typename _AFrom::template __traits<_From>::_SimdMember;
+ using _Ret = typename _ATo::template __traits<_To>::_SimdMember;
+ using _V = __vector_type_t<_To, simd_size_v<_To, _ATo>>;
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr auto __all(_Arg __a) const noexcept
+ {
+ return __convert_all<_V>(__a);
+ }
+
+ template <typename... _More>
+ _GLIBCXX_SIMD_INTRINSIC constexpr _Ret
+ operator()(_Arg __a, _More... __more) const noexcept
+ {
+ return __convert<_V>(__a, __more...);
+ }
+};
+
+// }}}
+// _SimdConverter scalar -> fixed_size<1> {{{1
+template <typename _From, typename _To>
+struct _SimdConverter<_From, simd_abi::scalar, _To, simd_abi::fixed_size<1>,
+ void>
+{
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple<_To, simd_abi::scalar>
+ operator()(_From __x) const noexcept
+ {
+ return {static_cast<_To>(__x)};
+ }
+};
+
+// _SimdConverter fixed_size<1> -> scalar {{{1
+template <typename _From, typename _To>
+struct _SimdConverter<_From, simd_abi::fixed_size<1>, _To, simd_abi::scalar,
+ void>
+{
+ _GLIBCXX_SIMD_INTRINSIC constexpr _To
+ operator()(_SimdTuple<_From, simd_abi::scalar> __x) const noexcept
+ {
+ return {static_cast<_To>(__x.first)};
+ }
+};
+
+// _SimdConverter fixed_size<_Np> -> fixed_size<_Np> {{{1
+template <typename _From, typename _To, int _Np>
+struct _SimdConverter<_From, simd_abi::fixed_size<_Np>, _To,
+ simd_abi::fixed_size<_Np>,
+ std::enable_if_t<!std::is_same_v<_From, _To>>>
+{
+ using _Ret = __fixed_size_storage_t<_To, _Np>;
+ using _Arg = __fixed_size_storage_t<_From, _Np>;
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr _Ret
+ operator()(const _Arg& __x) const noexcept
+ {
+ if constexpr (std::is_same_v<_From, _To>)
+ return __x;
+
+ // special case (optimize) int signedness casts
+ else if constexpr (sizeof(_From) == sizeof(_To)
+ && std::is_integral_v<_From> && std::is_integral_v<_To>)
+ return __bit_cast<_Ret>(__x);
+
+ // special case if all ABI tags in _Ret are scalar
+ else if constexpr (__is_scalar_abi<typename _Ret::_FirstAbi>())
+ {
+ return __call_with_subscripts(
+ __x, make_index_sequence<_Np>(),
+ [](auto... __values) constexpr -> _Ret {
+ return __make_simd_tuple<_To, decltype((void) __values,
+ simd_abi::scalar())...>(
+ static_cast<_To>(__values)...);
+ });
+ }
+
+ // from one vector to one vector
+ else if constexpr (_Arg::_S_first_size == _Ret::_S_first_size)
+ {
+ _SimdConverter<_From, typename _Arg::_FirstAbi, _To,
+ typename _Ret::_FirstAbi>
+ __native_cvt;
+ if constexpr (_Arg::_S_tuple_size == 1)
+ return {__native_cvt(__x.first)};
+ else
+ {
+ constexpr size_t _NRemain = _Np - _Arg::_S_first_size;
+ _SimdConverter<_From, simd_abi::fixed_size<_NRemain>, _To,
+ simd_abi::fixed_size<_NRemain>>
+ __remainder_cvt;
+ return {__native_cvt(__x.first), __remainder_cvt(__x.second)};
+ }
+ }
+
+ // from one vector to multiple vectors
+ else if constexpr (_Arg::_S_first_size > _Ret::_S_first_size)
+ {
+ const auto __multiple_return_chunks
+ = __convert_all<__vector_type_t<_To, _Ret::_S_first_size>>(__x.first);
+ constexpr auto __converted = __multiple_return_chunks.size()
+ * _Ret::_FirstAbi::template size<_To>;
+ constexpr auto __remaining = _Np - __converted;
+ if constexpr (_Arg::_S_tuple_size == 1 && __remaining == 0)
+ return __to_simd_tuple<_To, _Np>(__multiple_return_chunks);
+ else if constexpr (_Arg::_S_tuple_size == 1)
+ { // e.g. <int, 3> -> <double, 2, 1>
+ // or <short, 7> -> <double, 4, 2, 1>
+ using _RetRem = __remove_cvref_t<decltype(
+ __simd_tuple_pop_front<__multiple_return_chunks.size()>(_Ret()))>;
+ const auto __return_chunks2
+ = __convert_all<__vector_type_t<_To, _RetRem::_S_first_size>, 0,
+ __converted>(__x.first);
+ constexpr auto __converted2
+ = __converted + __return_chunks2.size() * _RetRem::_S_first_size;
+ if constexpr (__converted2 == _Np)
+ return __to_simd_tuple<_To, _Np>(__multiple_return_chunks,
+ __return_chunks2);
+ else
+ {
+ using _RetRem2 = __remove_cvref_t<decltype(
+ __simd_tuple_pop_front<__return_chunks2.size()>(_RetRem()))>;
+ const auto __return_chunks3
+ = __convert_all<__vector_type_t<_To, _RetRem2::_S_first_size>,
+ 0, __converted2>(__x.first);
+ constexpr auto __converted3
+ = __converted2
+ + __return_chunks3.size() * _RetRem2::_S_first_size;
+ if constexpr (__converted3 == _Np)
+ return __to_simd_tuple<_To, _Np>(__multiple_return_chunks,
+ __return_chunks2,
+ __return_chunks3);
+ else
+ {
+ using _RetRem3 = __remove_cvref_t<decltype(
+ __simd_tuple_pop_front<__return_chunks3.size()>(
+ _RetRem2()))>;
+ const auto __return_chunks4 = __convert_all<
+ __vector_type_t<_To, _RetRem3::_S_first_size>, 0,
+ __converted3>(__x.first);
+ constexpr auto __converted4
+ = __converted3
+ + __return_chunks4.size() * _RetRem3::_S_first_size;
+ if constexpr (__converted4 == _Np)
+ return __to_simd_tuple<_To, _Np>(__multiple_return_chunks,
+ __return_chunks2,
+ __return_chunks3,
+ __return_chunks4);
+ else
+ __assert_unreachable<_To>();
+ }
+ }
+ }
+ else
+ {
+ constexpr size_t _NRemain = _Np - _Arg::_S_first_size;
+ _SimdConverter<_From, simd_abi::fixed_size<_NRemain>, _To,
+ simd_abi::fixed_size<_NRemain>>
+ __remainder_cvt;
+ return __simd_tuple_concat(
+ __to_simd_tuple<_To, _Arg::_S_first_size>(
+ __multiple_return_chunks),
+ __remainder_cvt(__x.second));
+ }
+ }
+
+ // from multiple vectors to one vector
+ // _Arg::_S_first_size < _Ret::_S_first_size
+ // a) heterogeneous input at the end of the tuple (possible with partial
+ // native registers in _Ret)
+ else if constexpr (_Ret::_S_tuple_size == 1
+ && _Np % _Arg::_S_first_size != 0)
+ {
+ static_assert(_Ret::_FirstAbi::_S_is_partial);
+ return _Ret{__generate_from_n_evaluations<
+ _Np, typename _VectorTraits<typename _Ret::_FirstType>::type>(
+ [&](auto __i) { return static_cast<_To>(__x[__i]); })};
+ }
+ else
+ {
+ static_assert(_Arg::_S_tuple_size > 1);
+ constexpr auto __n
+ = __div_roundup(_Ret::_S_first_size, _Arg::_S_first_size);
+ return __call_with_n_evaluations<__n>(
+ [&__x](auto... __uncvted) {
+ // assuming _Arg Abi tags for all __i are _Arg::_FirstAbi
+ _SimdConverter<_From, typename _Arg::_FirstAbi, _To,
+ typename _Ret::_FirstAbi>
+ __native_cvt;
+ if constexpr (_Ret::_S_tuple_size == 1)
+ return _Ret{__native_cvt(__uncvted...)};
+ else
+ return _Ret{
+ __native_cvt(__uncvted...),
+ _SimdConverter<
+ _From, simd_abi::fixed_size<_Np - _Ret::_S_first_size>, _To,
+ simd_abi::fixed_size<_Np - _Ret::_S_first_size>>()(
+ __simd_tuple_pop_front<sizeof...(__uncvted)>(__x))};
+ },
+ [&__x](auto __i) { return __get_tuple_at<__i>(__x); });
+ }
+ }
+};
+
+// _SimdConverter "native" -> fixed_size<_Np> {{{1
+// i.e. 1 register to ? registers
+template <typename _From, typename _Ap, typename _To, int _Np>
+struct _SimdConverter<_From, _Ap, _To, simd_abi::fixed_size<_Np>,
+ std::enable_if_t<!__is_fixed_size_abi_v<_Ap>>>
+{
+ static_assert(
+ _Np == simd_size_v<_From, _Ap>,
+ "_SimdConverter to fixed_size only works for equal element counts");
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr __fixed_size_storage_t<_To, _Np>
+ operator()(typename _SimdTraits<_From, _Ap>::_SimdMember __x) const noexcept
+ {
+ _SimdConverter<_From, simd_abi::fixed_size<_Np>, _To,
+ simd_abi::fixed_size<_Np>>
+ __fixed_cvt;
+ return __fixed_cvt(__fixed_size_storage_t<_From, _Np>{__x});
+ }
+};
+
+// _SimdConverter fixed_size<_Np> -> "native" {{{1
+// i.e. ? registers to 1 register
+template <typename _From, int _Np, typename _To, typename _Ap>
+struct _SimdConverter<_From, simd_abi::fixed_size<_Np>, _To, _Ap,
+ std::enable_if_t<!__is_fixed_size_abi_v<_Ap>>>
+{
+ static_assert(
+ _Np == simd_size_v<_To, _Ap>,
+ "_SimdConverter from fixed_size only works for equal element counts");
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr typename _SimdTraits<_To, _Ap>::_SimdMember
+ operator()(__fixed_size_storage_t<_From, _Np> __x) const noexcept
+ {
+ _SimdConverter<_From, simd_abi::fixed_size<_Np>, _To,
+ simd_abi::fixed_size<_Np>>
+ __fixed_cvt;
+ return __fixed_cvt(__x).first;
+ }
+};
+
+// }}}1
+_GLIBCXX_SIMD_END_NAMESPACE
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_CONVERTER_H_
+
+// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
diff --git a/libstdc++-v3/include/experimental/bits/simd_detail.h b/libstdc++-v3/include/experimental/bits/simd_detail.h
new file mode 100644
index 00000000000..c8a40ecc3af
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_detail.h
@@ -0,0 +1,309 @@
+// Internal macros for the simd implementation -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_DETAIL_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_DETAIL_H_
+
+#if __cplusplus >= 201703L
+
+#include <cstddef>
+#include <cstdint>
+
+
+#define _GLIBCXX_SIMD_BEGIN_NAMESPACE \
+ namespace std _GLIBCXX_VISIBILITY(default) \
+ { \
+ _GLIBCXX_BEGIN_NAMESPACE_VERSION \
+ namespace experimental { \
+ inline namespace parallelism_v2 {
+#define _GLIBCXX_SIMD_END_NAMESPACE \
+ } \
+ } \
+ _GLIBCXX_END_NAMESPACE_VERSION \
+ }
+
+// ISA extension detection: defines all the _GLIBCXX_SIMD_HAVE_XXX macros.
+// ARM{{{
+#if defined __ARM_NEON
+#define _GLIBCXX_SIMD_HAVE_NEON 1
+#else
+#define _GLIBCXX_SIMD_HAVE_NEON 0
+#endif
+#if defined __ARM_NEON && (__ARM_ARCH >= 8 || defined __aarch64__)
+#define _GLIBCXX_SIMD_HAVE_NEON_A32 1
+#else
+#define _GLIBCXX_SIMD_HAVE_NEON_A32 0
+#endif
+#if defined __ARM_NEON && defined __aarch64__
+#define _GLIBCXX_SIMD_HAVE_NEON_A64 1
+#else
+#define _GLIBCXX_SIMD_HAVE_NEON_A64 0
+#endif
+//}}}
+// x86{{{
+#ifdef __MMX__
+#define _GLIBCXX_SIMD_HAVE_MMX 1
+#else
+#define _GLIBCXX_SIMD_HAVE_MMX 0
+#endif
+#if defined __SSE__ || defined __x86_64__
+#define _GLIBCXX_SIMD_HAVE_SSE 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSE 0
+#endif
+#if defined __SSE2__ || defined __x86_64__
+#define _GLIBCXX_SIMD_HAVE_SSE2 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSE2 0
+#endif
+#ifdef __SSE3__
+#define _GLIBCXX_SIMD_HAVE_SSE3 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSE3 0
+#endif
+#ifdef __SSSE3__
+#define _GLIBCXX_SIMD_HAVE_SSSE3 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSSE3 0
+#endif
+#ifdef __SSE4_1__
+#define _GLIBCXX_SIMD_HAVE_SSE4_1 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSE4_1 0
+#endif
+#ifdef __SSE4_2__
+#define _GLIBCXX_SIMD_HAVE_SSE4_2 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSE4_2 0
+#endif
+#ifdef __XOP__
+#define _GLIBCXX_SIMD_HAVE_XOP 1
+#else
+#define _GLIBCXX_SIMD_HAVE_XOP 0
+#endif
+#ifdef __AVX__
+#define _GLIBCXX_SIMD_HAVE_AVX 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX 0
+#endif
+#ifdef __AVX2__
+#define _GLIBCXX_SIMD_HAVE_AVX2 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX2 0
+#endif
+#ifdef __BMI__
+#define _GLIBCXX_SIMD_HAVE_BMI1 1
+#else
+#define _GLIBCXX_SIMD_HAVE_BMI1 0
+#endif
+#ifdef __BMI2__
+#define _GLIBCXX_SIMD_HAVE_BMI2 1
+#else
+#define _GLIBCXX_SIMD_HAVE_BMI2 0
+#endif
+#ifdef __LZCNT__
+#define _GLIBCXX_SIMD_HAVE_LZCNT 1
+#else
+#define _GLIBCXX_SIMD_HAVE_LZCNT 0
+#endif
+#ifdef __SSE4A__
+#define _GLIBCXX_SIMD_HAVE_SSE4A 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSE4A 0
+#endif
+#ifdef __FMA__
+#define _GLIBCXX_SIMD_HAVE_FMA 1
+#else
+#define _GLIBCXX_SIMD_HAVE_FMA 0
+#endif
+#ifdef __FMA4__
+#define _GLIBCXX_SIMD_HAVE_FMA4 1
+#else
+#define _GLIBCXX_SIMD_HAVE_FMA4 0
+#endif
+#ifdef __F16C__
+#define _GLIBCXX_SIMD_HAVE_F16C 1
+#else
+#define _GLIBCXX_SIMD_HAVE_F16C 0
+#endif
+#ifdef __POPCNT__
+#define _GLIBCXX_SIMD_HAVE_POPCNT 1
+#else
+#define _GLIBCXX_SIMD_HAVE_POPCNT 0
+#endif
+#ifdef __AVX512F__
+#define _GLIBCXX_SIMD_HAVE_AVX512F 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX512F 0
+#endif
+#ifdef __AVX512DQ__
+#define _GLIBCXX_SIMD_HAVE_AVX512DQ 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX512DQ 0
+#endif
+#ifdef __AVX512VL__
+#define _GLIBCXX_SIMD_HAVE_AVX512VL 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX512VL 0
+#endif
+#ifdef __AVX512BW__
+#define _GLIBCXX_SIMD_HAVE_AVX512BW 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX512BW 0
+#endif
+
+#if _GLIBCXX_SIMD_HAVE_SSE
+#define _GLIBCXX_SIMD_HAVE_SSE_ABI 1
+#else
+#define _GLIBCXX_SIMD_HAVE_SSE_ABI 0
+#endif
+#if _GLIBCXX_SIMD_HAVE_SSE2
+#define _GLIBCXX_SIMD_HAVE_FULL_SSE_ABI 1
+#else
+#define _GLIBCXX_SIMD_HAVE_FULL_SSE_ABI 0
+#endif
+
+#if _GLIBCXX_SIMD_HAVE_AVX
+#define _GLIBCXX_SIMD_HAVE_AVX_ABI 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX_ABI 0
+#endif
+#if _GLIBCXX_SIMD_HAVE_AVX2
+#define _GLIBCXX_SIMD_HAVE_FULL_AVX_ABI 1
+#else
+#define _GLIBCXX_SIMD_HAVE_FULL_AVX_ABI 0
+#endif
+
+#if _GLIBCXX_SIMD_HAVE_AVX512F
+#define _GLIBCXX_SIMD_HAVE_AVX512_ABI 1
+#else
+#define _GLIBCXX_SIMD_HAVE_AVX512_ABI 0
+#endif
+#if _GLIBCXX_SIMD_HAVE_AVX512BW
+#define _GLIBCXX_SIMD_HAVE_FULL_AVX512_ABI 1
+#else
+#define _GLIBCXX_SIMD_HAVE_FULL_AVX512_ABI 0
+#endif
+
+#if defined __x86_64__ && !_GLIBCXX_SIMD_HAVE_SSE2
+#error "Use of SSE2 is required on AMD64"
+#endif
+//}}}
+
+#define _GLIBCXX_SIMD_NORMAL_MATH \
+ [[__gnu__::__optimize__("finite-math-only,no-signed-zeros")]]
+#define _GLIBCXX_SIMD_NEVER_INLINE [[__gnu__::__noinline__]]
+#define _GLIBCXX_SIMD_INTRINSIC \
+ [[__gnu__::__always_inline__, __gnu__::__artificial__]] inline
+#define _GLIBCXX_SIMD_ALWAYS_INLINE [[__gnu__::__always_inline__]] inline
+#define _GLIBCXX_SIMD_IS_UNLIKELY(__x) __builtin_expect(__x, 0)
+#define _GLIBCXX_SIMD_IS_LIKELY(__x) __builtin_expect(__x, 1)
+#if defined __STRICT_ANSI__ && __STRICT_ANSI__
+#define _GLIBCXX_SIMD_CONSTEXPR
+#else
+#define _GLIBCXX_SIMD_CONSTEXPR constexpr
+#endif
+
+#define _GLIBCXX_SIMD_LIST_BINARY(__macro) __macro(|) __macro(&) __macro(^)
+#define _GLIBCXX_SIMD_LIST_SHIFTS(__macro) __macro(<<) __macro(>>)
+#define _GLIBCXX_SIMD_LIST_ARITHMETICS(__macro) \
+ __macro(+) __macro(-) __macro(*) __macro(/) __macro(%)
+
+#define _GLIBCXX_SIMD_ALL_BINARY(__macro) \
+ _GLIBCXX_SIMD_LIST_BINARY(__macro) static_assert(true)
+#define _GLIBCXX_SIMD_ALL_SHIFTS(__macro) \
+ _GLIBCXX_SIMD_LIST_SHIFTS(__macro) static_assert(true)
+#define _GLIBCXX_SIMD_ALL_ARITHMETICS(__macro) \
+ _GLIBCXX_SIMD_LIST_ARITHMETICS(__macro) static_assert(true)
+
+#ifdef _GLIBCXX_SIMD_NO_ALWAYS_INLINE
+#undef _GLIBCXX_SIMD_ALWAYS_INLINE
+#define _GLIBCXX_SIMD_ALWAYS_INLINE inline
+#undef _GLIBCXX_SIMD_INTRINSIC
+#define _GLIBCXX_SIMD_INTRINSIC inline
+#endif
+
+#if _GLIBCXX_SIMD_HAVE_SSE || _GLIBCXX_SIMD_HAVE_MMX
+#define _GLIBCXX_SIMD_X86INTRIN 1
+#else
+#define _GLIBCXX_SIMD_X86INTRIN 0
+#endif
+
+// workaround macros {{{
+// use aliasing loads to help GCC understand the data accesses better
+// This also seems to hide a miscompilation on swap(x[i], x[i + 1]) with
+// fixed_size_simd<float, 16> x.
+#define _GLIBCXX_SIMD_USE_ALIASING_LOADS 1
+
+// vector conversions on x86 not optimized:
+#if _GLIBCXX_SIMD_X86INTRIN
+#define _GLIBCXX_SIMD_WORKAROUND_PR85048 1
+#endif
+
+// Invalid instruction mov from xmm16-31
+#define _GLIBCXX_SIMD_WORKAROUND_PR89229 1
+
+// integer division not optimized
+#define _GLIBCXX_SIMD_WORKAROUND_PR90993 1
+
+// very bad codegen for extraction and concatenation of 128/256 "subregisters"
+// with sizeof(element type) < 8: https://godbolt.org/g/mqUsgM
+#if _GLIBCXX_SIMD_X86INTRIN
+#define _GLIBCXX_SIMD_WORKAROUND_XXX_1 1
+#endif
+
+// bad codegen for 8 Byte memcpy to __vector_type_t<char, 16>
+#define _GLIBCXX_SIMD_WORKAROUND_PR90424 1
+
+// bad codegen for zero-extend using simple concat(__x, 0)
+#if _GLIBCXX_SIMD_X86INTRIN
+#define _GLIBCXX_SIMD_WORKAROUND_XXX_3 1
+#endif
+
+// bad codegen for integer division
+#define _GLIBCXX_SIMD_WORKAROUND_XXX_4 1
+
+// abs pattern may generate MMX instructions without EMMS cleanup (This only
+// happens with SSSE3 because pabs[bwd] is part of SSSE3.)
+#if __GNUC__ < 10 && defined __SSSE3__ && _GLIBCXX_SIMD_X86INTRIN
+#define _GLIBCXX_SIMD_WORKAROUND_PR91533 1
+#endif
+
+#if __GNUC__ < 10 && defined __aarch64__
+#define _GLIBCXX_SIMD_WORKAROUND_XXX_5 1
+#endif
+
+// https://github.com/cplusplus/parallelism-ts/issues/65 (incorrect return type
+// of static_simd_cast)
+#define _GLIBCXX_SIMD_FIX_P2TS_ISSUE65 1
+
+// https://github.com/cplusplus/parallelism-ts/issues/66 (incorrect SFINAE
+// constraint on (static)_simd_cast)
+#define _GLIBCXX_SIMD_FIX_P2TS_ISSUE66 1
+// }}}
+
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_DETAIL_H_
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
new file mode 100644
index 00000000000..2b643f28835
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
@@ -0,0 +1,2102 @@
+// Simd fixed_size ABI specific implementations -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+// <http://www.gnu.org/licenses/>.
+
+/*
+ * The fixed_size ABI gives the following guarantees:
+ * - simd objects are passed via the stack
+ * - memory layout of `simd<_Tp, _Np>` is equivalent to `std::array<_Tp, _Np>`
+ * - alignment of `simd<_Tp, _Np>` is `_Np * sizeof(_Tp)` if _Np is a
+ * power-of-2 value, otherwise `__next_power_of_2(_Np * sizeof(_Tp))` (Note:
+ * if the alignment were to exceed the system/compiler maximum, it is bounded
+ * to that maximum)
+ * - simd_mask objects are passed like std::bitset<_Np>
+ * - memory layout of `simd_mask<_Tp, _Np>` is equivalent to `std::bitset<_Np>`
+ * - alignment of `simd_mask<_Tp, _Np>` is equal to the alignment of
+ * `std::bitset<_Np>`
+ */
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_FIXED_SIZE_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_FIXED_SIZE_H_
+
+#if __cplusplus >= 201703L
+
+#include <array>
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+
+// __simd_tuple_element {{{
+template <size_t _I, typename _Tp> struct __simd_tuple_element;
+template <typename _Tp, typename _A0, typename... _As>
+struct __simd_tuple_element<0, _SimdTuple<_Tp, _A0, _As...>>
+{
+ using type = std::experimental::simd<_Tp, _A0>;
+};
+template <size_t _I, typename _Tp, typename _A0, typename... _As>
+struct __simd_tuple_element<_I, _SimdTuple<_Tp, _A0, _As...>>
+{
+ using type =
+ typename __simd_tuple_element<_I - 1, _SimdTuple<_Tp, _As...>>::type;
+};
+template <size_t _I, typename _Tp>
+using __simd_tuple_element_t = typename __simd_tuple_element<_I, _Tp>::type;
+
+// }}}
+// __simd_tuple_concat {{{
+template <typename _Tp, typename... _A0s, typename... _A1s>
+_GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple<_Tp, _A0s..., _A1s...>
+__simd_tuple_concat(const _SimdTuple<_Tp, _A0s...>& __left,
+ const _SimdTuple<_Tp, _A1s...>& __right)
+{
+ if constexpr (sizeof...(_A0s) == 0)
+ return __right;
+ else if constexpr (sizeof...(_A1s) == 0)
+ return __left;
+ else
+ return {__left.first, __simd_tuple_concat(__left.second, __right)};
+}
+
+template <typename _Tp, typename _A10, typename... _A1s>
+_GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple<_Tp, simd_abi::scalar, _A10,
+ _A1s...>
+__simd_tuple_concat(const _Tp& __left,
+ const _SimdTuple<_Tp, _A10, _A1s...>& __right)
+{
+ return {__left, __right};
+}
+
+// }}}
+// __simd_tuple_pop_front {{{
+template <size_t _Np, typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr decltype(auto)
+__simd_tuple_pop_front(_Tp&& __x)
+{
+ if constexpr (_Np == 0)
+ return static_cast<_Tp&&>(__x);
+ else
+ return __simd_tuple_pop_front<_Np - 1>(__x.second);
+}
+
+// }}}
+// __get_simd_at<_Np> {{{1
+struct __as_simd
+{
+};
+struct __as_simd_tuple
+{
+};
+template <typename _Tp, typename _A0, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr simd<_Tp, _A0>
+__simd_tuple_get_impl(__as_simd, const _SimdTuple<_Tp, _A0, _Abis...>& __t,
+ _SizeConstant<0>)
+{
+ return {__private_init, __t.first};
+}
+template <typename _Tp, typename _A0, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr const auto&
+__simd_tuple_get_impl(__as_simd_tuple,
+ const _SimdTuple<_Tp, _A0, _Abis...>& __t,
+ _SizeConstant<0>)
+{
+ return __t.first;
+}
+template <typename _Tp, typename _A0, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto&
+__simd_tuple_get_impl(__as_simd_tuple, _SimdTuple<_Tp, _A0, _Abis...>& __t,
+ _SizeConstant<0>)
+{
+ return __t.first;
+}
+
+template <typename _R, size_t _Np, typename _Tp, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__simd_tuple_get_impl(_R, const _SimdTuple<_Tp, _Abis...>& __t,
+ _SizeConstant<_Np>)
+{
+ return __simd_tuple_get_impl(_R(), __t.second, _SizeConstant<_Np - 1>());
+}
+template <size_t _Np, typename _Tp, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto&
+__simd_tuple_get_impl(__as_simd_tuple, _SimdTuple<_Tp, _Abis...>& __t,
+ _SizeConstant<_Np>)
+{
+ return __simd_tuple_get_impl(__as_simd_tuple(), __t.second,
+ _SizeConstant<_Np - 1>());
+}
+
+template <size_t _Np, typename _Tp, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__get_simd_at(const _SimdTuple<_Tp, _Abis...>& __t)
+{
+ return __simd_tuple_get_impl(__as_simd(), __t, _SizeConstant<_Np>());
+}
+
+// }}}
+// __get_tuple_at<_Np> {{{
+template <size_t _Np, typename _Tp, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto
+__get_tuple_at(const _SimdTuple<_Tp, _Abis...>& __t)
+{
+ return __simd_tuple_get_impl(__as_simd_tuple(), __t, _SizeConstant<_Np>());
+}
+
+template <size_t _Np, typename _Tp, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC constexpr auto&
+__get_tuple_at(_SimdTuple<_Tp, _Abis...>& __t)
+{
+ return __simd_tuple_get_impl(__as_simd_tuple(), __t, _SizeConstant<_Np>());
+}
+
+// __tuple_element_meta {{{1
+template <typename _Tp, typename _Abi, size_t _Offset>
+struct __tuple_element_meta : public _Abi::_SimdImpl
+{
+ static_assert(is_same_v<typename _Abi::_SimdImpl::abi_type,
+ _Abi>); // this fails e.g. when _SimdImpl is an alias
+ // for _SimdImplBuiltin<_DifferentAbi>
+ using value_type = _Tp;
+ using abi_type = _Abi;
+ using _Traits = _SimdTraits<_Tp, _Abi>;
+ using _MaskImpl = typename _Abi::_MaskImpl;
+ using _MaskMember = typename _Traits::_MaskMember;
+ using simd_type = std::experimental::simd<_Tp, _Abi>;
+ static constexpr size_t _S_offset = _Offset;
+ static constexpr size_t size() { return simd_size<_Tp, _Abi>::value; }
+ static constexpr _MaskImpl _S_mask_impl = {};
+
+ template <size_t _Np, bool _Sanitized>
+ _GLIBCXX_SIMD_INTRINSIC static auto
+ __submask(_BitMask<_Np, _Sanitized> __bits)
+ {
+ return __bits.template _M_extract<_Offset, size()>();
+ }
+
+ template <size_t _Np, bool _Sanitized>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+ __make_mask(_BitMask<_Np, _Sanitized> __bits)
+ {
+ return _MaskImpl::template __convert<_Tp>(
+ __bits.template _M_extract<_Offset, size()>()._M_sanitized());
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC static _ULLong
+ __mask_to_shifted_ullong(_MaskMember __k)
+ {
+ return _MaskImpl::__to_bits(__k).to_ullong() << _Offset;
+ }
+};
+
+template <size_t _Offset, typename _Tp, typename _Abi, typename... _As>
+__tuple_element_meta<_Tp, _Abi, _Offset>
+__make_meta(const _SimdTuple<_Tp, _Abi, _As...>&)
+{
+ return {};
+}
+
+// }}}1
+// _WithOffset wrapper class {{{
+template <size_t _Offset, typename _Base> struct _WithOffset : public _Base
+{
+ static inline constexpr size_t _S_offset = _Offset;
+
+ _GLIBCXX_SIMD_INTRINSIC char* __as_charptr()
+ {
+ return reinterpret_cast<char*>(this)
+ + _S_offset * sizeof(typename _Base::value_type);
+ }
+ _GLIBCXX_SIMD_INTRINSIC const char* __as_charptr() const
+ {
+ return reinterpret_cast<const char*>(this)
+ + _S_offset * sizeof(typename _Base::value_type);
+ }
+};
+
+// make _WithOffset<_WithOffset> ill-formed to use:
+template <size_t _O0, size_t _O1, typename _Base>
+struct _WithOffset<_O0, _WithOffset<_O1, _Base>>
+{
+};
+
+template <size_t _Offset, typename _Tp>
+decltype(auto)
+__add_offset(_Tp& __base)
+{
+ return static_cast<_WithOffset<_Offset, __remove_cvref_t<_Tp>>&>(__base);
+}
+template <size_t _Offset, typename _Tp>
+decltype(auto)
+__add_offset(const _Tp& __base)
+{
+ return static_cast<const _WithOffset<_Offset, __remove_cvref_t<_Tp>>&>(
+ __base);
+}
+template <size_t _Offset, size_t _ExistingOffset, typename _Tp>
+decltype(auto)
+__add_offset(_WithOffset<_ExistingOffset, _Tp>& __base)
+{
+ return static_cast<_WithOffset<_Offset + _ExistingOffset, _Tp>&>(
+ static_cast<_Tp&>(__base));
+}
+template <size_t _Offset, size_t _ExistingOffset, typename _Tp>
+decltype(auto)
+__add_offset(const _WithOffset<_ExistingOffset, _Tp>& __base)
+{
+ return static_cast<const _WithOffset<_Offset + _ExistingOffset, _Tp>&>(
+ static_cast<const _Tp&>(__base));
+}
+
+template <typename _Tp> constexpr inline size_t __offset = 0;
+template <size_t _Offset, typename _Tp>
+constexpr inline size_t
+ __offset<_WithOffset<_Offset, _Tp>> = _WithOffset<_Offset, _Tp>::_S_offset;
+template <typename _Tp>
+constexpr inline size_t __offset<const _Tp> = __offset<_Tp>;
+template <typename _Tp> constexpr inline size_t __offset<_Tp&> = __offset<_Tp>;
+template <typename _Tp> constexpr inline size_t __offset<_Tp&&> = __offset<_Tp>;
+
+// }}}
+// _SimdTuple specializations {{{1
+// empty {{{2
+template <typename _Tp> struct _SimdTuple<_Tp>
+{
+ using value_type = _Tp;
+ static constexpr size_t _S_tuple_size = 0;
+ static constexpr size_t size() { return 0; }
+};
+
+// _SimdTupleData {{{2
+template <typename _FirstType, typename _SecondType> struct _SimdTupleData
+{
+ _FirstType first;
+ _SecondType second;
+
+ _GLIBCXX_SIMD_INTRINSIC
+ constexpr bool _M_is_constprop() const
+ {
+ if constexpr(is_class_v<_FirstType>)
+ return first._M_is_constprop() && second._M_is_constprop();
+ else
+ return __builtin_constant_p(first) && second._M_is_constprop();
+ }
+};
+
+template <typename _FirstType, typename _Tp>
+struct _SimdTupleData<_FirstType, _SimdTuple<_Tp>>
+{
+ _FirstType first;
+ static constexpr _SimdTuple<_Tp> second = {};
+
+ _GLIBCXX_SIMD_INTRINSIC
+ constexpr bool _M_is_constprop() const
+ {
+ if constexpr(is_class_v<_FirstType>)
+ return first._M_is_constprop();
+ else
+ return __builtin_constant_p(first);
+ }
+};
+
+// 1 or more {{{2
+template <typename _Tp, typename _Abi0, typename... _Abis>
+struct _SimdTuple<_Tp, _Abi0, _Abis...>
+ : _SimdTupleData<typename _SimdTraits<_Tp, _Abi0>::_SimdMember,
+ _SimdTuple<_Tp, _Abis...>>
+{
+ static_assert(!__is_fixed_size_abi_v<_Abi0>);
+ using value_type = _Tp;
+ using _FirstType = typename _SimdTraits<_Tp, _Abi0>::_SimdMember;
+ using _FirstAbi = _Abi0;
+ using _SecondType = _SimdTuple<_Tp, _Abis...>;
+ static constexpr size_t _S_tuple_size = sizeof...(_Abis) + 1;
+ static constexpr size_t size()
+ {
+ return simd_size_v<_Tp, _Abi0> + _SecondType::size();
+ }
+ static constexpr size_t _S_first_size = simd_size_v<_Tp, _Abi0>;
+
+ using _Base = _SimdTupleData<typename _SimdTraits<_Tp, _Abi0>::_SimdMember,
+ _SimdTuple<_Tp, _Abis...>>;
+ using _Base::first;
+ using _Base::second;
+
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple() = default;
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple(const _SimdTuple&) = default;
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple& operator=(const _SimdTuple&)
+ = default;
+
+ template <typename _Up>
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple(_Up&& __x)
+ : _Base{static_cast<_Up&&>(__x)}
+ {}
+ template <typename _Up, typename _Up2>
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple(_Up&& __x, _Up2&& __y)
+ : _Base{static_cast<_Up&&>(__x), static_cast<_Up2&&>(__y)}
+ {}
+ template <typename _Up>
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple(_Up&& __x, _SimdTuple<_Tp>)
+ : _Base{static_cast<_Up&&>(__x)}
+ {}
+
+ _GLIBCXX_SIMD_INTRINSIC char* __as_charptr()
+ {
+ return reinterpret_cast<char*>(this);
+ }
+ _GLIBCXX_SIMD_INTRINSIC const char* __as_charptr() const
+ {
+ return reinterpret_cast<const char*>(this);
+ }
+
+ template <size_t _Np> _GLIBCXX_SIMD_INTRINSIC constexpr auto& __at()
+ {
+ if constexpr (_Np == 0)
+ return first;
+ else
+ return second.template __at<_Np - 1>();
+ }
+ template <size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC constexpr const auto& __at() const
+ {
+ if constexpr (_Np == 0)
+ return first;
+ else
+ return second.template __at<_Np - 1>();
+ }
+
+ template <size_t _Np> _GLIBCXX_SIMD_INTRINSIC constexpr auto __simd_at() const
+ {
+ if constexpr (_Np == 0)
+ return simd<_Tp, _Abi0>(__private_init, first);
+ else
+ return second.template __simd_at<_Np - 1>();
+ }
+
+ template <size_t _Offset = 0, typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdTuple
+ __generate(_Fp&& __gen, _SizeConstant<_Offset> = {})
+ {
+ auto&& __first = __gen(__tuple_element_meta<_Tp, _Abi0, _Offset>());
+ if constexpr (_S_tuple_size == 1)
+ return {__first};
+ else
+ return {__first, _SecondType::__generate(
+ static_cast<_Fp&&>(__gen),
+ _SizeConstant<_Offset + simd_size_v<_Tp, _Abi0>>())};
+ }
+
+ template <size_t _Offset = 0, typename _Fp, typename... _More>
+ _GLIBCXX_SIMD_INTRINSIC _SimdTuple
+ __apply_wrapped(_Fp&& __fun, const _More&... __more) const
+ {
+ auto&& __first = __fun(__make_meta<_Offset>(*this), first, __more.first...);
+ if constexpr (_S_tuple_size == 1)
+ return {__first};
+ else
+ return {
+ __first,
+ second.template __apply_wrapped<_Offset + simd_size_v<_Tp, _Abi0>>(
+ static_cast<_Fp&&>(__fun), __more.second...)};
+ }
+
+ template <size_t _Size, size_t _Offset = 0,
+ typename _R = __fixed_size_storage_t<_Tp, _Size>>
+ _GLIBCXX_SIMD_INTRINSIC constexpr _R __extract_tuple_with_size() const
+ {
+ if constexpr (_Size == _S_first_size && _Offset == 0)
+ return {first};
+ else if constexpr (_Size > _S_first_size && _Offset == 0
+ && _S_tuple_size > 1)
+ return {
+ first,
+ second.template __extract_tuple_with_size<_Size - _S_first_size>()};
+ else if constexpr (_Size == 1)
+ return {operator[](_SizeConstant<_Offset>())};
+ else if constexpr (_R::_S_tuple_size == 1)
+ {
+ static_assert(_Offset % _Size == 0);
+ static_assert(_S_first_size % _Size == 0);
+ return {typename _R::_FirstType(
+ __private_init,
+ __extract_part<_Offset / _Size, _S_first_size / _Size>(first))};
+ }
+ else
+ __assert_unreachable<_SizeConstant<_Size>>();
+ }
+
+ template <typename _Tup>
+ _GLIBCXX_SIMD_INTRINSIC constexpr decltype(auto)
+ __extract_argument(_Tup&& __tup) const
+ {
+ using _TupT = typename __remove_cvref_t<_Tup>::value_type;
+ if constexpr (is_same_v<_SimdTuple, __remove_cvref_t<_Tup>>)
+ return __tup.first;
+ else if (__builtin_is_constant_evaluated())
+ return __fixed_size_storage_t<_TupT, _S_first_size>::__generate([&](
+ auto __meta) constexpr {
+ return __meta.__generator(
+ [&](auto __i) constexpr { return __tup[__i]; },
+ static_cast<_TupT*>(nullptr));
+ });
+ else
+ return [&]() {
+ __fixed_size_storage_t<_TupT, _S_first_size> __r;
+ __builtin_memcpy(__r.__as_charptr(), __tup.__as_charptr(), sizeof(__r));
+ return __r;
+ }();
+ }
+
+ template <typename _Tup>
+ _GLIBCXX_SIMD_INTRINSIC constexpr auto& __skip_argument(_Tup&& __tup) const
+ {
+ static_assert(_S_tuple_size > 1);
+ using _Up = __remove_cvref_t<_Tup>;
+ constexpr size_t __off = __offset<_Up>;
+ if constexpr (_S_first_size == _Up::_S_first_size && __off == 0)
+ return __tup.second;
+ else if constexpr (_S_first_size > _Up::_S_first_size
+ && _S_first_size % _Up::_S_first_size == 0 && __off == 0)
+ return __simd_tuple_pop_front<_S_first_size / _Up::_S_first_size>(__tup);
+ else if constexpr (_S_first_size + __off < _Up::_S_first_size)
+ return __add_offset<_S_first_size>(__tup);
+ else if constexpr (_S_first_size + __off == _Up::_S_first_size)
+ return __tup.second;
+ else
+ __assert_unreachable<_Tup>();
+ }
+
+ template <size_t _Offset, typename... _More>
+ _GLIBCXX_SIMD_INTRINSIC constexpr void
+ __assign_front(const _SimdTuple<_Tp, _Abi0, _More...>& __x) &
+ {
+ static_assert(_Offset == 0);
+ first = __x.first;
+ if constexpr (sizeof...(_More) > 0)
+ {
+ static_assert(sizeof...(_Abis) >= sizeof...(_More));
+ second.template __assign_front<0>(__x.second);
+ }
+ }
+
+ template <size_t _Offset>
+ _GLIBCXX_SIMD_INTRINSIC constexpr void __assign_front(const _FirstType& __x) &
+ {
+ static_assert(_Offset == 0);
+ first = __x;
+ }
+
+ template <size_t _Offset, typename... _As>
+ _GLIBCXX_SIMD_INTRINSIC constexpr void
+ __assign_front(const _SimdTuple<_Tp, _As...>& __x) &
+ {
+ __builtin_memcpy(__as_charptr() + _Offset * sizeof(value_type),
+ __x.__as_charptr(),
+ sizeof(_Tp) * _SimdTuple<_Tp, _As...>::size());
+ }
+
+ /*
+ * Iterate over the first objects in this _SimdTuple and call __fun for each
+ * of them. If additional arguments are passed via __more, chunk them into
+ * _SimdTuple or __vector_type_t objects of the same number of values.
+ */
+ template <typename _Fp, typename... _More>
+ _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple
+ __apply_per_chunk(_Fp&& __fun, _More&&... __more) const
+ {
+ if constexpr ((...
+ || conjunction_v<
+ is_lvalue_reference<_More>,
+ negation<is_const<remove_reference_t<_More>>>>) )
+ {
+ // need to write back at least one of __more after calling __fun
+ auto&& __first = [&](auto... __args) constexpr
+ {
+ auto __r
+ = __fun(__tuple_element_meta<_Tp, _Abi0, 0>(), first, __args...);
+ [[maybe_unused]] auto&& __ignore_me = {(
+ [](auto&& __dst, const auto& __src) {
+ if constexpr (is_assignable_v<decltype(__dst), decltype(__dst)>)
+ {
+ __dst.template __assign_front<__offset<decltype(__dst)>>(
+ __src);
+ }
+ }(static_cast<_More&&>(__more), __args),
+ 0)...};
+ return __r;
+ }
+ (__extract_argument(__more)...);
+ if constexpr (_S_tuple_size == 1)
+ return {__first};
+ else
+ return {__first,
+ second.__apply_per_chunk(static_cast<_Fp&&>(__fun),
+ __skip_argument(__more)...)};
+ }
+ else
+ {
+ auto&& __first = __fun(__tuple_element_meta<_Tp, _Abi0, 0>(), first,
+ __extract_argument(__more)...);
+ if constexpr (_S_tuple_size == 1)
+ return {__first};
+ else
+ return {__first,
+ second.__apply_per_chunk(static_cast<_Fp&&>(__fun),
+ __skip_argument(__more)...)};
+ }
+ }
+
+ template <typename _R = _Tp, typename _Fp, typename... _More>
+ _GLIBCXX_SIMD_INTRINSIC auto __apply_r(_Fp&& __fun,
+ const _More&... __more) const
+ {
+ auto&& __first
+ = __fun(__tuple_element_meta<_Tp, _Abi0, 0>(), first, __more.first...);
+ if constexpr (_S_tuple_size == 1)
+ return __first;
+ else
+ return __simd_tuple_concat<_R>(
+ __first, second.template __apply_r<_R>(static_cast<_Fp&&>(__fun),
+ __more.second...));
+ }
+
+ template <typename _Fp, typename... _More>
+ _GLIBCXX_SIMD_INTRINSIC constexpr friend _SanitizedBitMask<size()>
+ __test(const _Fp& __fun, const _SimdTuple& __x, const _More&... __more)
+ {
+ const _SanitizedBitMask<_S_first_size> __first
+ = _Abi0::_MaskImpl::__to_bits(__fun(__tuple_element_meta<_Tp, _Abi0, 0>(),
+ __x.first, __more.first...));
+ if constexpr (_S_tuple_size == 1)
+ return __first;
+ else
+ return __test(__fun, __x.second, __more.second...)._M_prepend(__first);
+ }
+
+ template <typename _Up, _Up _I>
+ _GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+ operator[](std::integral_constant<_Up, _I>) const noexcept
+ {
+ if constexpr (_I < simd_size_v<_Tp, _Abi0>)
+ return __subscript_read(_I);
+ else
+ return second[std::integral_constant<_Up,
+ _I - simd_size_v<_Tp, _Abi0>>()];
+ }
+
+ _Tp operator[](size_t __i) const noexcept
+ {
+ if constexpr (_S_tuple_size == 1)
+ return __subscript_read(__i);
+ else
+ {
+#ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
+ return reinterpret_cast<const __may_alias<_Tp>*>(this)[__i];
+#else
+ if constexpr (__is_scalar_abi<_Abi0>())
+ {
+ const _Tp* __ptr = &first;
+ return __ptr[__i];
+ }
+ else
+ return __i < simd_size_v<_Tp, _Abi0>
+ ? __subscript_read(__i)
+ : second[__i - simd_size_v<_Tp, _Abi0>];
+#endif
+ }
+ }
+
+ void __set(size_t __i, _Tp __val) noexcept
+ {
+ if constexpr (_S_tuple_size == 1)
+ return __subscript_write(__i, __val);
+ else
+ {
+#ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
+ reinterpret_cast<__may_alias<_Tp>*>(this)[__i] = __val;
+#else
+ if (__i < simd_size_v<_Tp, _Abi0>)
+ __subscript_write(__i, __val);
+ else
+ second.__set(__i - simd_size_v<_Tp, _Abi0>, __val);
+#endif
+ }
+ }
+
+private:
+ // __subscript_read/_write {{{
+ _Tp __subscript_read([[maybe_unused]] size_t __i) const noexcept
+ {
+ if constexpr (__is_vectorizable_v<_FirstType>)
+ return first;
+ else
+ return first[__i];
+ }
+
+ void __subscript_write([[maybe_unused]] size_t __i, _Tp __y) noexcept
+ {
+ if constexpr (__is_vectorizable_v<_FirstType>)
+ first = __y;
+ else
+ first.__set(__i, __y);
+ }
+
+ // }}}
+};
+
+// __make_simd_tuple {{{1
+template <typename _Tp, typename _A0>
+_GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp, _A0>
+__make_simd_tuple(std::experimental::simd<_Tp, _A0> __x0)
+{
+ return {__data(__x0)};
+}
+template <typename _Tp, typename _A0, typename... _As>
+_GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp, _A0, _As...>
+__make_simd_tuple(const std::experimental::simd<_Tp, _A0>& __x0,
+ const std::experimental::simd<_Tp, _As>&... __xs)
+{
+ return {__data(__x0), __make_simd_tuple(__xs...)};
+}
+
+template <typename _Tp, typename _A0>
+_GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp, _A0>
+__make_simd_tuple(const typename _SimdTraits<_Tp, _A0>::_SimdMember& __arg0)
+{
+ return {__arg0};
+}
+
+template <typename _Tp, typename _A0, typename _A1, typename... _Abis>
+_GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp, _A0, _A1, _Abis...>
+__make_simd_tuple(
+ const typename _SimdTraits<_Tp, _A0>::_SimdMember& __arg0,
+ const typename _SimdTraits<_Tp, _A1>::_SimdMember& __arg1,
+ const typename _SimdTraits<_Tp, _Abis>::_SimdMember&... __args)
+{
+ return {__arg0, __make_simd_tuple<_Tp, _A1, _Abis...>(__arg1, __args...)};
+}
+
+// __to_simd_tuple {{{1
+template <typename _Tp, size_t _Np, typename _V, size_t _NV, typename... _VX>
+_GLIBCXX_SIMD_INTRINSIC constexpr __fixed_size_storage_t<_Tp, _Np>
+__to_simd_tuple(const std::array<_V, _NV>& __from, const _VX... __fromX);
+
+template <typename _Tp, size_t _Np,
+ size_t _Offset = 0, // skip this many elements in __from0
+ typename _R = __fixed_size_storage_t<_Tp, _Np>, typename _V0,
+ typename _V0VT = _VectorTraits<_V0>, typename... _VX>
+_GLIBCXX_SIMD_INTRINSIC _R constexpr __to_simd_tuple(const _V0 __from0,
+ const _VX... __fromX)
+{
+ static_assert(std::is_same_v<typename _V0VT::value_type, _Tp>);
+ static_assert(_Offset < _V0VT::_S_width);
+ using _R0 = __vector_type_t<_Tp, _R::_S_first_size>;
+ if constexpr (_R::_S_tuple_size == 1)
+ {
+ if constexpr (_Np == 1)
+ return _R{__from0[_Offset]};
+ else if constexpr (_Offset == 0 && _V0VT::_S_width >= _Np)
+ return _R{__intrin_bitcast<_R0>(__from0)};
+ else if constexpr (_Offset * 2 == _V0VT::_S_width
+ && _V0VT::_S_width / 2 >= _Np)
+ return _R{__intrin_bitcast<_R0>(__extract_part<1, 2>(__from0))};
+ else if constexpr (_Offset * 4 == _V0VT::_S_width
+ && _V0VT::_S_width / 4 >= _Np)
+ return _R{__intrin_bitcast<_R0>(__extract_part<1, 4>(__from0))};
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ {
+ if constexpr (1 == _R::_S_first_size)
+ { // extract one scalar and recurse
+ if constexpr (_Offset + 1 < _V0VT::_S_width)
+ return _R{__from0[_Offset],
+ __to_simd_tuple<_Tp, _Np - 1, _Offset + 1>(__from0,
+ __fromX...)};
+ else
+ return _R{__from0[_Offset],
+ __to_simd_tuple<_Tp, _Np - 1, 0>(__fromX...)};
+ }
+
+ // place __from0 into _R::first and recurse for __fromX -> _R::second
+ else if constexpr (_V0VT::_S_width == _R::_S_first_size && _Offset == 0)
+ return _R{__from0,
+ __to_simd_tuple<_Tp, _Np - _R::_S_first_size>(__fromX...)};
+
+ // place lower part of __from0 into _R::first and recurse with _Offset
+ else if constexpr (_V0VT::_S_width > _R::_S_first_size && _Offset == 0)
+ return _R{__intrin_bitcast<_R0>(__from0),
+ __to_simd_tuple<_Tp, _Np - _R::_S_first_size,
+ _R::_S_first_size>(__from0, __fromX...)};
+
+ // place lower part of second quarter of __from0 into _R::first and
+ // recurse with _Offset
+ else if constexpr (_Offset * 4 == _V0VT::_S_width
+ && _V0VT::_S_width >= 4 * _R::_S_first_size)
+ return _R{__intrin_bitcast<_R0>(__extract_part<1, 4>(__from0)),
+ __to_simd_tuple<_Tp, _Np - _R::_S_first_size,
+ _Offset + _R::_S_first_size>(__from0,
+ __fromX...)};
+
+ // place lower half of high half of __from0 into _R::first and recurse
+ // with _Offset
+ else if constexpr (_Offset * 2 == _V0VT::_S_width
+ && _V0VT::_S_width >= 4 * _R::_S_first_size)
+ return _R{__intrin_bitcast<_R0>(__extract_part<2, 4>(__from0)),
+ __to_simd_tuple<_Tp, _Np - _R::_S_first_size,
+ _Offset + _R::_S_first_size>(__from0,
+ __fromX...)};
+
+ // place high half of __from0 into _R::first and recurse with __fromX
+ else if constexpr (_Offset * 2 == _V0VT::_S_width
+ && _V0VT::_S_width / 2 >= _R::_S_first_size)
+ return _R{__intrin_bitcast<_R0>(__extract_part<1, 2>(__from0)),
+ __to_simd_tuple<_Tp, _Np - _R::_S_first_size, 0>(__fromX...)};
+
+ // ill-formed if some unforeseen pattern is needed
+ else
+ __assert_unreachable<_Tp>();
+ }
+}
+
+template <typename _Tp, size_t _Np, typename _V, size_t _NV, typename... _VX>
+_GLIBCXX_SIMD_INTRINSIC constexpr __fixed_size_storage_t<_Tp, _Np>
+__to_simd_tuple(const std::array<_V, _NV>& __from, const _VX... __fromX)
+{
+ if constexpr (std::is_same_v<_Tp, _V>)
+ {
+ static_assert(
+ sizeof...(_VX) == 0,
+ "An array of scalars must be the last argument to __to_simd_tuple");
+ return __call_with_subscripts(
+ __from,
+ std::make_index_sequence<_NV>(), [&](const auto... __args) constexpr {
+ return __simd_tuple_concat(
+ _SimdTuple<_Tp, simd_abi::scalar>{__args}..., _SimdTuple<_Tp>());
+ });
+ }
+ else
+ return __call_with_subscripts(
+ __from,
+ std::make_index_sequence<_NV>(), [&](const auto... __args) constexpr {
+ return __to_simd_tuple<_Tp, _Np>(__args..., __fromX...);
+ });
+}
+
+template <size_t, typename _Tp> using __to_tuple_helper = _Tp;
+template <typename _Tp, typename _A0, size_t _NOut, size_t _Np,
+ size_t... _Indexes>
+_GLIBCXX_SIMD_INTRINSIC __fixed_size_storage_t<_Tp, _NOut>
+__to_simd_tuple_impl(
+ std::index_sequence<_Indexes...>,
+ const std::array<__vector_type_t<_Tp, simd_size_v<_Tp, _A0>>, _Np>& __args)
+{
+ return __make_simd_tuple<_Tp, __to_tuple_helper<_Indexes, _A0>...>(
+ __args[_Indexes]...);
+}
+
+template <typename _Tp, typename _A0, size_t _NOut, size_t _Np,
+ typename _R = __fixed_size_storage_t<_Tp, _NOut>>
+_GLIBCXX_SIMD_INTRINSIC _R
+__to_simd_tuple_sized(
+ const std::array<__vector_type_t<_Tp, simd_size_v<_Tp, _A0>>, _Np>& __args)
+{
+ static_assert(_Np * simd_size_v<_Tp, _A0> >= _NOut);
+ return __to_simd_tuple_impl<_Tp, _A0, _NOut>(
+ std::make_index_sequence<_R::_S_tuple_size>(), __args);
+}
+
+template <typename _Tp, typename _A0, size_t _Np>
+[[deprecated]] _GLIBCXX_SIMD_INTRINSIC auto
+__to_simd_tuple(
+ const std::array<__vector_type_t<_Tp, simd_size_v<_Tp, _A0>>, _Np>& __args)
+{
+ return __to_simd_tuple<_Tp, _Np * simd_size_v<_Tp, _A0>>(__args);
+}
+
+// __optimize_simd_tuple {{{1
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp>
+__optimize_simd_tuple(const _SimdTuple<_Tp>)
+{
+ return {};
+}
+
+template <typename _Tp, typename _Ap>
+_GLIBCXX_SIMD_INTRINSIC const _SimdTuple<_Tp, _Ap>&
+__optimize_simd_tuple(const _SimdTuple<_Tp, _Ap>& __x)
+{
+ return __x;
+}
+
+template <typename _Tp, typename _A0, typename _A1, typename... _Abis,
+ typename _R = __fixed_size_storage_t<
+ _Tp, _SimdTuple<_Tp, _A0, _A1, _Abis...>::size()>>
+_GLIBCXX_SIMD_INTRINSIC _R
+__optimize_simd_tuple(const _SimdTuple<_Tp, _A0, _A1, _Abis...>& __x)
+{
+ using _Tup = _SimdTuple<_Tp, _A0, _A1, _Abis...>;
+ if constexpr (std::is_same_v<_R, _Tup>)
+ return __x;
+ else if constexpr (is_same_v<typename _R::_FirstType,
+ typename _Tup::_FirstType>)
+ return {__x.first, __optimize_simd_tuple(__x.second)};
+ else if constexpr (__is_scalar_abi<_A0>()) // implies all entries are scalar
+ return {
+ __generate_from_n_evaluations<_R::_S_first_size, typename _R::_FirstType>(
+ [&](auto __i) { return __x[__i]; }),
+ __optimize_simd_tuple(__simd_tuple_pop_front<_R::_S_first_size>(__x))};
+ else if constexpr (_R::_S_first_size
+ == simd_size_v<_Tp, _A0> + simd_size_v<_Tp, _A1>
+ && is_same_v<_A0, _A1>)
+ return {__concat(__x.template __at<0>(), __x.template __at<1>()),
+ __optimize_simd_tuple(__x.second.second)};
+ else if constexpr (
+ sizeof...(_Abis) >= 2
+ && _R::_S_first_size == 4 * simd_size_v<_Tp, _A0>
+ && simd_size_v<_Tp, _A0>
+ == __simd_tuple_element_t<(sizeof...(_Abis) >= 2 ? 3 : 0), _Tup>::size())
+ return {__concat(__concat(__x.template __at<0>(), __x.template __at<1>()),
+ __concat(__x.template __at<2>(), __x.template __at<3>())),
+ __optimize_simd_tuple(__x.second.second.second.second)};
+ else
+ {
+ _R __r;
+ __builtin_memcpy(__r.__as_charptr(), __x.__as_charptr(),
+ sizeof(_Tp) * _R::size());
+ return __r;
+ }
+}
+
+// __for_each(const _SimdTuple &, Fun) {{{1
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(const _SimdTuple<_Tp, _A0>& __t, _Fp&& __fun)
+{
+ static_cast<_Fp&&>(__fun)(__make_meta<_Offset>(__t), __t.first);
+}
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _A1,
+ typename... _As, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(const _SimdTuple<_Tp, _A0, _A1, _As...>& __t, _Fp&& __fun)
+{
+ __fun(__make_meta<_Offset>(__t), __t.first);
+ __for_each<_Offset + simd_size<_Tp, _A0>::value>(__t.second,
+ static_cast<_Fp&&>(__fun));
+}
+
+// __for_each(_SimdTuple &, Fun) {{{1
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(_SimdTuple<_Tp, _A0>& __t, _Fp&& __fun)
+{
+ static_cast<_Fp&&>(__fun)(__make_meta<_Offset>(__t), __t.first);
+}
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _A1,
+ typename... _As, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(_SimdTuple<_Tp, _A0, _A1, _As...>& __t, _Fp&& __fun)
+{
+ __fun(__make_meta<_Offset>(__t), __t.first);
+ __for_each<_Offset + simd_size<_Tp, _A0>::value>(__t.second,
+ static_cast<_Fp&&>(__fun));
+}
+
+// __for_each(_SimdTuple &, const _SimdTuple &, Fun) {{{1
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(_SimdTuple<_Tp, _A0>& __a, const _SimdTuple<_Tp, _A0>& __b,
+ _Fp&& __fun)
+{
+ static_cast<_Fp&&>(__fun)(__make_meta<_Offset>(__a), __a.first, __b.first);
+}
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _A1,
+ typename... _As, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(_SimdTuple<_Tp, _A0, _A1, _As...>& __a,
+ const _SimdTuple<_Tp, _A0, _A1, _As...>& __b, _Fp&& __fun)
+{
+ __fun(__make_meta<_Offset>(__a), __a.first, __b.first);
+ __for_each<_Offset + simd_size<_Tp, _A0>::value>(__a.second, __b.second,
+ static_cast<_Fp&&>(__fun));
+}
+
+// __for_each(const _SimdTuple &, const _SimdTuple &, Fun) {{{1
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(const _SimdTuple<_Tp, _A0>& __a, const _SimdTuple<_Tp, _A0>& __b,
+ _Fp&& __fun)
+{
+ static_cast<_Fp&&>(__fun)(__make_meta<_Offset>(__a), __a.first, __b.first);
+}
+template <size_t _Offset = 0, typename _Tp, typename _A0, typename _A1,
+ typename... _As, typename _Fp>
+_GLIBCXX_SIMD_INTRINSIC constexpr void
+__for_each(const _SimdTuple<_Tp, _A0, _A1, _As...>& __a,
+ const _SimdTuple<_Tp, _A0, _A1, _As...>& __b, _Fp&& __fun)
+{
+ __fun(__make_meta<_Offset>(__a), __a.first, __b.first);
+ __for_each<_Offset + simd_size<_Tp, _A0>::value>(__a.second, __b.second,
+ static_cast<_Fp&&>(__fun));
+}
+
+// }}}1
+// __extract_part(_SimdTuple) {{{
+template <int _Index, int _Total, int _Combine, typename _Tp, typename _A0,
+ typename... _As>
+_GLIBCXX_SIMD_INTRINSIC auto // __vector_type_t or _SimdTuple
+__extract_part(const _SimdTuple<_Tp, _A0, _As...>& __x)
+{
+ // worst cases:
+ // (a) 4, 4, 4 => 3, 3, 3, 3 (_Total = 4)
+ // (b) 2, 2, 2 => 3, 3 (_Total = 2)
+ // (c) 4, 2 => 2, 2, 2 (_Total = 3)
+ using _Tuple = _SimdTuple<_Tp, _A0, _As...>;
+ static_assert(_Index + _Combine <= _Total && _Index >= 0 && _Total >= 1);
+ constexpr size_t _Np = _Tuple::size();
+ static_assert(_Np >= _Total && _Np % _Total == 0);
+ constexpr size_t __values_per_part = _Np / _Total;
+ [[maybe_unused]] constexpr size_t __values_to_skip
+ = _Index * __values_per_part;
+ constexpr size_t __return_size = __values_per_part * _Combine;
+ using _RetAbi = simd_abi::deduce_t<_Tp, __return_size>;
+
+ // handle (optimize) the simple cases
+ if constexpr (_Index == 0 && _Tuple::_S_first_size == __return_size)
+ return __x.first._M_data;
+ else if constexpr (_Index == 0 && _Total == _Combine)
+ return __x;
+ else if constexpr (_Index == 0 && _Tuple::_S_first_size >= __return_size)
+ return __intrin_bitcast<__vector_type_t<_Tp, __return_size>>(
+ __as_vector(__x.first));
+
+ // recurse to skip unused data members at the beginning of _SimdTuple
+ else if constexpr (__values_to_skip >= _Tuple::_S_first_size)
+ { // recurse
+ if constexpr (_Tuple::_S_first_size % __values_per_part == 0)
+ {
+ constexpr int __parts_in_first
+ = _Tuple::_S_first_size / __values_per_part;
+ return __extract_part<_Index - __parts_in_first,
+ _Total - __parts_in_first, _Combine>(
+ __x.second);
+ }
+ else
+ return __extract_part<__values_to_skip - _Tuple::_S_first_size,
+ _Np - _Tuple::_S_first_size, __return_size>(
+ __x.second);
+ }
+
+ // extract from multiple _SimdTuple data members
+ else if constexpr (__return_size > _Tuple::_S_first_size - __values_to_skip)
+ {
+#ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
+ const __may_alias<_Tp>* const __element_ptr
+ = reinterpret_cast<const __may_alias<_Tp>*>(&__x) + __values_to_skip;
+ return __as_vector(simd<_Tp, _RetAbi>(__element_ptr, element_aligned));
+#else
+ [[maybe_unused]] constexpr size_t __offset = __values_to_skip;
+ return __as_vector(simd<_Tp, _RetAbi>([&](auto __i) constexpr {
+ constexpr _SizeConstant<__i + __offset> __k;
+ return __x[__k];
+ }));
+#endif
+ }
+
+ // all of the return values are in __x.first
+ else if constexpr (_Tuple::_S_first_size % __values_per_part == 0)
+ return __extract_part<_Index, _Tuple::_S_first_size / __values_per_part,
+ _Combine>(__x.first);
+ else
+ return __extract_part<__values_to_skip, _Tuple::_S_first_size,
+ _Combine * __values_per_part>(__x.first);
+}
+
+// }}}
+// __fixed_size_storage_t<_Tp, _Np>{{{
+template <typename _Tp, int _Np, typename _Tuple,
+ typename _Next = simd<_Tp, _AllNativeAbis::_BestAbi<_Tp, _Np>>,
+ int _Remain = _Np - int(_Next::size())>
+struct __fixed_size_storage_builder;
+
+template <typename _Tp, int _Np>
+struct __fixed_size_storage
+ : public __fixed_size_storage_builder<_Tp, _Np, _SimdTuple<_Tp>>
+{
+};
+
+template <typename _Tp, int _Np, typename... _As, typename _Next>
+struct __fixed_size_storage_builder<_Tp, _Np, _SimdTuple<_Tp, _As...>, _Next, 0>
+{
+ using type = _SimdTuple<_Tp, _As..., typename _Next::abi_type>;
+};
+
+template <typename _Tp, int _Np, typename... _As, typename _Next, int _Remain>
+struct __fixed_size_storage_builder<_Tp, _Np, _SimdTuple<_Tp, _As...>, _Next,
+ _Remain>
+{
+ using type = typename __fixed_size_storage_builder<
+ _Tp, _Remain, _SimdTuple<_Tp, _As..., typename _Next::abi_type>>::type;
+};
+
+// }}}
+// _AbisInSimdTuple {{{
+template <typename _Tp> struct _SeqOp;
+template <size_t _I0, size_t... _Is>
+struct _SeqOp<std::index_sequence<_I0, _Is...>>
+{
+ using _FirstPlusOne = std::index_sequence<_I0 + 1, _Is...>;
+ using _NotFirstPlusOne = std::index_sequence<_I0, (_Is + 1)...>;
+ template <size_t _First, size_t _Add>
+ using _Prepend = std::index_sequence<_First, _I0 + _Add, (_Is + _Add)...>;
+};
+
+template <typename _Tp> struct _AbisInSimdTuple;
+template <typename _Tp> struct _AbisInSimdTuple<_SimdTuple<_Tp>>
+{
+ using _Counts = std::index_sequence<0>;
+ using _Begins = std::index_sequence<0>;
+};
+template <typename _Tp, typename _Ap>
+struct _AbisInSimdTuple<_SimdTuple<_Tp, _Ap>>
+{
+ using _Counts = std::index_sequence<1>;
+ using _Begins = std::index_sequence<0>;
+};
+template <typename _Tp, typename _A0, typename... _As>
+struct _AbisInSimdTuple<_SimdTuple<_Tp, _A0, _A0, _As...>>
+{
+ using _Counts = typename _SeqOp<typename _AbisInSimdTuple<
+ _SimdTuple<_Tp, _A0, _As...>>::_Counts>::_FirstPlusOne;
+ using _Begins = typename _SeqOp<typename _AbisInSimdTuple<
+ _SimdTuple<_Tp, _A0, _As...>>::_Begins>::_NotFirstPlusOne;
+};
+template <typename _Tp, typename _A0, typename _A1, typename... _As>
+struct _AbisInSimdTuple<_SimdTuple<_Tp, _A0, _A1, _As...>>
+{
+ using _Counts = typename _SeqOp<typename _AbisInSimdTuple<
+ _SimdTuple<_Tp, _A1, _As...>>::_Counts>::template _Prepend<1, 0>;
+ using _Begins = typename _SeqOp<typename _AbisInSimdTuple<
+ _SimdTuple<_Tp, _A1, _As...>>::_Begins>::template _Prepend<0, 1>;
+};
+
+// }}}
+// __autocvt_to_simd {{{
+template <typename _Tp, bool = std::is_arithmetic_v<__remove_cvref_t<_Tp>>>
+struct __autocvt_to_simd
+{
+ _Tp _M_data;
+ using _TT = __remove_cvref_t<_Tp>;
+ operator _TT() { return _M_data; }
+ operator _TT&()
+ {
+ static_assert(std::is_lvalue_reference<_Tp>::value, "");
+ static_assert(!std::is_const<_Tp>::value, "");
+ return _M_data;
+ }
+ operator _TT*()
+ {
+ static_assert(std::is_lvalue_reference<_Tp>::value, "");
+ static_assert(!std::is_const<_Tp>::value, "");
+ return &_M_data;
+ }
+
+ constexpr inline __autocvt_to_simd(_Tp __dd) : _M_data(__dd) {}
+
+ template <typename _Abi> operator simd<typename _TT::value_type, _Abi>()
+ {
+ return {__private_init, _M_data};
+ }
+
+ template <typename _Abi> operator simd<typename _TT::value_type, _Abi> &()
+ {
+ return *reinterpret_cast<simd<typename _TT::value_type, _Abi>*>(&_M_data);
+ }
+
+ template <typename _Abi> operator simd<typename _TT::value_type, _Abi> *()
+ {
+ return reinterpret_cast<simd<typename _TT::value_type, _Abi>*>(&_M_data);
+ }
+};
+template <typename _Tp> __autocvt_to_simd(_Tp &&) -> __autocvt_to_simd<_Tp>;
+
+template <typename _Tp> struct __autocvt_to_simd<_Tp, true>
+{
+ using _TT = __remove_cvref_t<_Tp>;
+ _Tp _M_data;
+ fixed_size_simd<_TT, 1> _M_fd;
+
+ constexpr inline __autocvt_to_simd(_Tp __dd) : _M_data(__dd), _M_fd(_M_data) {}
+ ~__autocvt_to_simd() { _M_data = __data(_M_fd).first; }
+
+ operator fixed_size_simd<_TT, 1>() { return _M_fd; }
+ operator fixed_size_simd<_TT, 1> &()
+ {
+ static_assert(std::is_lvalue_reference<_Tp>::value, "");
+ static_assert(!std::is_const<_Tp>::value, "");
+ return _M_fd;
+ }
+ operator fixed_size_simd<_TT, 1> *()
+ {
+ static_assert(std::is_lvalue_reference<_Tp>::value, "");
+ static_assert(!std::is_const<_Tp>::value, "");
+ return &_M_fd;
+ }
+};
+
+// }}}
+
+struct _CommonImplFixedSize;
+template <int _Np> struct _SimdImplFixedSize;
+template <int _Np> struct _MaskImplFixedSize;
+// simd_abi::_Fixed {{{
+template <int _Np> struct simd_abi::_Fixed
+{
+ template <typename _Tp> static constexpr size_t size = _Np;
+ template <typename _Tp> static constexpr size_t _S_full_size = _Np;
+ // validity traits {{{
+ struct _IsValidAbiTag : public __bool_constant<(_Np > 0)>
+ {
+ };
+ template <typename _Tp>
+ struct _IsValidSizeFor
+ : __bool_constant<(_Np <= simd_abi::max_fixed_size<_Tp>)>
+ {
+ };
+ template <typename _Tp>
+ struct _IsValid
+ : conjunction<_IsValidAbiTag, __is_vectorizable<_Tp>, _IsValidSizeFor<_Tp>>
+ {
+ };
+ template <typename _Tp>
+ static constexpr bool _S_is_valid_v = _IsValid<_Tp>::value;
+
+ // }}}
+ // __masked {{{
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SanitizedBitMask<_Np>
+ __masked(_BitMask<_Np> __x)
+ {
+ return __x._M_sanitized();
+ }
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SanitizedBitMask<_Np>
+ __masked(_SanitizedBitMask<_Np> __x)
+ {
+ return __x;
+ }
+
+ // }}}
+ // _*Impl {{{
+ using _CommonImpl = _CommonImplFixedSize;
+ using _SimdImpl = _SimdImplFixedSize<_Np>;
+ using _MaskImpl = _MaskImplFixedSize<_Np>;
+
+ // }}}
+ // __traits {{{
+ template <typename _Tp, bool = _S_is_valid_v<_Tp>>
+ struct __traits : _InvalidTraits
+ {
+ };
+
+ template <typename _Tp> struct __traits<_Tp, true>
+ {
+ using _IsValid = true_type;
+ using _SimdImpl = _SimdImplFixedSize<_Np>;
+ using _MaskImpl = _MaskImplFixedSize<_Np>;
+
+ // simd and simd_mask member types {{{
+ using _SimdMember = __fixed_size_storage_t<_Tp, _Np>;
+ using _MaskMember = _SanitizedBitMask<_Np>;
+ static constexpr size_t _S_simd_align
+ = __next_power_of_2(_Np * sizeof(_Tp));
+ static constexpr size_t _S_mask_align = alignof(_MaskMember);
+
+ // }}}
+ // _SimdBase / base class for simd, providing extra conversions {{{
+ struct _SimdBase
+ {
+ // The following ensures function arguments are passed via the stack.
+ // This is important for ABI compatibility across TU boundaries.
+ _SimdBase(const _SimdBase&) {}
+ _SimdBase() = default;
+
+ explicit operator const _SimdMember &() const
+ {
+ return static_cast<const simd<_Tp, _Fixed>*>(this)->_M_data;
+ }
+ explicit operator std::array<_Tp, _Np>() const
+ {
+ std::array<_Tp, _Np> __r;
+ // _SimdMember can be larger because of higher alignment
+ static_assert(sizeof(__r) <= sizeof(_SimdMember), "");
+ __builtin_memcpy(__r.data(), &static_cast<const _SimdMember&>(*this),
+ sizeof(__r));
+ return __r;
+ }
+ };
+
+ // }}}
+ // _MaskBase {{{
+ // empty. The std::bitset interface suffices
+ struct _MaskBase
+ {
+ };
+
+ // }}}
+ // _SimdCastType {{{
+ struct _SimdCastType
+ {
+ _SimdCastType(const std::array<_Tp, _Np>&);
+ _SimdCastType(const _SimdMember& dd) : _M_data(dd) {}
+ explicit operator const _SimdMember &() const { return _M_data; }
+
+ private:
+ const _SimdMember& _M_data;
+ };
+
+ // }}}
+ // _MaskCastType {{{
+ class _MaskCastType
+ {
+ _MaskCastType() = delete;
+ };
+ // }}}
+ };
+ // }}}
+};
+
+// }}}
+// _CommonImplFixedSize {{{
+struct _CommonImplFixedSize
+{
+ // __store {{{
+ template <typename _Flags, typename _Tp, typename... _As>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __store(const _SimdTuple<_Tp, _As...>& __x, void* __addr, _Flags)
+ {
+ constexpr size_t _Np = _SimdTuple<_Tp, _As...>::size();
+ if constexpr (std::is_same_v<_Flags, vector_aligned_tag>)
+ __addr = __builtin_assume_aligned(
+ __addr, memory_alignment_v<fixed_size_simd<_Tp, _Np>, _Tp>);
+ else if constexpr (!std::is_same_v<_Flags, element_aligned_tag>)
+ __addr = __builtin_assume_aligned(__addr, _Flags::_S_alignment);
+ __builtin_memcpy(__addr, &__x, _Np * sizeof(_Tp));
+ }
+
+ // }}}
+};
+
+// }}}
+// _SimdImplFixedSize {{{1
+// fixed_size should not inherit from _SimdMathFallback, so that the
+// specializations in the ABIs of the underlying _SimdTuple get used
+template <int _Np> struct _SimdImplFixedSize
+{
+ // member types {{{2
+ using _MaskMember = _SanitizedBitMask<_Np>;
+ template <typename _Tp> using _SimdMember = __fixed_size_storage_t<_Tp, _Np>;
+ template <typename _Tp>
+ static constexpr std::size_t _S_tuple_size = _SimdMember<_Tp>::_S_tuple_size;
+ template <typename _Tp>
+ using _Simd = std::experimental::simd<_Tp, simd_abi::fixed_size<_Np>>;
+ template <typename _Tp> using _TypeTag = _Tp*;
+
+ // broadcast {{{2
+ template <typename _Tp>
+ static constexpr inline _SimdMember<_Tp> __broadcast(_Tp __x) noexcept
+ {
+ return _SimdMember<_Tp>::__generate([&](auto __meta) constexpr {
+ return __meta.__broadcast(__x);
+ });
+ }
+
+ // __generator {{{2
+ template <typename _Fp, typename _Tp>
+ static constexpr inline _SimdMember<_Tp> __generator(_Fp&& __gen,
+ _TypeTag<_Tp>)
+ {
+ return _SimdMember<_Tp>::__generate([&__gen](auto __meta) constexpr {
+ return __meta.__generator(
+ [&](auto __i) constexpr {
+ return __i < _Np ? __gen(_SizeConstant<__meta._S_offset + __i>()) : 0;
+ },
+ _TypeTag<_Tp>());
+ });
+ }
+
+ // __load {{{2
+ template <typename _Tp, typename _Up, typename _Fp>
+ static inline _SimdMember<_Tp> __load(const _Up* __mem, _Fp __f,
+ _TypeTag<_Tp>) noexcept
+ {
+ return _SimdMember<_Tp>::__generate([&](auto __meta) {
+ return __meta.__load(&__mem[__meta._S_offset], __f, _TypeTag<_Tp>());
+ });
+ }
+
+ // __masked_load {{{2
+ template <typename _Tp, typename... _As, typename _Up, typename _Fp>
+ static inline _SimdTuple<_Tp, _As...>
+ __masked_load(const _SimdTuple<_Tp, _As...>& __old, const _MaskMember __bits,
+ const _Up* __mem, _Fp __f) noexcept
+ {
+ auto __merge = __old;
+ __for_each(__merge, [&](auto __meta, auto& __native) {
+ if (__meta.__submask(__bits).any())
+#pragma GCC diagnostic push
+ // __mem + __meta._S_offset could be UB ([expr.add]/4.3), but it punts
+ // the responsibility for avoiding UB to the caller of the masked load
+ // via the mask. Consequently, the compiler may assume this branch is
+ // unreachable if the pointer arithmetic is UB.
+#pragma GCC diagnostic ignored "-Warray-bounds"
+ __native = __meta.__masked_load(__native, __meta.__make_mask(__bits),
+ __mem + __meta._S_offset, __f);
+#pragma GCC diagnostic pop
+ });
+ return __merge;
+ }
+
+ // __store {{{2
+ template <typename _Tp, typename _Up, typename _Fp>
+ static inline void __store(const _SimdMember<_Tp>& __v, _Up* __mem, _Fp __f,
+ _TypeTag<_Tp>) noexcept
+ {
+ __for_each(__v, [&](auto __meta, auto __native) {
+ __meta.__store(__native, &__mem[__meta._S_offset], __f, _TypeTag<_Tp>());
+ });
+ }
+
+ // __masked_store {{{2
+ template <typename _Tp, typename... _As, typename _Up, typename _Fp>
+ static inline void __masked_store(const _SimdTuple<_Tp, _As...>& __v,
+ _Up* __mem, _Fp __f,
+ const _MaskMember __bits) noexcept
+ {
+ __for_each(__v, [&](auto __meta, auto __native) {
+ if (__meta.__submask(__bits).any())
+#pragma GCC diagnostic push
+ // __mem + __meta._S_offset could be UB ([expr.add]/4.3), but it punts
+ // the responsibility for avoiding UB to the caller of the masked store
+ // via the mask. Consequently, the compiler may assume this branch is
+ // unreachable if the pointer arithmetic is UB.
+#pragma GCC diagnostic ignored "-Warray-bounds"
+ __meta.__masked_store(__native, __mem + __meta._S_offset, __f,
+ __meta.__make_mask(__bits));
+#pragma GCC diagnostic pop
+ });
+ }
+
+ // negation {{{2
+ template <typename _Tp, typename... _As>
+ static inline _MaskMember
+ __negate(const _SimdTuple<_Tp, _As...>& __x) noexcept
+ {
+ _MaskMember __bits = 0;
+ __for_each(
+ __x, [&__bits](auto __meta, auto __native) constexpr {
+ __bits |= __meta.__mask_to_shifted_ullong(__meta.__negate(__native));
+ });
+ return __bits;
+ }
+
+ // reductions {{{2
+ template <typename _Tp, typename _BinaryOperation>
+ static constexpr inline _Tp __reduce(const _Simd<_Tp>& __x,
+ const _BinaryOperation& __binary_op)
+ {
+ using _Tup = _SimdMember<_Tp>;
+ const _Tup& __tup = __data(__x);
+ if constexpr (_Tup::_S_tuple_size == 1)
+ return _Tup::_FirstAbi::_SimdImpl::__reduce(__tup.template __simd_at<0>(),
+ __binary_op);
+ else if constexpr (_Tup::_S_tuple_size == 2
+ && _Tup::size() > 2
+ && _Tup::_SecondType::size() == 1)
+ {
+ return __binary_op(simd<_Tp, simd_abi::scalar>(
+ reduce(__tup.template __simd_at<0>(),
+ __binary_op)),
+ __tup.template __simd_at<1>())[0];
+ }
+ else if constexpr (_Tup::_S_tuple_size == 2
+ && _Tup::size() > 4
+ && _Tup::_SecondType::size() == 2)
+ {
+ return __binary_op(
+ simd<_Tp, simd_abi::scalar>(
+ reduce(__tup.template __simd_at<0>(), __binary_op)),
+ simd<_Tp, simd_abi::scalar>(
+ reduce(__tup.template __simd_at<1>(), __binary_op)))[0];
+ }
+ else
+ {
+ const auto& __x2
+ = __call_with_n_evaluations<__div_roundup(_Tup::_S_tuple_size, 2)>(
+ [](auto __first_simd, auto... __remaining) {
+ if constexpr (sizeof...(__remaining) == 0)
+ return __first_simd;
+ else
+ {
+ using _Tup2
+ = _SimdTuple<_Tp, typename decltype(__first_simd)::abi_type,
+ typename decltype(__remaining)::abi_type...>;
+ return fixed_size_simd<_Tp, _Tup2::size()>(
+ __private_init,
+ __make_simd_tuple(__first_simd, __remaining...));
+ }
+ },
+ [&](auto __i) {
+ auto __left = __tup.template __simd_at<2 * __i>();
+ if constexpr (2 * __i + 1 == _Tup::_S_tuple_size)
+ return __left;
+ else
+ {
+ auto __right = __tup.template __simd_at<2 * __i + 1>();
+ using _LT = decltype(__left);
+ using _RT = decltype(__right);
+ if constexpr (_LT::size() == _RT::size())
+ return __binary_op(__left, __right);
+ else
+ {
+ _GLIBCXX_SIMD_CONSTEXPR typename _LT::mask_type __k(
+ __private_init, [](auto __j) constexpr {
+ return __j < _RT::size();
+ });
+ _LT __ext_right = __left;
+ where(__k, __ext_right)
+ = __proposed::resizing_simd_cast<_LT>(__right);
+ where(__k, __left) = __binary_op(__left, __ext_right);
+ return __left;
+ }
+ }
+ });
+ return reduce(__x2, __binary_op);
+ }
+ }
+
+ // __min, __max {{{2
+ template <typename _Tp, typename... _As>
+ static inline constexpr _SimdTuple<_Tp, _As...>
+ __min(const _SimdTuple<_Tp, _As...>& __a, const _SimdTuple<_Tp, _As...>& __b)
+ {
+ return __a.__apply_per_chunk(
+ [](auto __impl, auto __aa, auto __bb) constexpr {
+ return __impl.__min(__aa, __bb);
+ },
+ __b);
+ }
+
+ template <typename _Tp, typename... _As>
+ static inline constexpr _SimdTuple<_Tp, _As...>
+ __max(const _SimdTuple<_Tp, _As...>& __a, const _SimdTuple<_Tp, _As...>& __b)
+ {
+ return __a.__apply_per_chunk(
+ [](auto __impl, auto __aa, auto __bb) constexpr {
+ return __impl.__max(__aa, __bb);
+ },
+ __b);
+ }
+
+ // __complement {{{2
+ template <typename _Tp, typename... _As>
+ static inline constexpr _SimdTuple<_Tp, _As...>
+ __complement(const _SimdTuple<_Tp, _As...>& __x) noexcept
+ {
+ return __x.__apply_per_chunk([](auto __impl, auto __xx) constexpr {
+ return __impl.__complement(__xx);
+ });
+ }
+
+ // __unary_minus {{{2
+ template <typename _Tp, typename... _As>
+ static inline constexpr _SimdTuple<_Tp, _As...>
+ __unary_minus(const _SimdTuple<_Tp, _As...>& __x) noexcept
+ {
+ return __x.__apply_per_chunk([](auto __impl, auto __xx) constexpr {
+ return __impl.__unary_minus(__xx);
+ });
+ }
+
+ // arithmetic operators {{{2
+
+#define _GLIBCXX_SIMD_FIXED_OP(name_, op_) \
+ template <typename _Tp, typename... _As> \
+ static inline constexpr _SimdTuple<_Tp, _As...> name_( \
+ const _SimdTuple<_Tp, _As...> __x, const _SimdTuple<_Tp, _As...> __y) \
+ { \
+ return __x.__apply_per_chunk( \
+ [](auto __impl, auto __xx, auto __yy) constexpr { \
+ return __impl.name_(__xx, __yy); \
+ }, \
+ __y); \
+ }
+
+ _GLIBCXX_SIMD_FIXED_OP(__plus, +)
+ _GLIBCXX_SIMD_FIXED_OP(__minus, -)
+ _GLIBCXX_SIMD_FIXED_OP(__multiplies, *)
+ _GLIBCXX_SIMD_FIXED_OP(__divides, /)
+ _GLIBCXX_SIMD_FIXED_OP(__modulus, %)
+ _GLIBCXX_SIMD_FIXED_OP(__bit_and, &)
+ _GLIBCXX_SIMD_FIXED_OP(__bit_or, |)
+ _GLIBCXX_SIMD_FIXED_OP(__bit_xor, ^)
+ _GLIBCXX_SIMD_FIXED_OP(__bit_shift_left, <<)
+ _GLIBCXX_SIMD_FIXED_OP(__bit_shift_right, >>)
+#undef _GLIBCXX_SIMD_FIXED_OP
+
+ template <typename _Tp, typename... _As>
+ static inline constexpr _SimdTuple<_Tp, _As...>
+ __bit_shift_left(const _SimdTuple<_Tp, _As...>& __x, int __y)
+ {
+ return __x.__apply_per_chunk([__y](auto __impl, auto __xx) constexpr {
+ return __impl.__bit_shift_left(__xx, __y);
+ });
+ }
+
+ template <typename _Tp, typename... _As>
+ static inline constexpr _SimdTuple<_Tp, _As...>
+ __bit_shift_right(const _SimdTuple<_Tp, _As...>& __x, int __y)
+ {
+ return __x.__apply_per_chunk([__y](auto __impl, auto __xx) constexpr {
+ return __impl.__bit_shift_right(__xx, __y);
+ });
+ }
+
+ // math {{{2
+#define _GLIBCXX_SIMD_APPLY_ON_TUPLE(_RetTp, __name) \
+ template <typename _Tp, typename... _As, typename... _More> \
+ static inline __fixed_size_storage_t<_RetTp, \
+ _SimdTuple<_Tp, _As...>::size()> \
+ __##__name(const _SimdTuple<_Tp, _As...>& __x, const _More&... __more) \
+ { \
+ if constexpr (sizeof...(_More) == 0) \
+ { \
+ if constexpr (is_same_v<_Tp, _RetTp>) \
+ return __x.__apply_per_chunk([](auto __impl, auto __xx) constexpr { \
+ using _V = typename decltype(__impl)::simd_type; \
+ return __data(__name(_V(__private_init, __xx))); \
+ }); \
+ else \
+ return __optimize_simd_tuple(__x.template __apply_r<_RetTp>( \
+ [](auto __impl, auto __xx) { return __impl.__##__name(__xx); })); \
+ } \
+ else if constexpr ( \
+ is_same_v< \
+ _Tp, \
+ _RetTp> && (... && std::is_same_v<_SimdTuple<_Tp, _As...>, _More>) ) \
+ return __x.__apply_per_chunk( \
+ [](auto __impl, auto __xx, auto... __pack) constexpr { \
+ using _V = typename decltype(__impl)::simd_type; \
+ return __data( \
+ __name(_V(__private_init, __xx), _V(__private_init, __pack)...)); \
+ }, \
+ __more...); \
+ else if constexpr (is_same_v<_Tp, _RetTp>) \
+ return __x.__apply_per_chunk( \
+ [](auto __impl, auto __xx, auto... __pack) constexpr { \
+ using _V = typename decltype(__impl)::simd_type; \
+ return __data( \
+ __name(_V(__private_init, __xx), __autocvt_to_simd(__pack)...)); \
+ }, \
+ __more...); \
+ else \
+ __assert_unreachable<_Tp>(); \
+ }
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, acos)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, asin)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, atan)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, atan2)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, cos)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, sin)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, tan)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, acosh)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, asinh)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, atanh)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, cosh)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, sinh)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, tanh)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, exp)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, exp2)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, expm1)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(int, ilogb)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, log)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, log10)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, log1p)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, log2)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, logb)
+ // modf implemented in simd_math.h
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, scalbn) // double scalbn(double x, int exp);
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, scalbln)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, cbrt)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, abs)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fabs)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, pow)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, sqrt)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, erf)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, erfc)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, lgamma)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, tgamma)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, trunc)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, ceil)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, floor)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, nearbyint)
+
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, rint)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(long, lrint)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(long long, llrint)
+
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, round)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(long, lround)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(long long, llround)
+
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, ldexp)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fmod)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, remainder)
+ // copysign in simd_math.h
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, nextafter)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fdim)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fmax)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fmin)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fma)
+ _GLIBCXX_SIMD_APPLY_ON_TUPLE(int, fpclassify)
+#undef _GLIBCXX_SIMD_APPLY_ON_TUPLE
+
+ template <typename _Tp, typename... _Abis>
+ static _SimdTuple<_Tp, _Abis...>
+ __remquo(const _SimdTuple<_Tp, _Abis...>& __x,
+ const _SimdTuple<_Tp, _Abis...>& __y,
+ __fixed_size_storage_t<int, _SimdTuple<_Tp, _Abis...>::size()>* __z)
+ {
+ return __x.__apply_per_chunk(
+ [](auto __impl, const auto __xx, const auto __yy, auto& __zz) {
+ return __impl.__remquo(__xx, __yy, &__zz);
+ },
+ __y, *__z);
+ }
+
+ template <typename _Tp, typename... _As>
+ static inline _SimdTuple<_Tp, _As...>
+ __frexp(const _SimdTuple<_Tp, _As...>& __x,
+ __fixed_size_storage_t<int, _Np>& __exp) noexcept
+ {
+ return __x.__apply_per_chunk(
+ [](auto __impl, const auto& __a, auto& __b) {
+ return __data(
+ frexp(typename decltype(__impl)::simd_type(__private_init, __a),
+ __autocvt_to_simd(__b)));
+ },
+ __exp);
+ }
+
+ template <typename _Tp, typename... _As>
+ static inline __fixed_size_storage_t<int, _Np>
+ __fpclassify(const _SimdTuple<_Tp, _As...>& __x) noexcept
+ {
+ return __optimize_simd_tuple(__x.template __apply_r<int>(
+ [](auto __impl, auto __xx) { return __impl.__fpclassify(__xx); }));
+ }
+
+#define _GLIBCXX_SIMD_TEST_ON_TUPLE_(name_) \
+ template <typename _Tp, typename... _As> \
+ static inline _MaskMember __##name_( \
+ const _SimdTuple<_Tp, _As...>& __x) noexcept \
+ { \
+ return __test([](auto __impl, \
+ auto __xx) { return __impl.__##name_(__xx); }, \
+ __x); \
+ }
+ _GLIBCXX_SIMD_TEST_ON_TUPLE_(isinf)
+ _GLIBCXX_SIMD_TEST_ON_TUPLE_(isfinite)
+ _GLIBCXX_SIMD_TEST_ON_TUPLE_(isnan)
+ _GLIBCXX_SIMD_TEST_ON_TUPLE_(isnormal)
+ _GLIBCXX_SIMD_TEST_ON_TUPLE_(signbit)
+#undef _GLIBCXX_SIMD_TEST_ON_TUPLE_
+
+ // __increment & __decrement {{{2
+ template <typename... _Ts>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr void
+ __increment(_SimdTuple<_Ts...>& __x)
+ {
+ __for_each(
+ __x,
+ [](auto __meta, auto& native) constexpr { __meta.__increment(native); });
+ }
+
+ template <typename... _Ts>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr void
+ __decrement(_SimdTuple<_Ts...>& __x)
+ {
+ __for_each(
+ __x,
+ [](auto __meta, auto& native) constexpr { __meta.__decrement(native); });
+ }
+
+ // compares {{{2
+#define _GLIBCXX_SIMD_CMP_OPERATIONS(__cmp) \
+ template <typename _Tp, typename... _As> \
+ _GLIBCXX_SIMD_INTRINSIC constexpr static _MaskMember __cmp( \
+ const _SimdTuple<_Tp, _As...>& __x, const _SimdTuple<_Tp, _As...>& __y) \
+ { \
+ return __test( \
+ [](auto __impl, auto __xx, auto __yy) constexpr { \
+ return __impl.__cmp(__xx, __yy); \
+ }, \
+ __x, __y); \
+ }
+ _GLIBCXX_SIMD_CMP_OPERATIONS(__equal_to)
+ _GLIBCXX_SIMD_CMP_OPERATIONS(__not_equal_to)
+ _GLIBCXX_SIMD_CMP_OPERATIONS(__less)
+ _GLIBCXX_SIMD_CMP_OPERATIONS(__less_equal)
+ _GLIBCXX_SIMD_CMP_OPERATIONS(__isless)
+ _GLIBCXX_SIMD_CMP_OPERATIONS(__islessequal)
+ _GLIBCXX_SIMD_CMP_OPERATIONS(__isgreater)
+ _GLIBCXX_SIMD_CMP_OPERATIONS(__isgreaterequal)
+ _GLIBCXX_SIMD_CMP_OPERATIONS(__islessgreater)
+ _GLIBCXX_SIMD_CMP_OPERATIONS(__isunordered)
+#undef _GLIBCXX_SIMD_CMP_OPERATIONS
+
+ // smart_reference access {{{2
+ template <typename _Tp, typename... _As, typename _Up>
+ _GLIBCXX_SIMD_INTRINSIC static void __set(_SimdTuple<_Tp, _As...>& __v,
+ int __i, _Up&& __x) noexcept
+ {
+ __v.__set(__i, static_cast<_Up&&>(__x));
+ }
+
+ // __masked_assign {{{2
+ template <typename _Tp, typename... _As>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_assign(const _MaskMember __bits, _SimdTuple<_Tp, _As...>& __lhs,
+ const __id<_SimdTuple<_Tp, _As...>>& __rhs)
+ {
+ __for_each(
+ __lhs,
+ __rhs, [&](auto __meta, auto& __native_lhs, auto __native_rhs) constexpr {
+ __meta.__masked_assign(__meta.__make_mask(__bits), __native_lhs,
+ __native_rhs);
+ });
+ }
+
+ // Optimization for the case where the RHS is a scalar. No need to broadcast
+ // the scalar to a simd first.
+ template <typename _Tp, typename... _As>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_assign(const _MaskMember __bits, _SimdTuple<_Tp, _As...>& __lhs,
+ const __id<_Tp> __rhs)
+ {
+ __for_each(
+ __lhs, [&](auto __meta, auto& __native_lhs) constexpr {
+ __meta.__masked_assign(__meta.__make_mask(__bits), __native_lhs, __rhs);
+ });
+ }
+
+ // __masked_cassign {{{2
+ template <typename _Op, typename _Tp, typename... _As>
+ static inline void
+ __masked_cassign(const _MaskMember __bits, _SimdTuple<_Tp, _As...>& __lhs,
+ const _SimdTuple<_Tp, _As...>& __rhs, _Op __op)
+ {
+ __for_each(
+ __lhs,
+ __rhs, [&](auto __meta, auto& __native_lhs, auto __native_rhs) constexpr {
+ __meta.template __masked_cassign(__meta.__make_mask(__bits),
+ __native_lhs, __native_rhs, __op);
+ });
+ }
+
+ // Optimization for the case where the RHS is a scalar. No need to broadcast
+ // the scalar to a simd first.
+ template <typename _Op, typename _Tp, typename... _As>
+ static inline void __masked_cassign(const _MaskMember __bits,
+ _SimdTuple<_Tp, _As...>& __lhs,
+ const _Tp& __rhs, _Op __op)
+ {
+ __for_each(
+ __lhs, [&](auto __meta, auto& __native_lhs) constexpr {
+ __meta.template __masked_cassign(__meta.__make_mask(__bits),
+ __native_lhs, __rhs, __op);
+ });
+ }
+
+ // __masked_unary {{{2
+ template <template <typename> class _Op, typename _Tp, typename... _As>
+ static inline _SimdTuple<_Tp, _As...>
+ __masked_unary(const _MaskMember __bits,
+ const _SimdTuple<_Tp, _As...> __v) // TODO: const-ref __v?
+ {
+ return __v.__apply_wrapped([&__bits](auto __meta, auto __native) constexpr {
+ return __meta.template __masked_unary<_Op>(__meta.__make_mask(__bits),
+ __native);
+ });
+ }
+
+ // }}}2
+};
+
+// _MaskImplFixedSize {{{1
+template <int _Np> struct _MaskImplFixedSize
+{
+ static_assert(sizeof(_ULLong) * CHAR_BIT >= _Np,
+ "The fixed_size implementation relies on one "
+ "_ULLong being able to store all boolean "
+ "elements."); // required in load & store
+
+ // member types {{{
+ using _Abi = simd_abi::fixed_size<_Np>;
+ template <typename _Tp>
+ using _FirstAbi = typename __fixed_size_storage_t<_Tp, _Np>::_FirstAbi;
+ using _MaskMember = _SanitizedBitMask<_Np>;
+ template <typename _Tp> using _TypeTag = _Tp*;
+
+ // }}}
+ // __broadcast {{{
+ template <typename>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember __broadcast(bool __x)
+ {
+ return __x ? ~_MaskMember() : _MaskMember();
+ }
+
+ // }}}
+ // __load {{{
+ template <typename, typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember __load(const bool* __mem)
+ {
+ using _Up = make_unsigned_t<__int_for_sizeof_t<bool>>;
+ const simd<_Up, _Abi> __bools(reinterpret_cast<const __may_alias<_Up>*>(
+ __mem),
+ _Fp());
+ return __data(__bools != 0);
+ }
+
+ // }}}
+ // __to_bits {{{
+ template <bool _Sanitized>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SanitizedBitMask<_Np>
+ __to_bits(_BitMask<_Np, _Sanitized> __x)
+ {
+ if constexpr (_Sanitized)
+ return __x;
+ else
+ return __x._M_sanitized();
+ }
+
+ // }}}
+ // __convert {{{
+ template <typename _Tp, typename _Up, typename _UAbi>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember
+ __convert(simd_mask<_Up, _UAbi> __x)
+ {
+ return _UAbi::_MaskImpl::__to_bits(__data(__x))
+ .template _M_extract<0, _Np>();
+ }
+
+ // }}}
+ // __from_bitmask {{{2
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+ __from_bitmask(_MaskMember __bits, _TypeTag<_Tp>) noexcept
+ {
+ return __bits;
+ }
+
+ // __load {{{2
+ template <typename _Fp>
+ static inline _MaskMember __load(const bool* __mem, _Fp __f) noexcept
+ {
+ // TODO: _UChar is not necessarily the best type to use here. For smaller
+ // _Np _UShort, _UInt, _ULLong, float, and double can be more efficient.
+ _ULLong __r = 0;
+ using _Vs = __fixed_size_storage_t<_UChar, _Np>;
+ __for_each(_Vs{}, [&](auto __meta, auto) {
+ __r |= __meta.__mask_to_shifted_ullong(
+ __meta._S_mask_impl.__load(&__mem[__meta._S_offset], __f,
+ _SizeConstant<__meta.size()>()));
+ });
+ return __r;
+ }
+
+ // __masked_load {{{2
+ template <typename _Fp>
+ static inline _MaskMember __masked_load(_MaskMember __merge,
+ _MaskMember __mask, const bool* __mem,
+ _Fp) noexcept
+ {
+ _BitOps::__bit_iteration(__mask.to_ullong(),
+ [&](auto __i) { __merge.set(__i, __mem[__i]); });
+ return __merge;
+ }
+
+ // __store {{{2
+ template <typename _Fp>
+ static inline void __store(const _MaskMember __bitmask, bool* __mem,
+ _Fp) noexcept
+ {
+ if constexpr (_Np == 1)
+ __mem[0] = __bitmask[0];
+ else
+ _FirstAbi<_UChar>::_CommonImpl::__store_bool_array(__bitmask, __mem,
+ _Fp());
+ }
+
+ // __masked_store {{{2
+ template <typename _Fp>
+ static inline void __masked_store(const _MaskMember __v, bool* __mem, _Fp,
+ const _MaskMember __k) noexcept
+ {
+ _BitOps::__bit_iteration(__k, [&](auto __i) { __mem[__i] = __v[__i]; });
+ }
+
+ // logical and bitwise operators {{{2
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+ __logical_and(const _MaskMember& __x, const _MaskMember& __y) noexcept
+ {
+ return __x & __y;
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+ __logical_or(const _MaskMember& __x, const _MaskMember& __y) noexcept
+ {
+ return __x | __y;
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember
+ __bit_not(const _MaskMember& __x) noexcept
+ {
+ return ~__x;
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+ __bit_and(const _MaskMember& __x, const _MaskMember& __y) noexcept
+ {
+ return __x & __y;
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+ __bit_or(const _MaskMember& __x, const _MaskMember& __y) noexcept
+ {
+ return __x | __y;
+ }
+
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+ __bit_xor(const _MaskMember& __x, const _MaskMember& __y) noexcept
+ {
+ return __x ^ __y;
+ }
+
+ // smart_reference access {{{2
+ _GLIBCXX_SIMD_INTRINSIC static void __set(_MaskMember& __k, int __i,
+ bool __x) noexcept
+ {
+ __k.set(__i, __x);
+ }
+
+ // __masked_assign {{{2
+ _GLIBCXX_SIMD_INTRINSIC static void __masked_assign(const _MaskMember __k,
+ _MaskMember& __lhs,
+ const _MaskMember __rhs)
+ {
+ __lhs = (__lhs & ~__k) | (__rhs & __k);
+ }
+
+ // Optimization for the case where the RHS is a scalar.
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_assign(const _MaskMember __k, _MaskMember& __lhs, const bool __rhs)
+ {
+ if (__rhs)
+ {
+ __lhs |= __k;
+ }
+ else
+ {
+ __lhs &= ~__k;
+ }
+ }
+
+ // }}}2
+ // __all_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __all_of(simd_mask<_Tp, _Abi> __k)
+ {
+ return __data(__k).all();
+ }
+
+ // }}}
+ // __any_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __any_of(simd_mask<_Tp, _Abi> __k)
+ {
+ return __data(__k).any();
+ }
+
+ // }}}
+ // __none_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __none_of(simd_mask<_Tp, _Abi> __k)
+ {
+ return __data(__k).none();
+ }
+
+ // }}}
+ // __some_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool
+ __some_of([[maybe_unused]] simd_mask<_Tp, _Abi> __k)
+ {
+ if constexpr (_Np == 1)
+ return false;
+ else
+ return __data(__k).any() && !__data(__k).all();
+ }
+
+ // }}}
+ // __popcount {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static int __popcount(simd_mask<_Tp, _Abi> __k)
+ {
+ return __data(__k).count();
+ }
+
+ // }}}
+ // __find_first_set {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static int __find_first_set(simd_mask<_Tp, _Abi> __k)
+ {
+ return _BitOps::__firstbit(__data(__k).to_ullong());
+ }
+
+ // }}}
+ // __find_last_set {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static int __find_last_set(simd_mask<_Tp, _Abi> __k)
+ {
+ return _BitOps::__lastbit(__data(__k).to_ullong());
+ }
+
+ // }}}
+};
+// }}}1
+
+_GLIBCXX_SIMD_END_NAMESPACE
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_FIXED_SIZE_H_
+
+// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h
new file mode 100644
index 00000000000..4185a3bcaa1
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_math.h
@@ -0,0 +1,1451 @@
+// Math overloads for simd -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_MATH_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_MATH_H_
+
+#if __cplusplus >= 201703L
+
+#include <utility>
+#include <iomanip>
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+template <typename _Tp, typename _V>
+using __samesize = fixed_size_simd<_Tp, _V::size()>;
+// __math_return_type {{{
+template <typename _DoubleR, typename _Tp, typename _Abi>
+struct __math_return_type;
+template <typename _DoubleR, typename _Tp, typename _Abi>
+using __math_return_type_t =
+ typename __math_return_type<_DoubleR, _Tp, _Abi>::type;
+
+template <typename _Tp, typename _Abi>
+struct __math_return_type<double, _Tp, _Abi>
+{
+ using type = std::experimental::simd<_Tp, _Abi>;
+};
+template <typename _Tp, typename _Abi>
+struct __math_return_type<bool, _Tp, _Abi>
+{
+ using type = std::experimental::simd_mask<_Tp, _Abi>;
+};
+template <typename _DoubleR, typename _Tp, typename _Abi>
+struct __math_return_type
+{
+ using type
+ = std::experimental::fixed_size_simd<_DoubleR, simd_size_v<_Tp, _Abi>>;
+};
+//}}}
+// _GLIBCXX_SIMD_MATH_CALL_ {{{
+#define _GLIBCXX_SIMD_MATH_CALL_(__name) \
+ template <typename _Tp, typename _Abi, typename..., \
+ typename _R = std::experimental::__math_return_type_t< \
+ decltype(std::__name(std::declval<double>())), _Tp, _Abi>> \
+ enable_if_t<std::is_floating_point_v<_Tp>, _R> __name( \
+ std::experimental::simd<_Tp, _Abi> __x) \
+ { \
+ return {std::experimental::__private_init, \
+ _Abi::_SimdImpl::__##__name(std::experimental::__data(__x))}; \
+ }
+
+// }}}
+// __extra_argument_type {{{
+template <typename _Up, typename _Tp, typename _Abi>
+struct __extra_argument_type;
+
+template <typename _Tp, typename _Abi>
+struct __extra_argument_type<_Tp*, _Tp, _Abi>
+{
+ using type = std::experimental::simd<_Tp, _Abi>*;
+ static constexpr double* declval();
+ _GLIBCXX_SIMD_INTRINSIC static constexpr auto __data(type __x)
+ {
+ return &std::experimental::__data(*__x);
+ }
+ static constexpr bool __needs_temporary_scalar = true;
+};
+template <typename _Up, typename _Tp, typename _Abi>
+struct __extra_argument_type<_Up*, _Tp, _Abi>
+{
+ static_assert(std::is_integral_v<_Up>);
+ using type = std::experimental::fixed_size_simd<
+ _Up, std::experimental::simd_size_v<_Tp, _Abi>>*;
+ static constexpr _Up* declval();
+ _GLIBCXX_SIMD_INTRINSIC static constexpr auto __data(type __x)
+ {
+ return &std::experimental::__data(*__x);
+ }
+ static constexpr bool __needs_temporary_scalar = true;
+};
+template <typename _Tp, typename _Abi>
+struct __extra_argument_type<_Tp, _Tp, _Abi>
+{
+ using type = std::experimental::simd<_Tp, _Abi>;
+ static constexpr double declval();
+ _GLIBCXX_SIMD_INTRINSIC static constexpr decltype(auto)
+ __data(const type& __x)
+ {
+ return std::experimental::__data(__x);
+ }
+ static constexpr bool __needs_temporary_scalar = false;
+};
+template <typename _Up, typename _Tp, typename _Abi>
+struct __extra_argument_type
+{
+ static_assert(std::is_integral_v<_Up>);
+ using type = std::experimental::fixed_size_simd<
+ _Up, std::experimental::simd_size_v<_Tp, _Abi>>;
+ static constexpr _Up declval();
+ _GLIBCXX_SIMD_INTRINSIC static constexpr decltype(auto)
+ __data(const type& __x)
+ {
+ return std::experimental::__data(__x);
+ }
+ static constexpr bool __needs_temporary_scalar = false;
+};
+//}}}
+// _GLIBCXX_SIMD_MATH_CALL2_ {{{
+#define _GLIBCXX_SIMD_MATH_CALL2_(__name, arg2_) \
+ template <typename _Tp, typename _Abi, typename..., \
+ typename _Arg2 \
+ = std::experimental::__extra_argument_type<arg2_, _Tp, _Abi>, \
+ typename _R = std::experimental::__math_return_type_t< \
+ decltype(std::__name(std::declval<double>(), _Arg2::declval())), \
+ _Tp, _Abi>> \
+ enable_if_t<std::is_floating_point_v<_Tp>, _R> __name( \
+ const std::experimental::simd<_Tp, _Abi>& __x, \
+ const typename _Arg2::type& __y) \
+ { \
+ return {std::experimental::__private_init, \
+ _Abi::_SimdImpl::__##__name(std::experimental::__data(__x), \
+ _Arg2::__data(__y))}; \
+ } \
+ template <typename _Up, typename _Tp, typename _Abi> \
+ _GLIBCXX_SIMD_INTRINSIC std::experimental::__math_return_type_t< \
+ decltype(std::__name( \
+ std::declval<double>(), \
+ std::declval<enable_if_t< \
+ std::conjunction_v< \
+ std::is_same<arg2_, _Tp>, \
+ std::negation<std::is_same<__remove_cvref_t<_Up>, \
+ std::experimental::simd<_Tp, _Abi>>>, \
+ std::is_convertible<_Up, std::experimental::simd<_Tp, _Abi>>, \
+ std::is_floating_point<_Tp>>, \
+ double>>())), \
+ _Tp, _Abi> \
+ __name(_Up&& __xx, const std::experimental::simd<_Tp, _Abi>& __yy) \
+ { \
+ return std::experimental::__name(std::experimental::simd<_Tp, _Abi>( \
+ static_cast<_Up&&>(__xx)), \
+ __yy); \
+ }
+
+// }}}
+// _GLIBCXX_SIMD_MATH_CALL3_ {{{
+#define _GLIBCXX_SIMD_MATH_CALL3_(__name, arg2_, arg3_) \
+ template <typename _Tp, typename _Abi, typename..., \
+ typename _Arg2 \
+ = std::experimental::__extra_argument_type<arg2_, _Tp, _Abi>, \
+ typename _Arg3 \
+ = std::experimental::__extra_argument_type<arg3_, _Tp, _Abi>, \
+ typename _R = std::experimental::__math_return_type_t< \
+ decltype(std::__name(std::declval<double>(), _Arg2::declval(), \
+ _Arg3::declval())), \
+ _Tp, _Abi>> \
+ enable_if_t<std::is_floating_point_v<_Tp>, _R> __name( \
+ std::experimental::simd<_Tp, _Abi> __x, typename _Arg2::type __y, \
+ typename _Arg3::type __z) \
+ { \
+ return {std::experimental::__private_init, \
+ _Abi::_SimdImpl::__##__name(std::experimental::__data(__x), \
+ _Arg2::__data(__y), \
+ _Arg3::__data(__z))}; \
+ } \
+ template <typename _Tp, typename _Up, typename _V, typename..., \
+ typename _TT = __remove_cvref_t<_Tp>, \
+ typename _UU = __remove_cvref_t<_Up>, \
+ typename _VV = __remove_cvref_t<_V>, \
+ typename _Simd \
+ = std::conditional_t<std::experimental::is_simd_v<_UU>, _UU, _VV>> \
+ _GLIBCXX_SIMD_INTRINSIC decltype( \
+ std::experimental::__name(_Simd(std::declval<_Tp>()), \
+ _Simd(std::declval<_Up>()), \
+ _Simd(std::declval<_V>()))) \
+ __name(_Tp&& __xx, _Up&& __yy, _V&& __zz) \
+ { \
+ return std::experimental::__name(_Simd(static_cast<_Tp&&>(__xx)), \
+ _Simd(static_cast<_Up&&>(__yy)), \
+ _Simd(static_cast<_V&&>(__zz))); \
+ }
+
+// }}}
+// __cosSeries {{{
+template <typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE static simd<float, _Abi>
+__cosSeries(const simd<float, _Abi>& __x)
+{
+ const simd<float, _Abi> __x2 = __x * __x;
+ simd<float, _Abi> __y;
+ __y = 0x1.ap-16f; // 1/8!
+ __y = __y * __x2 - 0x1.6c1p-10f; // -1/6!
+ __y = __y * __x2 + 0x1.555556p-5f; // 1/4!
+ return __y * (__x2 * __x2) - .5f * __x2 + 1.f;
+}
+template <typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE static simd<double, _Abi>
+__cosSeries(const simd<double, _Abi>& __x)
+{
+ const simd<double, _Abi> __x2 = __x * __x;
+ simd<double, _Abi> __y;
+ __y = 0x1.AC00000000000p-45; // 1/16!
+ __y = __y * __x2 - 0x1.9394000000000p-37; // -1/14!
+ __y = __y * __x2 + 0x1.1EED8C0000000p-29; // 1/12!
+ __y = __y * __x2 - 0x1.27E4FB7400000p-22; // -1/10!
+ __y = __y * __x2 + 0x1.A01A01A018000p-16; // 1/8!
+ __y = __y * __x2 - 0x1.6C16C16C16C00p-10; // -1/6!
+ __y = __y * __x2 + 0x1.5555555555554p-5; // 1/4!
+ return (__y * __x2 - .5f) * __x2 + 1.f;
+}
+
+// }}}
+// __sinSeries {{{
+template <typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE static simd<float, _Abi>
+__sinSeries(const simd<float, _Abi>& __x)
+{
+ const simd<float, _Abi> __x2 = __x * __x;
+ simd<float, _Abi> __y;
+ __y = -0x1.9CC000p-13f; // -1/7!
+ __y = __y * __x2 + 0x1.111100p-7f; // 1/5!
+ __y = __y * __x2 - 0x1.555556p-3f; // -1/3!
+ return __y * (__x2 * __x) + __x;
+}
+
+template <typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE static simd<double, _Abi>
+__sinSeries(const simd<double, _Abi>& __x)
+{
+ // __x in [0, π/4 ≈ 0.7854]
+ // __x² in [0, π²/16 ≈ 0.6169]
+ const simd<double, _Abi> __x2 = __x * __x;
+ simd<double, _Abi> __y;
+ __y = -0x1.ACF0000000000p-41; // -1/15!
+ __y = __y * __x2 + 0x1.6124400000000p-33; // 1/13!
+ __y = __y * __x2 - 0x1.AE64567000000p-26; // -1/11!
+ __y = __y * __x2 + 0x1.71DE3A5540000p-19; // 1/9!
+ __y = __y * __x2 - 0x1.A01A01A01A000p-13; // -1/7!
+ __y = __y * __x2 + 0x1.1111111111110p-7; // 1/5!
+ __y = __y * __x2 - 0x1.5555555555555p-3; // -1/3!
+ return __y * (__x2 * __x) + __x;
+}
+
+// }}}
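The series helpers above evaluate a short odd polynomial in Horner form, using `__x2` as the variable. A minimal scalar sketch of the single-precision sine variant follows. The function name `sin_series_scalar` is an assumption made for illustration and is not part of the patch; the coefficients are copied from `__sinSeries`. It is only meaningful for inputs already folded into [-π/4, π/4].

```cpp
#include <cmath>

// Hypothetical scalar counterpart of the float __sinSeries overload.
// Only valid for |x| <= pi/4, i.e. after the input has been folded.
inline float sin_series_scalar(float x)
{
  const float x2 = x * x;
  float y = -0x1.9CC000p-13f;   // approximately -1/7!
  y = y * x2 + 0x1.111100p-7f;  // approximately  1/5!
  y = y * x2 - 0x1.555556p-3f;  // approximately -1/3!
  // y now holds -1/3! + x^2/5! - x^4/7!; multiply by x^3 and add x
  return y * (x2 * x) + x;
}
```

Comparing `sin_series_scalar(0.5f)` with `std::sin(0.5f)` is a quick way to sanity-check the coefficients.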
+// __zero_low_bits {{{
+template <int _Bits, typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi>
+__zero_low_bits(simd<_Tp, _Abi> __x)
+{
+ const simd<_Tp, _Abi> __bitmask = __bit_cast<_Tp>(
+ ~std::make_unsigned_t<__int_for_sizeof_t<_Tp>>() << _Bits);
+ return {__private_init,
+ _Abi::_SimdImpl::__bit_and(__data(__x), __data(__bitmask))};
+}
+
+// }}}
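For readers unfamiliar with the trick, `__zero_low_bits` clears the least significant bits of the bit pattern so that a later product with a short constant is exact. A scalar sketch for `double` only, under the assumption of IEEE 754 binary64; `zero_low_bits_scalar` is a made-up name, and `std::memcpy` stands in for the patch's `__bit_cast`:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical scalar illustration: clear the lowest Bits bits of the
// binary64 representation of x.
template <int Bits>
inline double zero_low_bits_scalar(double x)
{
  std::uint64_t u;
  std::memcpy(&u, &x, sizeof u);    // read the bit pattern
  u &= ~std::uint64_t{0} << Bits;   // mask with ones above the low Bits bits
  std::memcpy(&x, &u, sizeof x);    // write it back
  return x;
}
```

Splitting a value as `hi = zero_low_bits_scalar<26>(y)` and `lo = y - hi` yields two parts that sum back to `y` exactly, which is how the last branch of the double-precision `__fold_input` uses it.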
+// __fold_input {{{
+
+/**\internal
+ * Fold \p x into [-¼π, ¼π] and remember the quadrant it came from:
+ * quadrant 0: [-¼π, ¼π]
+ * quadrant 1: [ ¼π, ¾π]
+ * quadrant 2: [ ¾π, 1¼π]
+ * quadrant 3: [1¼π, 1¾π]
+ *
+ * The algorithm determines the multiple `y` such that `x - y * ¼π` lies in
+ * [-¼π, ¼π]. Using a bitmask, `y` is reduced to `quadrant`. `y` can be
+ * calculated as
+ * ```
+ * y = trunc(x / ¼π);
+ * y += fmod(y, 2);
+ * ```
+ * This can be simplified by moving the (implicit) division by 2 into the
+ * truncation expression. The `+= fmod` effect can then be achieved by using
+ * rounding instead of truncation: `y = round(x / ½π) * 2`. If precision allows,
+ * `2/π * x` is better (faster).
+ */
+template <typename _Tp, typename _Abi> struct __folded
+{
+ simd<_Tp, _Abi> _M_x;
+ rebind_simd_t<int, simd<_Tp, _Abi>> _M_quadrant;
+};
+
+namespace __math_float {
+inline constexpr float __pi_over_4 = 0x1.921FB6p-1f; // π/4
+inline constexpr float __2_over_pi = 0x1.45F306p-1f; // 2/π
+inline constexpr float __pi_2_5bits0
+ = 0x1.921fc0p0f; // π/2, 5 0-bits (least significant)
+inline constexpr float __pi_2_5bits0_rem
+ = -0x1.5777a6p-21f; // π/2 - __pi_2_5bits0
+} // namespace __math_float
+namespace __math_double {
+inline constexpr double __pi_over_4 = 0x1.921fb54442d18p-1; // π/4
+inline constexpr double __2_over_pi = 0x1.45F306DC9C883p-1; // 2/π
+inline constexpr double __pi_2 = 0x1.921fb54442d18p0; // π/2
+} // namespace __math_double
+
+template <typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE __folded<float, _Abi>
+__fold_input(const simd<float, _Abi>& __x)
+{
+ using _V = simd<float, _Abi>;
+ using _IV = rebind_simd_t<int, _V>;
+ using namespace __math_float;
+ __folded<float, _Abi> __r;
+ __r._M_x = abs(__x);
+#if 0
+ // zero most mantissa bits:
+ constexpr float __1_over_pi = 0x1.45F306p-2f; // 1/π
+ const auto __y = (__r._M_x * __1_over_pi + 0x1.8p23f) - 0x1.8p23f;
+ // split π into 4 parts, the first three with 13 trailing zeros (to make the following
+ // multiplications precise):
+ constexpr float __pi0 = 0x1.920000p1f;
+ constexpr float __pi1 = 0x1.fb4000p-11f;
+ constexpr float __pi2 = 0x1.444000p-23f;
+ constexpr float __pi3 = 0x1.68c234p-38f;
+ __r._M_x - __y*__pi0 - __y*__pi1 - __y*__pi2 - __y*__pi3
+#else
+ if (_GLIBCXX_SIMD_IS_UNLIKELY(all_of(__r._M_x < __pi_over_4)))
+ __r._M_quadrant = 0;
+ else if (_GLIBCXX_SIMD_IS_LIKELY(all_of(__r._M_x < 6 * __pi_over_4)))
+ {
+ const _V __y = nearbyint(__r._M_x * __2_over_pi);
+ __r._M_quadrant = static_simd_cast<_IV>(__y) & 3; // __y mod 4
+ __r._M_x -= __y * __pi_2_5bits0;
+ __r._M_x -= __y * __pi_2_5bits0_rem;
+ }
+ else
+ {
+ using __math_double::__2_over_pi;
+ using __math_double::__pi_2;
+ using _VD = rebind_simd_t<double, _V>;
+ _VD __xd = static_simd_cast<_VD>(__r._M_x);
+ _VD __y = nearbyint(__xd * __2_over_pi);
+ __r._M_quadrant = static_simd_cast<_IV>(__y) & 3; // = __y mod 4
+ __r._M_x = static_simd_cast<_V>(__xd - __y * __pi_2);
+ }
+#endif
+ return __r;
+}
+
+template <typename _Abi>
+_GLIBCXX_SIMD_ALWAYS_INLINE __folded<double, _Abi>
+__fold_input(const simd<double, _Abi>& __x)
+{
+ using _V = simd<double, _Abi>;
+ using _IV = rebind_simd_t<int, _V>;
+ using namespace __math_double;
+
+ __folded<double, _Abi> __r;
+ __r._M_x = abs(__x);
+ if (_GLIBCXX_SIMD_IS_UNLIKELY(all_of(__r._M_x < __pi_over_4)))
+ {
+ __r._M_quadrant = 0;
+ return __r;
+ }
+ const _V __y = nearbyint(__r._M_x / (2 * __pi_over_4));
+ __r._M_quadrant = static_simd_cast<_IV>(__y) & 3;
+
+ if (_GLIBCXX_SIMD_IS_LIKELY(all_of(__r._M_x < 1025 * __pi_over_4)))
+ {
+ // x - y * pi/2, y uses no more than 11 mantissa bits
+ __r._M_x -= __y * 0x1.921FB54443000p0;
+ __r._M_x -= __y * -0x1.73DCB3B39A000p-43;
+ __r._M_x -= __y * 0x1.45C06E0E68948p-86;
+ }
+ else if (_GLIBCXX_SIMD_IS_LIKELY(all_of(__y <= 0x1.0p30)))
+ {
+ // x - y * pi/2, y uses no more than 29 mantissa bits
+ __r._M_x -= __y * 0x1.921FB40000000p0;
+ __r._M_x -= __y * 0x1.4442D00000000p-24;
+ __r._M_x -= __y * 0x1.8469898CC5170p-48;
+ }
+ else
+ {
+ // x - y * pi/2, y may require all mantissa bits
+ const _V __y_hi = __zero_low_bits<26>(__y);
+ const _V __y_lo = __y - __y_hi;
+ const auto __pi_2_1 = 0x1.921FB50000000p0;
+ const auto __pi_2_2 = 0x1.110B460000000p-26;
+ const auto __pi_2_3 = 0x1.1A62630000000p-54;
+ const auto __pi_2_4 = 0x1.8A2E03707344Ap-81;
+ __r._M_x = __r._M_x - __y_hi * __pi_2_1
+ - max(__y_hi * __pi_2_2, __y_lo * __pi_2_1)
+ - min(__y_hi * __pi_2_2, __y_lo * __pi_2_1)
+ - max(__y_hi * __pi_2_3, __y_lo * __pi_2_2)
+ - min(__y_hi * __pi_2_3, __y_lo * __pi_2_2)
+ - max(__y * __pi_2_4, __y_lo * __pi_2_3)
+ - min(__y * __pi_2_4, __y_lo * __pi_2_3);
+ }
+ return __r;
+}
+
+// }}}
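The folding step can be sketched in scalar form as below. This is an illustration under simplifying assumptions, not the patch's code: `Folded`, `fold_scalar`, and `sin_from_folded` are hypothetical names, π/2 is a single double constant (as in the float-to-double fallback branch) rather than the multi-part split, and `std::sin`/`std::cos` stand in for the series helpers.

```cpp
#include <cmath>

// Hypothetical scalar illustration of folding into [-pi/4, pi/4].
struct Folded
{
  double x;      // reduced argument
  int quadrant;  // 0..3
};

inline Folded fold_scalar(double x)
{
  constexpr double two_over_pi = 0x1.45F306DC9C883p-1;  // 2/pi
  constexpr double pi_2 = 0x1.921fb54442d18p0;          // pi/2
  const double ax = std::fabs(x);
  // nearest integer multiple of pi/2 (default rounding mode assumed)
  const double y = std::nearbyint(ax * two_over_pi);
  return {ax - y * pi_2, static_cast<int>(y) & 3};      // y mod 4
}

// Rebuild sin(x) from the folded value using the quadrant table of sin().
inline double sin_from_folded(double x)
{
  const Folded f = fold_scalar(x);
  double r;
  switch (f.quadrant)
    {
    case 0: r = std::sin(f.x); break;
    case 1: r = std::cos(f.x); break;
    case 2: r = -std::sin(f.x); break;
    default: r = -std::cos(f.x); break;
    }
  return std::copysign(1.0, x) * r;  // sin is odd: restore the input's sign
}
```

With a single π/2 constant the reduction loses accuracy for large arguments, which is the reason the double-precision overload above splits π/2 into several parts.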
+// __extract_exponent_bits {{{
+template <typename _Abi>
+rebind_simd_t<int, simd<float, _Abi>>
+__extract_exponent_bits(const simd<float, _Abi>& __v)
+{
+ using namespace std::experimental::__proposed;
+ using namespace std::experimental::__proposed::float_bitwise_operators;
+ _GLIBCXX_SIMD_CONSTEXPR simd<float, _Abi> __exponent_mask
+ = std::numeric_limits<float>::infinity(); // 0x7f800000
+ return __bit_cast<rebind_simd_t<int, simd<float, _Abi>>>(__v
+ & __exponent_mask);
+}
+
+template <typename _Abi>
+rebind_simd_t<int, simd<double, _Abi>>
+__extract_exponent_bits(const simd<double, _Abi>& __v)
+{
+ using namespace std::experimental::_P0918;
+ using namespace std::experimental::__proposed::float_bitwise_operators;
+ const simd<double, _Abi> __exponent_mask
+ = std::numeric_limits<double>::infinity(); // 0x7ff0000000000000
+ constexpr auto _Np = simd_size_v<double, _Abi> * 2;
+ constexpr auto _Max = simd_abi::max_fixed_size<int>;
+ if constexpr (_Np > _Max)
+ {
+ const auto __tup
+ = split<_Max / 2, (_Np - _Max) / 2>(__v & __exponent_mask);
+ return concat(
+ shuffle<strided<2, 1>>(
+ __bit_cast<simd<int, simd_abi::deduce_t<int, _Max>>>(
+ std::get<0>(__tup))),
+ shuffle<strided<2, 1>>(
+ __bit_cast<simd<int, simd_abi::deduce_t<int, _Np - _Max>>>(
+ std::get<1>(__tup))));
+ }
+ else
+ return shuffle<strided<2, 1>>(
+ __bit_cast<simd<int, simd_abi::deduce_t<int, _Np>>>(__v
+ & __exponent_mask));
+}
+
+// }}}
+// __impl_or_fallback {{{
+template <typename ImplFun, typename FallbackFun, typename... _Args>
+_GLIBCXX_SIMD_INTRINSIC auto
+__impl_or_fallback_dispatch(int, ImplFun&& __impl_fun, FallbackFun&&,
+ _Args&&... __args)
+ -> decltype(__impl_fun(static_cast<_Args&&>(__args)...))
+{
+ return __impl_fun(static_cast<_Args&&>(__args)...);
+}
+
+template <typename ImplFun, typename FallbackFun, typename... _Args>
+inline auto
+__impl_or_fallback_dispatch(float, ImplFun&&, FallbackFun&& __fallback_fun,
+ _Args&&... __args)
+ -> decltype(__fallback_fun(static_cast<_Args&&>(__args)...))
+{
+ return __fallback_fun(static_cast<_Args&&>(__args)...);
+}
+
+template <typename... _Args>
+_GLIBCXX_SIMD_INTRINSIC auto
+__impl_or_fallback(_Args&&... __args)
+{
+ return __impl_or_fallback_dispatch(int(), static_cast<_Args&&>(__args)...);
+} //}}}
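The `int`/`float` tag in `__impl_or_fallback_dispatch` is an overload-ranking idiom: both overloads accept the tag, but the `int` one is an exact match and therefore wins whenever its trailing return type is well-formed. A stand-alone sketch with non-reserved, hypothetical names (`dispatch`, `impl_or_fallback`):

```cpp
#include <utility>

// Preferred overload: only viable if impl(args...) is well-formed (SFINAE on
// the trailing return type). The tag 0 matches int exactly.
template <typename Impl, typename Fallback, typename... Args>
auto dispatch(int, Impl&& impl, Fallback&&, Args&&... args)
  -> decltype(impl(std::forward<Args>(args)...))
{
  return impl(std::forward<Args>(args)...);
}

// Fallback overload: converting 0 to float ranks lower, so this is chosen
// only when the overload above drops out of the candidate set.
template <typename Impl, typename Fallback, typename... Args>
auto dispatch(float, Impl&&, Fallback&& fallback, Args&&... args)
  -> decltype(fallback(std::forward<Args>(args)...))
{
  return fallback(std::forward<Args>(args)...);
}

template <typename... Args>
auto impl_or_fallback(Args&&... args)
{
  return dispatch(0, std::forward<Args>(args)...);
}
```

Calling `impl_or_fallback(impl, fallback, arg)` therefore invokes `impl(arg)` if that expression compiles and `fallback(arg)` otherwise.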
+
+// trigonometric functions {{{
+_GLIBCXX_SIMD_MATH_CALL_(acos)
+_GLIBCXX_SIMD_MATH_CALL_(asin)
+_GLIBCXX_SIMD_MATH_CALL_(atan)
+_GLIBCXX_SIMD_MATH_CALL2_(atan2, _Tp)
+
+/*
+ * algorithm for sine and cosine:
+ *
+ * The result can be calculated with sine or cosine depending on the π/4 section
+ * the input is in. sine ≈ __x + __x³ cosine ≈ 1 - __x²
+ *
+ * sine:
+ * Map -__x to __x and invert the output
+ * Extend precision of __x - n * π/4 by calculating
+ * ((__x - n * p1) - n * p2) - n * p3 (p1 + p2 + p3 = π/4)
+ *
+ * Calculate Taylor series with tuned coefficients.
+ * Fix sign.
+ */
+// cos{{{
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+cos(const simd<_Tp, _Abi>& __x)
+{
+ using _V = simd<_Tp, _Abi>;
+ if constexpr (__is_scalar_abi<_Abi>() || __is_fixed_size_abi_v<_Abi>)
+ return {__private_init, _Abi::_SimdImpl::__cos(__data(__x))};
+ else
+ {
+ if constexpr (is_same_v<_Tp, float>)
+ if (_GLIBCXX_SIMD_IS_UNLIKELY(any_of(abs(__x) >= 393382)))
+ return static_simd_cast<_V>(
+ cos(static_simd_cast<rebind_simd_t<double, _V>>(__x)));
+
+ const auto __f = __fold_input(__x);
+ // quadrant | effect
+ // 0 | cosSeries, +
+ // 1 | sinSeries, -
+ // 2 | cosSeries, -
+ // 3 | sinSeries, +
+ using namespace std::experimental::__proposed::float_bitwise_operators;
+ const _V __sign_flip
+ = _V(-0.f) & static_simd_cast<_V>((1 + __f._M_quadrant) << 30);
+
+ const auto __need_cos = (__f._M_quadrant & 1) == 0;
+ if (_GLIBCXX_SIMD_IS_UNLIKELY(all_of(__need_cos)))
+ return __sign_flip ^ __cosSeries(__f._M_x);
+ else if (_GLIBCXX_SIMD_IS_UNLIKELY(none_of(__need_cos)))
+ return __sign_flip ^ __sinSeries(__f._M_x);
+ else // some_of(__need_cos)
+ {
+ _V __r = __sinSeries(__f._M_x);
+ where(__need_cos.__cvt(), __r) = __cosSeries(__f._M_x);
+ return __r ^ __sign_flip;
+ }
+ }
+}
+
+template <typename _Tp>
+_GLIBCXX_SIMD_ALWAYS_INLINE
+ enable_if_t<std::is_floating_point<_Tp>::value, simd<_Tp, simd_abi::scalar>>
+ cos(simd<_Tp, simd_abi::scalar> __x)
+{
+ return std::cos(__data(__x));
+}
+//}}}
+// sin{{{
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+sin(const simd<_Tp, _Abi>& __x)
+{
+ using _V = simd<_Tp, _Abi>;
+ if constexpr (__is_scalar_abi<_Abi>() || __is_fixed_size_abi_v<_Abi>)
+ return {__private_init, _Abi::_SimdImpl::__sin(__data(__x))};
+ else
+ {
+ if constexpr (is_same_v<_Tp, float>)
+ if (_GLIBCXX_SIMD_IS_UNLIKELY(any_of(abs(__x) >= 527449)))
+ return static_simd_cast<_V>(
+ sin(static_simd_cast<rebind_simd_t<double, _V>>(__x)));
+
+ const auto __f = __fold_input(__x);
+ // quadrant | effect
+ // 0 | sinSeries
+ // 1 | cosSeries
+ // 2 | sinSeries, sign flip
+ // 3 | cosSeries, sign flip
+ using namespace std::experimental::__proposed::float_bitwise_operators;
+ const auto __sign_flip
+ = (__x ^ static_simd_cast<_V>(1 - __f._M_quadrant)) & _V(_Tp(-0.));
+
+ const auto __need_sin = (__f._M_quadrant & 1) == 0;
+ if (_GLIBCXX_SIMD_IS_UNLIKELY(all_of(__need_sin)))
+ return __sign_flip ^ __sinSeries(__f._M_x);
+ else if (_GLIBCXX_SIMD_IS_UNLIKELY(none_of(__need_sin)))
+ return __sign_flip ^ __cosSeries(__f._M_x);
+ else // some_of(__need_sin)
+ {
+ _V __r = __cosSeries(__f._M_x);
+ where(__need_sin.__cvt(), __r) = __sinSeries(__f._M_x);
+ return __sign_flip ^ __r;
+ }
+ }
+}
+
+template <typename _Tp>
+_GLIBCXX_SIMD_ALWAYS_INLINE
+ enable_if_t<std::is_floating_point<_Tp>::value, simd<_Tp, simd_abi::scalar>>
+ sin(simd<_Tp, simd_abi::scalar> __x)
+{
+ return std::sin(__data(__x));
+}
+//}}}
+
+_GLIBCXX_SIMD_MATH_CALL_(tan)
+_GLIBCXX_SIMD_MATH_CALL_(acosh)
+_GLIBCXX_SIMD_MATH_CALL_(asinh)
+_GLIBCXX_SIMD_MATH_CALL_(atanh)
+_GLIBCXX_SIMD_MATH_CALL_(cosh)
+_GLIBCXX_SIMD_MATH_CALL_(sinh)
+_GLIBCXX_SIMD_MATH_CALL_(tanh)
+// }}}
+// exponential functions {{{
+_GLIBCXX_SIMD_MATH_CALL_(exp)
+_GLIBCXX_SIMD_MATH_CALL_(exp2)
+_GLIBCXX_SIMD_MATH_CALL_(expm1)
+// }}}
+// frexp {{{
+#if _GLIBCXX_SIMD_X86INTRIN
+template <typename _Tp, size_t _Np>
+_SimdWrapper<_Tp, _Np>
+__getexp(_SimdWrapper<_Tp, _Np> __x)
+{
+ if constexpr (__have_avx512vl && __is_sse_ps<_Tp, _Np>())
+ return __auto_bitcast(_mm_getexp_ps(__to_intrin(__x)));
+ else if constexpr (__have_avx512f && __is_sse_ps<_Tp, _Np>())
+ return __auto_bitcast(_mm512_getexp_ps(__auto_bitcast(__to_intrin(__x))));
+ else if constexpr (__have_avx512vl && __is_sse_pd<_Tp, _Np>())
+ return _mm_getexp_pd(__x);
+ else if constexpr (__have_avx512f && __is_sse_pd<_Tp, _Np>())
+ return __lo128(_mm512_getexp_pd(__auto_bitcast(__x)));
+ else if constexpr (__have_avx512vl && __is_avx_ps<_Tp, _Np>())
+ return _mm256_getexp_ps(__x);
+ else if constexpr (__have_avx512f && __is_avx_ps<_Tp, _Np>())
+ return __lo256(_mm512_getexp_ps(__auto_bitcast(__x)));
+ else if constexpr (__have_avx512vl && __is_avx_pd<_Tp, _Np>())
+ return _mm256_getexp_pd(__x);
+ else if constexpr (__have_avx512f && __is_avx_pd<_Tp, _Np>())
+ return __lo256(_mm512_getexp_pd(__auto_bitcast(__x)));
+ else if constexpr (__is_avx512_ps<_Tp, _Np>())
+ return _mm512_getexp_ps(__x);
+ else if constexpr (__is_avx512_pd<_Tp, _Np>())
+ return _mm512_getexp_pd(__x);
+ else
+ __assert_unreachable<_Tp>();
+}
+
+template <typename _Tp, size_t _Np>
+_SimdWrapper<_Tp, _Np>
+__getmant_avx512(_SimdWrapper<_Tp, _Np> __x)
+{
+ if constexpr (__have_avx512vl && __is_sse_ps<_Tp, _Np>())
+ return __auto_bitcast(
+ _mm_getmant_ps(__to_intrin(__x), _MM_MANT_NORM_p5_1, _MM_MANT_SIGN_src));
+ else if constexpr (__have_avx512f && __is_sse_ps<_Tp, _Np>())
+ return __auto_bitcast(_mm512_getmant_ps(__auto_bitcast(__to_intrin(__x)),
+ _MM_MANT_NORM_p5_1,
+ _MM_MANT_SIGN_src));
+ else if constexpr (__have_avx512vl && __is_sse_pd<_Tp, _Np>())
+ return _mm_getmant_pd(__x, _MM_MANT_NORM_p5_1, _MM_MANT_SIGN_src);
+ else if constexpr (__have_avx512f && __is_sse_pd<_Tp, _Np>())
+ return __lo128(_mm512_getmant_pd(__auto_bitcast(__x), _MM_MANT_NORM_p5_1,
+ _MM_MANT_SIGN_src));
+ else if constexpr (__have_avx512vl && __is_avx_ps<_Tp, _Np>())
+ return _mm256_getmant_ps(__x, _MM_MANT_NORM_p5_1, _MM_MANT_SIGN_src);
+ else if constexpr (__have_avx512f && __is_avx_ps<_Tp, _Np>())
+ return __lo256(_mm512_getmant_ps(__auto_bitcast(__x), _MM_MANT_NORM_p5_1,
+ _MM_MANT_SIGN_src));
+ else if constexpr (__have_avx512vl && __is_avx_pd<_Tp, _Np>())
+ return _mm256_getmant_pd(__x, _MM_MANT_NORM_p5_1, _MM_MANT_SIGN_src);
+ else if constexpr (__have_avx512f && __is_avx_pd<_Tp, _Np>())
+ return __lo256(_mm512_getmant_pd(__auto_bitcast(__x), _MM_MANT_NORM_p5_1,
+ _MM_MANT_SIGN_src));
+ else if constexpr (__is_avx512_ps<_Tp, _Np>())
+ return _mm512_getmant_ps(__x, _MM_MANT_NORM_p5_1, _MM_MANT_SIGN_src);
+ else if constexpr (__is_avx512_pd<_Tp, _Np>())
+ return _mm512_getmant_pd(__x, _MM_MANT_NORM_p5_1, _MM_MANT_SIGN_src);
+ else
+ __assert_unreachable<_Tp>();
+}
+#endif // _GLIBCXX_SIMD_X86INTRIN
+
+/**
+ * Splits \p __x into exponent and mantissa; the sign is kept with the mantissa.
+ *
+ * For finite, nonzero input the magnitude of the return value will be in the
+ * range [0.5, 1.0[
+ * The \p __exp value will be an integer defining the power-of-two exponent
+ */
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+frexp(const simd<_Tp, _Abi>& __x, __samesize<int, simd<_Tp, _Abi>>* __exp)
+{
+ if constexpr (simd_size_v<_Tp, _Abi> == 1)
+ {
+ int __tmp;
+ const auto __r = std::frexp(__x[0], &__tmp);
+ (*__exp)[0] = __tmp;
+ return __r;
+ }
+ else if constexpr (__is_fixed_size_abi_v<_Abi>)
+ {
+ return {__private_init,
+ _Abi::_SimdImpl::__frexp(__data(__x), __data(*__exp))};
+#if _GLIBCXX_SIMD_X86INTRIN
+ }
+ else if constexpr (__have_avx512f)
+ {
+ using _IV = __samesize<int, simd<_Tp, _Abi>>;
+ constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+ constexpr size_t _NI = _Np < 4 ? 4 : _Np;
+ const auto __v = __data(__x);
+ const auto __isnonzero
+ = _Abi::_SimdImpl::__isnonzerovalue_mask(__v._M_data);
+ const _SimdWrapper<int, _NI> __exp_plus1
+ = 1 + __convert<_SimdWrapper<int, _NI>>(__getexp(__v))._M_data;
+ const _SimdWrapper<int, _Np> __e = __wrapper_bitcast<int, _Np>(
+ _Abi::_CommonImpl::_S_blend(_SimdWrapper<bool, _NI>(__isnonzero),
+ _SimdWrapper<int, _NI>(), __exp_plus1));
+ simd_abi::deduce_t<int, _Np>::_CommonImpl::__store(
+ __e, __exp, overaligned<alignof(_IV)>);
+ return {__private_init,
+ _Abi::_CommonImpl::_S_blend(_SimdWrapper<bool, _Np>(__isnonzero),
+ __v, __getmant_avx512(__v))};
+#endif // _GLIBCXX_SIMD_X86INTRIN
+ }
+ else
+ {
+ // fallback implementation
+ static_assert(sizeof(_Tp) == 4 || sizeof(_Tp) == 8);
+ using _V = simd<_Tp, _Abi>;
+ using _IV = rebind_simd_t<int, _V>;
+ using _Limits = std::numeric_limits<_Tp>;
+ using namespace std::experimental::__proposed;
+ using namespace std::experimental::__proposed::float_bitwise_operators;
+
+ constexpr int __exp_shift = sizeof(_Tp) == 4 ? 23 : 20;
+ constexpr int __exp_adjust = sizeof(_Tp) == 4 ? 0x7e : 0x3fe;
+ constexpr int __exp_offset = sizeof(_Tp) == 4 ? 0x70 : 0x200;
+ constexpr _Tp __subnorm_scale = sizeof(_Tp) == 4 ? 0x1p112 : 0x1p512;
+ _GLIBCXX_SIMD_CONSTEXPR _V __exponent_mask
+ = _Limits::infinity(); // 0x7f800000 or 0x7ff0000000000000
+ _GLIBCXX_SIMD_CONSTEXPR _V __p5_1_exponent
+ = _Tp(sizeof(_Tp) == 4 ? -0x1.fffffep-1 : -0x1.fffffffffffffp-1);
+
+ _V __mant = __p5_1_exponent & (__exponent_mask | __x);
+ const _IV __exponent_bits = __extract_exponent_bits(__x);
+ if (_GLIBCXX_SIMD_IS_LIKELY(all_of(isnormal(__x))))
+ {
+ *__exp = simd_cast<__samesize<int, _V>>(
+ (__exponent_bits >> __exp_shift) - __exp_adjust);
+ return __mant;
+ }
+
+ // can't use isunordered(x*inf, x*0) because inf*0 raises invalid
+ const auto __as_int
+ = __bit_cast<rebind_simd_t<__int_for_sizeof_t<_Tp>, _V>>(abs(__x));
+ const auto __inf = __bit_cast<rebind_simd_t<__int_for_sizeof_t<_Tp>, _V>>(
+ _V(std::numeric_limits<_Tp>::infinity()));
+ const auto __iszero_inf_nan = static_simd_cast<typename _V::mask_type>(
+ __as_int == 0 || __as_int >= __inf);
+
+ const _V __scaled_subnormal = __x * __subnorm_scale;
+ const _V __mant_subnormal
+ = __p5_1_exponent & (__exponent_mask | __scaled_subnormal);
+ where(!isnormal(__x), __mant) = __mant_subnormal;
+ where(__iszero_inf_nan, __mant) = __x;
+ _IV __e = __extract_exponent_bits(__scaled_subnormal);
+ using _MaskType = typename std::conditional_t<
+ sizeof(typename _V::mask_type) == sizeof(_IV), _V, _IV>::mask_type;
+ const _MaskType __value_isnormal = isnormal(__x).__cvt();
+ where(__value_isnormal.__cvt(), __e) = __exponent_bits;
+ static_assert(sizeof(_IV) == sizeof(__value_isnormal));
+ const _IV __offset
+ = (__bit_cast<_IV>(__value_isnormal) & _IV(__exp_adjust))
+ | (__bit_cast<_IV>(static_simd_cast<_MaskType>(__exponent_bits == 0)
+ & static_simd_cast<_MaskType>(__x != 0))
+ & _IV(__exp_adjust + __exp_offset));
+ *__exp = simd_cast<__samesize<int, _V>>((__e >> __exp_shift) - __offset);
+ return __mant;
+ }
+}
+// }}}
+_GLIBCXX_SIMD_MATH_CALL2_(ldexp, int)
+_GLIBCXX_SIMD_MATH_CALL_(ilogb)
+
+// logarithms {{{
+_GLIBCXX_SIMD_MATH_CALL_(log)
+_GLIBCXX_SIMD_MATH_CALL_(log10)
+_GLIBCXX_SIMD_MATH_CALL_(log1p)
+_GLIBCXX_SIMD_MATH_CALL_(log2)
+//}}}
+// logb{{{
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point<_Tp>::value, simd<_Tp, _Abi>>
+logb(const simd<_Tp, _Abi>& __x)
+{
+ constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+ if constexpr (_Np == 1)
+ return std::logb(__x[0]);
+ else if constexpr (__is_fixed_size_abi_v<_Abi>)
+ {
+ return {__private_init,
+ __data(__x).__apply_per_chunk([](auto __impl, auto __xx) {
+ using _V = typename decltype(__impl)::simd_type;
+ return __data(
+ std::experimental::logb(_V(__private_init, __xx)));
+ })};
+ }
+#if _GLIBCXX_SIMD_X86INTRIN // {{{
+ else if constexpr (__have_avx512vl && __is_sse_ps<_Tp, _Np>())
+ return {__private_init,
+ __auto_bitcast(_mm_getexp_ps(__to_intrin(__as_vector(__x))))};
+ else if constexpr (__have_avx512vl && __is_sse_pd<_Tp, _Np>())
+ return {__private_init, _mm_getexp_pd(__data(__x))};
+ else if constexpr (__have_avx512vl && __is_avx_ps<_Tp, _Np>())
+ return {__private_init, _mm256_getexp_ps(__data(__x))};
+ else if constexpr (__have_avx512vl && __is_avx_pd<_Tp, _Np>())
+ return {__private_init, _mm256_getexp_pd(__data(__x))};
+ else if constexpr (__have_avx512f && __is_avx_ps<_Tp, _Np>())
+ return {__private_init,
+ __lo256(_mm512_getexp_ps(__auto_bitcast(__data(__x))))};
+ else if constexpr (__have_avx512f && __is_avx_pd<_Tp, _Np>())
+ return {__private_init,
+ __lo256(_mm512_getexp_pd(__auto_bitcast(__data(__x))))};
+ else if constexpr (__is_avx512_ps<_Tp, _Np>())
+ return {__private_init, _mm512_getexp_ps(__data(__x))};
+ else if constexpr (__is_avx512_pd<_Tp, _Np>())
+ return {__private_init, _mm512_getexp_pd(__data(__x))};
+#endif // _GLIBCXX_SIMD_X86INTRIN }}}
+ else
+ {
+ using _V = simd<_Tp, _Abi>;
+ using namespace std::experimental::__proposed;
+ auto __is_normal = isnormal(__x);
+
+ // work on abs(__x) to reflect the return value on Linux for negative
+ // inputs (domain-error => implementation-defined value is returned)
+ const _V abs_x = abs(__x);
+
+ // __exponent(__x) returns the exponent value (bias removed) as simd<_Up>
+ // with integral _Up
+ auto&& __exponent = [](const _V& __v) {
+ using namespace std::experimental::__proposed;
+ using _IV = rebind_simd_t<
+ std::conditional_t<sizeof(_Tp) == sizeof(_LLong), _LLong, int>, _V>;
+ return (__bit_cast<_IV>(__v) >> (std::numeric_limits<_Tp>::digits - 1))
+ - (std::numeric_limits<_Tp>::max_exponent - 1);
+ };
+ _V __r = static_simd_cast<_V>(__exponent(abs_x));
+ if (_GLIBCXX_SIMD_IS_LIKELY(all_of(__is_normal)))
+ // without corner cases (nan, inf, subnormal, zero) we have our
+ // answer:
+ return __r;
+ const auto __is_zero = __x == 0;
+ const auto __is_nan = isnan(__x);
+ const auto __is_inf = isinf(__x);
+ where(__is_zero, __r) = -std::numeric_limits<_Tp>::infinity();
+ where(__is_nan, __r) = __x;
+ where(__is_inf, __r) = std::numeric_limits<_Tp>::infinity();
+ __is_normal |= __is_zero || __is_nan || __is_inf;
+ if (all_of(__is_normal))
+ // at this point everything but subnormals is handled
+ return __r;
+ // for subnormals, repeat the exponent extraction after multiplying the
+ // input with a floating point value that has 112 (0x70) in its exponent
+ // (not too big for sp and large enough for dp)
+ const _V __scaled = abs_x * _Tp(0x1p112);
+ _V __scaled_exp = static_simd_cast<_V>(__exponent(__scaled) - 112);
+ where(__is_normal, __scaled_exp) = __r;
+ return __scaled_exp;
+ }
+}
+//}}}
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+modf(const simd<_Tp, _Abi>& __x, simd<_Tp, _Abi>* __iptr)
+{
+ const auto __integral = trunc(__x);
+ *__iptr = __integral;
+ auto __r = __x - __integral;
+ where(isinf(__x), __r) = _Tp();
+ return copysign(__r, __x);
+}
+
+_GLIBCXX_SIMD_MATH_CALL2_(scalbn, int)
+_GLIBCXX_SIMD_MATH_CALL2_(scalbln, long)
+
+_GLIBCXX_SIMD_MATH_CALL_(cbrt)
+
+_GLIBCXX_SIMD_MATH_CALL_(abs)
+_GLIBCXX_SIMD_MATH_CALL_(fabs)
+
+// [parallel.simd.math] only asks for is_floating_point_v<_Tp> and forgot to
+// allow signed integral _Tp
+template <typename _Tp, typename _Abi>
+enable_if_t<!std::is_floating_point_v<_Tp> && std::is_signed_v<_Tp>,
+ simd<_Tp, _Abi>>
+abs(const simd<_Tp, _Abi>& __x)
+{
+ return {__private_init, _Abi::_SimdImpl::__abs(__data(__x))};
+}
+template <typename _Tp, typename _Abi>
+enable_if_t<!std::is_floating_point_v<_Tp> && std::is_signed_v<_Tp>,
+ simd<_Tp, _Abi>>
+fabs(const simd<_Tp, _Abi>& __x)
+{
+ return {__private_init, _Abi::_SimdImpl::__abs(__data(__x))};
+}
+
+// the following are overloads for functions in <cstdlib> and not covered by
+// [parallel.simd.math]. I don't see much value in making them work, though
+/*
+template <typename _Abi> simd<long, _Abi> labs(const simd<long, _Abi> &__x)
+{
+ return {__private_init, _Abi::_SimdImpl::abs(__data(__x))};
+}
+template <typename _Abi> simd<long long, _Abi> llabs(const simd<long long, _Abi>
+&__x)
+{
+ return {__private_init, _Abi::_SimdImpl::abs(__data(__x))};
+}
+*/
+
+#define _GLIBCXX_SIMD_CVTING2(_NAME) \
+ template <typename _Tp, typename _Abi> \
+ _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME( \
+ const simd<_Tp, _Abi>& __x, const __id<simd<_Tp, _Abi>>& __y) \
+ { \
+ return _NAME(__x, __y); \
+ } \
+ template <typename _Tp, typename _Abi> \
+ _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME( \
+ const __id<simd<_Tp, _Abi>>& __x, const simd<_Tp, _Abi>& __y) \
+ { \
+ return _NAME(__x, __y); \
+ }
+
+#define _GLIBCXX_SIMD_CVTING3(_NAME) \
+ template <typename _Tp, typename _Abi> \
+ _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME( \
+ const __id<simd<_Tp, _Abi>>& __x, const simd<_Tp, _Abi>& __y, \
+ const simd<_Tp, _Abi>& __z) \
+ { \
+ return _NAME(__x, __y, __z); \
+ } \
+ template <typename _Tp, typename _Abi> \
+ _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME( \
+ const simd<_Tp, _Abi>& __x, const __id<simd<_Tp, _Abi>>& __y, \
+ const simd<_Tp, _Abi>& __z) \
+ { \
+ return _NAME(__x, __y, __z); \
+ } \
+ template <typename _Tp, typename _Abi> \
+ _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME( \
+ const simd<_Tp, _Abi>& __x, const simd<_Tp, _Abi>& __y, \
+ const __id<simd<_Tp, _Abi>>& __z) \
+ { \
+ return _NAME(__x, __y, __z); \
+ } \
+ template <typename _Tp, typename _Abi> \
+ _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME( \
+ const simd<_Tp, _Abi>& __x, const __id<simd<_Tp, _Abi>>& __y, \
+ const __id<simd<_Tp, _Abi>>& __z) \
+ { \
+ return _NAME(__x, __y, __z); \
+ } \
+ template <typename _Tp, typename _Abi> \
+ _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME( \
+ const __id<simd<_Tp, _Abi>>& __x, const simd<_Tp, _Abi>& __y, \
+ const __id<simd<_Tp, _Abi>>& __z) \
+ { \
+ return _NAME(__x, __y, __z); \
+ } \
+ template <typename _Tp, typename _Abi> \
+ _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME( \
+ const __id<simd<_Tp, _Abi>>& __x, const __id<simd<_Tp, _Abi>>& __y, \
+ const simd<_Tp, _Abi>& __z) \
+ { \
+ return _NAME(__x, __y, __z); \
+ }
+
+template <typename _R, typename _ToApply, typename _Tp, typename... _Tps>
+_GLIBCXX_SIMD_INTRINSIC _R
+__fixed_size_apply(_ToApply&& __apply, const _Tp& __arg0, const _Tps&... __args)
+{
+ return {__private_init,
+ __data(__arg0).__apply_per_chunk(
+ [&](auto __impl, const auto&... __inner) {
+ using _V = typename decltype(__impl)::simd_type;
+ return __data(__apply(_V(__private_init, __inner)...));
+ },
+ __data(__args)...)};
+}
+
+template <typename _VV>
+__remove_cvref_t<_VV>
+__hypot(_VV __x, _VV __y)
+{
+ using _V = __remove_cvref_t<_VV>;
+ using _Tp = typename _V::value_type;
+ if constexpr (_V::size() == 1)
+ return std::hypot(_Tp(__x[0]), _Tp(__y[0]));
+ else if constexpr (__is_fixed_size_abi_v<typename _V::abi_type>)
+ {
+ return __fixed_size_apply<_V>([](auto __a,
+ auto __b) { return hypot(__a, __b); },
+ __x, __y);
+ }
+ else
+ {
+ // A simple solution for _Tp == float would be to cast to double and
+ // simply calculate sqrt(x²+y²) as it can't over-/underflow anymore with
+ // dp. It still needs the Annex F fixups though and isn't faster on
+ // Skylake-AVX512 (not even for SSE and AVX vectors, and really bad for
+ // AVX-512).
+ using namespace __proposed::float_bitwise_operators;
+ using _Limits = std::numeric_limits<_Tp>;
+ _V __absx = abs(__x); // no error
+ _V __absy = abs(__y); // no error
+ _V __hi = max(__absx, __absy); // no error
+ _V __lo = min(__absy, __absx); // no error
+
+ // round __hi down to the next power-of-2:
+ _GLIBCXX_SIMD_CONSTEXPR _V __inf(_Limits::infinity());
+
+ if (_GLIBCXX_SIMD_IS_LIKELY(all_of(isnormal(__x))
+ && all_of(isnormal(__y))))
+ {
+ const _V __hi_exp = __hi & __inf;
+ //((__hi + __hi) & __inf) ^ __inf almost works for computing __scale,
+ // except when (__hi + __hi) & __inf == __inf, in which case __scale
+ // becomes 0 (should be min/2 instead) and thus loses the information
+ // from __lo.
+ const _V __scale = (__hi_exp ^ __inf) * _Tp(.5);
+ _GLIBCXX_SIMD_CONSTEXPR _V __mant_mask
+ = _Limits::min() - _Limits::denorm_min();
+ const _V __h1 = (__hi & __mant_mask) | _V(1);
+ const _V __l1 = __lo * __scale;
+ return __hi_exp * sqrt(__h1 * __h1 + __l1 * __l1);
+ }
+ else
+ {
+ // slower path to support subnormals
+ // if __hi is subnormal, avoid scaling by inf & final mul by 0 (which
+ // yields NaN) by using min()
+ _V __scale = _V(1 / _Limits::min());
+ // invert exponent w/o error and w/o using the slow divider unit:
+ // xor inverts the exponent but off by 1. Multiplication with .5
+ // adjusts for the discrepancy.
+ where(__hi >= _Limits::min(), __scale)
+ = ((__hi & __inf) ^ __inf) * _Tp(.5);
+ // adjust final exponent for subnormal inputs
+ _V __hi_exp = _Limits::min();
+ where(__hi >= _Limits::min(), __hi_exp) = __hi & __inf; // no error
+ _V __h1 = __hi * __scale; // no error
+ _V __l1 = __lo * __scale; // no error
+
+ // sqrt(x²+y²) = e*sqrt((x/e)²+(y/e)²):
+ // this ensures no overflow in the argument to sqrt
+ _V __r = __hi_exp * sqrt(__h1 * __h1 + __l1 * __l1);
+#ifdef __STDC_IEC_559__
+ // fixup for Annex F requirements
+ // the naive fixup goes like this:
+ //
+ // where(__l1 == 0, __r) = __hi;
+ // where(isunordered(__x, __y), __r) = _Limits::quiet_NaN();
+ // where(isinf(__absx) || isinf(__absy), __r) = __inf;
+ //
+ // The fixup can be prepared in parallel with the sqrt, requiring a
+ // single blend step after __hi_exp * sqrt, which reduces latency and
+ // improves throughput:
+ _V __fixup = __hi; // __lo == 0
+ where(isunordered(__x, __y), __fixup) = _Limits::quiet_NaN();
+ where(isinf(__absx) || isinf(__absy), __fixup) = __inf;
+ where(!(__lo == 0 || isunordered(__x, __y)
+ || (isinf(__absx) || isinf(__absy))),
+ __fixup)
+ = __r;
+ __r = __fixup;
+#endif
+ return __r;
+ }
+ }
+}
+
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi>
+hypot(const simd<_Tp, _Abi>& __x, const simd<_Tp, _Abi>& __y)
+{
+ return __hypot<conditional_t<__is_fixed_size_abi_v<_Abi>,
+ const simd<_Tp, _Abi>&, simd<_Tp, _Abi>>>(__x,
+ __y);
+}
+_GLIBCXX_SIMD_CVTING2(hypot)
+
+template <typename _VV>
+__remove_cvref_t<_VV>
+__hypot(_VV __x, _VV __y, _VV __z)
+{
+ using _V = __remove_cvref_t<_VV>;
+ using _Abi = typename _V::abi_type;
+ using _Tp = typename _V::value_type;
+ /* FIXME: enable after PR77776 is resolved
+ if constexpr (_V::size() == 1)
+ return std::hypot(_Tp(__x[0]), _Tp(__y[0]), _Tp(__z[0]));
+ else
+ */
+ if constexpr (__is_fixed_size_abi_v<_Abi> && _V::size() > 1)
+ {
+ return __fixed_size_apply<simd<_Tp, _Abi>>(
+ [](auto __a, auto __b, auto __c) { return hypot(__a, __b, __c); }, __x,
+ __y, __z);
+ }
+ else
+ {
+ using namespace __proposed::float_bitwise_operators;
+ using _Limits = std::numeric_limits<_Tp>;
+ const _V __absx = abs(__x); // no error
+ const _V __absy = abs(__y); // no error
+ const _V __absz = abs(__z); // no error
+ _V __hi = max(max(__absx, __absy), __absz); // no error
+ _V __l0 = min(__absz, max(__absx, __absy)); // no error
+ _V __l1 = min(__absy, __absx); // no error
+ if constexpr (numeric_limits<_Tp>::digits == 64
+ && numeric_limits<_Tp>::max_exponent == 0x4000
+ && numeric_limits<_Tp>::min_exponent == -0x3FFD
+ && _V::size() == 1)
+ { // Seems like x87 fp80, where bit 63 is always 1 unless subnormal or
+ // NaN. In this case the bit-tricks don't work, they require IEC559
+ // binary32 or binary64 format.
+#ifdef __STDC_IEC_559__
+ // fixup for Annex F requirements
+ if (isinf(__absx[0]) || isinf(__absy[0]) || isinf(__absz[0]))
+ return _Limits::infinity();
+ else if (isunordered(__absx[0], __absy[0] + __absz[0]))
+ return _Limits::quiet_NaN();
+ else if (__l0[0] == 0 && __l1[0] == 0)
+ return __hi;
+#endif
+ _V __hi_exp = __hi;
+ const _ULLong __tmp = 0x8000'0000'0000'0000ull;
+ __builtin_memcpy(&__hi_exp, &__tmp, 8);
+ const _V __scale = 1 / __hi_exp;
+ __hi *= __scale;
+ __l0 *= __scale;
+ __l1 *= __scale;
+ return __hi_exp * sqrt((__l0 * __l0 + __l1 * __l1) + __hi * __hi);
+ }
+ else
+ {
+ // round __hi down to the next power-of-2:
+ _GLIBCXX_SIMD_CONSTEXPR _V __inf(_Limits::infinity());
+
+ if (_GLIBCXX_SIMD_IS_LIKELY(all_of(isnormal(__x))
+ && all_of(isnormal(__y))
+ && all_of(isnormal(__z))))
+ {
+ const _V __hi_exp = __hi & __inf;
+ // ((__hi + __hi) & __inf) ^ __inf almost works for computing
+ // __scale, except when (__hi + __hi) & __inf == __inf. In that
+ // case __scale becomes 0 (should be min/2 instead) and thus
+ // loses the information from the smaller inputs, i.e. from
+ // __lo.
+ const _V __scale = (__hi_exp ^ __inf) * _Tp(.5);
+ _GLIBCXX_SIMD_CONSTEXPR _V __mant_mask
+ = _Limits::min() - _Limits::denorm_min();
+ const _V __h1 = (__hi & __mant_mask) | _V(1);
+ __l0 *= __scale;
+ __l1 *= __scale;
+ const _V __lo
+ = __l0 * __l0 + __l1 * __l1; // add the two smaller values first
+ return __hi_exp * sqrt(__lo + __h1 * __h1);
+ }
+ else
+ {
+ // slower path to support subnormals
+ // if __hi is subnormal, avoid scaling by inf & final mul by 0
+ // (which yields NaN) by using min()
+ _V __scale = _V(1 / _Limits::min());
+ // invert exponent w/o error and w/o using the slow divider unit:
+ // xor inverts the exponent but off by 1. Multiplication with .5
+ // adjusts for the discrepancy.
+ where(__hi >= _Limits::min(), __scale)
+ = ((__hi & __inf) ^ __inf) * _Tp(.5);
+ // adjust final exponent for subnormal inputs
+ _V __hi_exp = _Limits::min();
+ where(__hi >= _Limits::min(), __hi_exp)
+ = __hi & __inf; // no error
+ _V __h1 = __hi * __scale; // no error
+ __l0 *= __scale; // no error
+ __l1 *= __scale; // no error
+ _V __lo
+ = __l0 * __l0 + __l1 * __l1; // add the two smaller values first
+ _V __r = __hi_exp * sqrt(__lo + __h1 * __h1);
+#ifdef __STDC_IEC_559__
+ // fixup for Annex F requirements
+ _V __fixup = __hi; // __lo == 0
+ // where(__lo == 0, __fixup) = __hi;
+ where(isunordered(__x, __y + __z), __fixup)
+ = _Limits::quiet_NaN();
+ where(isinf(__absx) || isinf(__absy) || isinf(__absz), __fixup)
+ = __inf;
+ // Instead of __lo == 0, the following could depend on __h1² ==
+ // __h1² + __lo (i.e. __hi is so much larger than the other two
+ // inputs that the result is exactly __hi). While this may improve
+ // precision, it is likely to reduce efficiency if the ISA has
+ // FMAs (because __h1² + __lo is an FMA, but the intermediate
+ // __h1² must be kept)
+ where(!(__lo == 0 || isunordered(__x, __y + __z) || isinf(__absx)
+ || isinf(__absy) || isinf(__absz)),
+ __fixup)
+ = __r;
+ __r = __fixup;
+#endif
+ return __r;
+ }
+ }
+ }
+}
+
+template <typename _Tp, typename _Abi>
+_GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi>
+hypot(const simd<_Tp, _Abi>& __x, const simd<_Tp, _Abi>& __y,
+ const simd<_Tp, _Abi>& __z)
+{
+ return __hypot<conditional_t<__is_fixed_size_abi_v<_Abi>,
+ const simd<_Tp, _Abi>&, simd<_Tp, _Abi>>>(__x,
+ __y,
+ __z);
+}
+_GLIBCXX_SIMD_CVTING3(hypot)
+
+_GLIBCXX_SIMD_MATH_CALL2_(pow, _Tp)
+
+_GLIBCXX_SIMD_MATH_CALL_(sqrt)
+_GLIBCXX_SIMD_MATH_CALL_(erf)
+_GLIBCXX_SIMD_MATH_CALL_(erfc)
+_GLIBCXX_SIMD_MATH_CALL_(lgamma)
+_GLIBCXX_SIMD_MATH_CALL_(tgamma)
+_GLIBCXX_SIMD_MATH_CALL_(ceil)
+_GLIBCXX_SIMD_MATH_CALL_(floor)
+_GLIBCXX_SIMD_MATH_CALL_(nearbyint)
+_GLIBCXX_SIMD_MATH_CALL_(rint)
+_GLIBCXX_SIMD_MATH_CALL_(lrint)
+_GLIBCXX_SIMD_MATH_CALL_(llrint)
+
+_GLIBCXX_SIMD_MATH_CALL_(round)
+_GLIBCXX_SIMD_MATH_CALL_(lround)
+_GLIBCXX_SIMD_MATH_CALL_(llround)
+
+_GLIBCXX_SIMD_MATH_CALL_(trunc)
+
+_GLIBCXX_SIMD_MATH_CALL2_(fmod, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(remainder, _Tp)
+_GLIBCXX_SIMD_MATH_CALL3_(remquo, _Tp, int*)
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+copysign(const simd<_Tp, _Abi>& __x, const simd<_Tp, _Abi>& __y)
+{
+ using namespace std::experimental::__proposed::float_bitwise_operators;
+ const auto __signmask = -simd<_Tp, _Abi>();
+ return (__x & (__x ^ __signmask)) | (__y & __signmask);
+}
+
+_GLIBCXX_SIMD_MATH_CALL2_(nextafter, _Tp)
+// not covered in [parallel.simd.math]:
+// _GLIBCXX_SIMD_MATH_CALL2_(nexttoward, long double)
+_GLIBCXX_SIMD_MATH_CALL2_(fdim, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(fmax, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(fmin, _Tp)
+
+_GLIBCXX_SIMD_MATH_CALL3_(fma, _Tp, _Tp)
+_GLIBCXX_SIMD_MATH_CALL_(fpclassify)
+_GLIBCXX_SIMD_MATH_CALL_(isfinite)
+
+// isnan and isinf require special treatment because old glibc may declare
+// `int std::isinf(double)`.
+template <typename _Tp, typename _Abi, typename...,
+ typename _R
+ = std::experimental::__math_return_type_t<bool, _Tp, _Abi>>
+enable_if_t<std::is_floating_point_v<_Tp>, _R>
+isinf(std::experimental::simd<_Tp, _Abi> __x)
+{
+ return {std::experimental::__private_init,
+ _Abi::_SimdImpl::__isinf(std::experimental::__data(__x))};
+}
+template <typename _Tp, typename _Abi, typename...,
+ typename _R
+ = std::experimental::__math_return_type_t<bool, _Tp, _Abi>>
+enable_if_t<std::is_floating_point_v<_Tp>, _R>
+isnan(std::experimental::simd<_Tp, _Abi> __x)
+{
+ return {std::experimental::__private_init,
+ _Abi::_SimdImpl::__isnan(std::experimental::__data(__x))};
+}
+_GLIBCXX_SIMD_MATH_CALL_(isnormal)
+
+template <typename..., typename _Tp, typename _Abi>
+std::experimental::simd_mask<_Tp, _Abi>
+signbit(std::experimental::simd<_Tp, _Abi> __x)
+{
+ if constexpr (std::is_integral_v<_Tp>)
+ {
+ if constexpr (std::is_unsigned_v<_Tp>)
+ return std::experimental::simd_mask<_Tp, _Abi>{}; // false
+ else
+ return __x < 0;
+ }
+ else
+ return {std::experimental::__private_init,
+ _Abi::_SimdImpl::__signbit(std::experimental::__data(__x))};
+}
+
+_GLIBCXX_SIMD_MATH_CALL2_(isgreater, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(isgreaterequal, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(isless, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(islessequal, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(islessgreater, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(isunordered, _Tp)
+
+/* not covered in [parallel.simd.math]
+template <typename _Abi> __doublev<_Abi> nan(const char* tagp);
+template <typename _Abi> __floatv<_Abi> nanf(const char* tagp);
+template <typename _Abi> __ldoublev<_Abi> nanl(const char* tagp);
+
+template <typename _V> struct simd_div_t {
+ _V quot, rem;
+};
+template <typename _Abi>
+simd_div_t<_SCharv<_Abi>> div(_SCharv<_Abi> numer,
+ _SCharv<_Abi> denom);
+template <typename _Abi>
+simd_div_t<__shortv<_Abi>> div(__shortv<_Abi> numer,
+ __shortv<_Abi> denom);
+template <typename _Abi>
+simd_div_t<__intv<_Abi>> div(__intv<_Abi> numer, __intv<_Abi> denom);
+template <typename _Abi>
+simd_div_t<__longv<_Abi>> div(__longv<_Abi> numer,
+ __longv<_Abi> denom);
+template <typename _Abi>
+simd_div_t<__llongv<_Abi>> div(__llongv<_Abi> numer,
+ __llongv<_Abi> denom);
+*/
+
+// special math {{{
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+assoc_laguerre(const std::experimental::fixed_size_simd<
+ unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __n,
+ const std::experimental::fixed_size_simd<
+ unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __m,
+ const std::experimental::simd<_Tp, _Abi>& __x)
+{
+ return std::experimental::simd<_Tp, _Abi>([&](auto __i) {
+ return std::assoc_laguerre(__n[__i], __m[__i], __x[__i]);
+ });
+}
+
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+assoc_legendre(const std::experimental::fixed_size_simd<
+ unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __n,
+ const std::experimental::fixed_size_simd<
+ unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __m,
+ const std::experimental::simd<_Tp, _Abi>& __x)
+{
+ return std::experimental::simd<_Tp, _Abi>([&](auto __i) {
+ return std::assoc_legendre(__n[__i], __m[__i], __x[__i]);
+ });
+}
+
+_GLIBCXX_SIMD_MATH_CALL2_(beta, _Tp)
+_GLIBCXX_SIMD_MATH_CALL_(comp_ellint_1)
+_GLIBCXX_SIMD_MATH_CALL_(comp_ellint_2)
+_GLIBCXX_SIMD_MATH_CALL2_(comp_ellint_3, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(cyl_bessel_i, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(cyl_bessel_j, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(cyl_bessel_k, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(cyl_neumann, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(ellint_1, _Tp)
+_GLIBCXX_SIMD_MATH_CALL2_(ellint_2, _Tp)
+_GLIBCXX_SIMD_MATH_CALL3_(ellint_3, _Tp, _Tp)
+_GLIBCXX_SIMD_MATH_CALL_(expint)
+
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+hermite(const std::experimental::fixed_size_simd<
+ unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __n,
+ const std::experimental::simd<_Tp, _Abi>& __x)
+{
+ return std::experimental::simd<_Tp, _Abi>(
+ [&](auto __i) { return std::hermite(__n[__i], __x[__i]); });
+}
+
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+laguerre(const std::experimental::fixed_size_simd<
+ unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __n,
+ const std::experimental::simd<_Tp, _Abi>& __x)
+{
+ return std::experimental::simd<_Tp, _Abi>(
+ [&](auto __i) { return std::laguerre(__n[__i], __x[__i]); });
+}
+
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+legendre(const std::experimental::fixed_size_simd<
+ unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __n,
+ const std::experimental::simd<_Tp, _Abi>& __x)
+{
+ return std::experimental::simd<_Tp, _Abi>(
+ [&](auto __i) { return std::legendre(__n[__i], __x[__i]); });
+}
+
+_GLIBCXX_SIMD_MATH_CALL_(riemann_zeta)
+
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+sph_bessel(const std::experimental::fixed_size_simd<
+ unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __n,
+ const std::experimental::simd<_Tp, _Abi>& __x)
+{
+ return std::experimental::simd<_Tp, _Abi>(
+ [&](auto __i) { return std::sph_bessel(__n[__i], __x[__i]); });
+}
+
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+sph_legendre(const std::experimental::fixed_size_simd<
+ unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __l,
+ const std::experimental::fixed_size_simd<
+ unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __m,
+ const std::experimental::simd<_Tp, _Abi>& __theta)
+{
+ return std::experimental::simd<_Tp, _Abi>([&](auto __i) {
+ return std::sph_legendre(__l[__i], __m[__i], __theta[__i]);
+ });
+}
+
+template <typename _Tp, typename _Abi>
+enable_if_t<std::is_floating_point_v<_Tp>, simd<_Tp, _Abi>>
+sph_neumann(const std::experimental::fixed_size_simd<
+ unsigned, std::experimental::simd_size_v<_Tp, _Abi>>& __n,
+ const std::experimental::simd<_Tp, _Abi>& __x)
+{
+ return std::experimental::simd<_Tp, _Abi>(
+ [&](auto __i) { return std::sph_neumann(__n[__i], __x[__i]); });
+}
+// }}}
+
+#undef _GLIBCXX_SIMD_MATH_CALL_
+#undef _GLIBCXX_SIMD_MATH_CALL2_
+#undef _GLIBCXX_SIMD_MATH_CALL3_
+
+_GLIBCXX_SIMD_END_NAMESPACE
+
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_MATH_H_
+
+// vim: foldmethod=marker sw=2 ts=8 noet sts=2
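
Note on the hypot hunks above: the identity sqrt(x²+y²) = e*sqrt((x/e)²+(y/e)²),
with e a power of two derived from the larger input, can be illustrated in
scalar code. The following is only a sketch under stated assumptions, not the
patch's implementation: scaled_hypot is a hypothetical name, and std::frexp and
std::ldexp stand in for the patch's bitwise exponent masking.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical scalar helper (not part of the patch): hypot via exact
// power-of-two scaling, so that the argument to sqrt can neither overflow
// nor underflow to zero.
inline double scaled_hypot(double x, double y)
{
  const double ax = std::fabs(x);
  const double ay = std::fabs(y);
  // Annex F style special cases: an infinite input wins over NaN.
  if (std::isinf(ax) || std::isinf(ay))
    return std::numeric_limits<double>::infinity();
  if (std::isnan(ax) || std::isnan(ay))
    return std::numeric_limits<double>::quiet_NaN();
  const double hi = std::max(ax, ay);
  const double lo = std::min(ax, ay);
  if (lo == 0)
    return hi; // exact result, also covers hi == 0
  int e = 0;
  std::frexp(hi, &e);                   // hi == m * 2^e with m in [0.5, 1)
  const double h1 = std::ldexp(hi, -e); // exact, lies in [0.5, 1)
  const double l1 = std::ldexp(lo, -e); // l1 <= h1, so h1² + l1² < 2
  return std::ldexp(std::sqrt(h1 * h1 + l1 * l1), e);
}
```

For example, scaled_hypot(1e200, 1e200) stays finite, whereas the naive
std::sqrt(x * x + y * y) overflows to infinity for the same inputs.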
diff --git a/libstdc++-v3/include/experimental/bits/simd_neon.h b/libstdc++-v3/include/experimental/bits/simd_neon.h
new file mode 100644
index 00000000000..efff0150b8a
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_neon.h
@@ -0,0 +1,466 @@
+// Simd NEON specific implementations -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_NEON_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_NEON_H_
+
+#if __cplusplus >= 201703L
+
+#if !_GLIBCXX_SIMD_HAVE_NEON
+#error "simd_neon.h may only be included when NEON on ARM is available"
+#endif
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+
+// _CommonImplNeon {{{
+struct _CommonImplNeon : _CommonImplBuiltin
+{
+ // __store {{{
+ using _CommonImplBuiltin::__store;
+
+ // }}}
+};
+
+// }}}
+// _SimdImplNeon {{{
+template <typename _Abi> struct _SimdImplNeon : _SimdImplBuiltin<_Abi>
+{
+ using _Base = _SimdImplBuiltin<_Abi>;
+ template <typename _Tp> static constexpr size_t _S_max_store_size = 16;
+
+ // __masked_load {{{
+ template <typename _Tp, size_t _Np, typename _Up, typename _Fp>
+ static inline _SimdWrapper<_Tp, _Np>
+ __masked_load(_SimdWrapper<_Tp, _Np> __merge, _SimdWrapper<_Tp, _Np> __k,
+ const _Up* __mem, _Fp) noexcept
+ {
+ __execute_n_times<_Np>([&](auto __i) {
+ if (__k[__i] != 0)
+ __merge.__set(__i, static_cast<_Tp>(__mem[__i]));
+ });
+ return __merge;
+ }
+
+ // }}}
+ // __masked_store_nocvt {{{
+ template <typename _Tp, std::size_t _Np, typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_store_nocvt(_SimdWrapper<_Tp, _Np> __v, _Tp* __mem, _Fp,
+ _SimdWrapper<_Tp, _Np> __k)
+ {
+ __execute_n_times<_Np>([&](auto __i) {
+ if (__k[__i] != 0)
+ __mem[__i] = __v[__i];
+ });
+ }
+
+ // }}}
+ // __reduce {{{
+ template <typename _Tp, typename _BinaryOperation>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __reduce(simd<_Tp, _Abi> __x,
+ _BinaryOperation&& __binary_op)
+ {
+ constexpr size_t _Np = __x.size();
+ if constexpr (sizeof(__x) == 16 && _Np >= 4 && !_Abi::_S_is_partial)
+ {
+ const auto __halves = split<simd<_Tp, simd_abi::_Neon<8>>>(__x);
+ const auto __y = __binary_op(__halves[0], __halves[1]);
+ return _SimdImplNeon<simd_abi::_Neon<8>>::__reduce(
+ __y, static_cast<_BinaryOperation&&>(__binary_op));
+ }
+ else if constexpr (_Np == 8)
+ {
+ __x = __binary_op(__x, _Base::template __make_simd<_Tp, _Np>(
+ __vector_permute<1, 0, 3, 2, 5, 4, 7, 6>(
+ __x._M_data)));
+ __x = __binary_op(__x, _Base::template __make_simd<_Tp, _Np>(
+ __vector_permute<3, 2, 1, 0, 7, 6, 5, 4>(
+ __x._M_data)));
+ __x = __binary_op(__x, _Base::template __make_simd<_Tp, _Np>(
+ __vector_permute<7, 6, 5, 4, 3, 2, 1, 0>(
+ __x._M_data)));
+ return __x[0];
+ }
+ else if constexpr (_Np == 4)
+ {
+ __x = __binary_op(__x, _Base::template __make_simd<_Tp, _Np>(
+ __vector_permute<1, 0, 3, 2>(__x._M_data)));
+ __x = __binary_op(__x, _Base::template __make_simd<_Tp, _Np>(
+ __vector_permute<3, 2, 1, 0>(__x._M_data)));
+ return __x[0];
+ }
+ else if constexpr (_Np == 2)
+ {
+ __x = __binary_op(__x, _Base::template __make_simd<_Tp, _Np>(
+ __vector_permute<1, 0>(__x._M_data)));
+ return __x[0];
+ }
+ else
+ return _Base::__reduce(__x, static_cast<_BinaryOperation&&>(__binary_op));
+ }
+
+ // }}}
+ // math {{{
+ // __sqrt {{{
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __sqrt(_Tp __x)
+ {
+ if constexpr (__have_neon_a64)
+ {
+ const auto __intrin = __to_intrin(__x);
+ if constexpr (_TVT::template __is<float, 2>)
+ return vsqrt_f32(__intrin);
+ else if constexpr (_TVT::template __is<float, 4>)
+ return vsqrtq_f32(__intrin);
+ else if constexpr (_TVT::template __is<double, 1>)
+ return vsqrt_f64(__intrin);
+ else if constexpr (_TVT::template __is<double, 2>)
+ return vsqrtq_f64(__intrin);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return _Base::__sqrt(__x);
+ } // }}}
+ // __trunc {{{
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __trunc(_Tp __x)
+ {
+ if constexpr (__have_neon_a32)
+ {
+ const auto __intrin = __to_intrin(__x);
+ if constexpr (_TVT::template __is<float, 2>)
+ return vrnd_f32(__intrin);
+ else if constexpr (_TVT::template __is<float, 4>)
+ return vrndq_f32(__intrin);
+ else if constexpr (_TVT::template __is<double, 1>)
+ return vrnd_f64(__intrin);
+ else if constexpr (_TVT::template __is<double, 2>)
+ return vrndq_f64(__intrin);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return _Base::__trunc(__x);
+ } // }}}
+ // __floor {{{
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __floor(_Tp __x)
+ {
+ if constexpr (__have_neon_a32)
+ {
+ const auto __intrin = __to_intrin(__x);
+ if constexpr (_TVT::template __is<float, 2>)
+ return vrndm_f32(__intrin);
+ else if constexpr (_TVT::template __is<float, 4>)
+ return vrndmq_f32(__intrin);
+ else if constexpr (_TVT::template __is<double, 1>)
+ return vrndm_f64(__intrin);
+ else if constexpr (_TVT::template __is<double, 2>)
+ return vrndmq_f64(__intrin);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return _Base::__floor(__x);
+ } // }}}
+ // __ceil {{{
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __ceil(_Tp __x)
+ {
+ if constexpr (__have_neon_a32)
+ {
+ const auto __intrin = __to_intrin(__x);
+ if constexpr (_TVT::template __is<float, 2>)
+ return vrndp_f32(__intrin);
+ else if constexpr (_TVT::template __is<float, 4>)
+ return vrndpq_f32(__intrin);
+ else if constexpr (_TVT::template __is<double, 1>)
+ return vrndp_f64(__intrin);
+ else if constexpr (_TVT::template __is<double, 2>)
+ return vrndpq_f64(__intrin);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return _Base::__ceil(__x);
+ } // }}}
+ // }}}
+}; // }}}
+// _MaskImplNeonMixin {{{
+struct _MaskImplNeonMixin
+{
+ using _Base = _MaskImplBuiltinMixin;
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SanitizedBitMask<_Np>
+ __to_bits(_SimdWrapper<_Tp, _Np> __x)
+ {
+ if (__builtin_is_constant_evaluated())
+ return _Base::__to_bits(__x);
+
+ using _I = __int_for_sizeof_t<_Tp>;
+ if constexpr (sizeof(__x) == 16)
+ {
+ auto __asint = __vector_bitcast<_I>(__x);
+#ifdef __aarch64__
+ [[maybe_unused]] constexpr auto __zero = decltype(__asint)();
+#else
+ [[maybe_unused]] constexpr auto __zero = decltype(__lo64(__asint))();
+#endif
+ if constexpr (sizeof(_Tp) == 1)
+ {
+ constexpr auto __bitsel
+ = __generate_from_n_evaluations<16, __vector_type_t<_I, 16>>(
+ [&](auto __i) {
+ return static_cast<_I>(
+ __i < _Np ? (__i < 8 ? 1 << __i : 1 << (__i - 8)) : 0);
+ });
+ __asint &= __bitsel;
+#ifdef __aarch64__
+ return __vector_bitcast<_UShort>(
+ vpaddq_s8(vpaddq_s8(vpaddq_s8(__asint, __zero), __zero),
+ __zero))[0];
+#else
+ return __vector_bitcast<_UShort>(
+ vpadd_s8(vpadd_s8(vpadd_s8(__lo64(__asint), __hi64(__asint)),
+ __zero),
+ __zero))[0];
+#endif
+ }
+ else if constexpr (sizeof(_Tp) == 2)
+ {
+ constexpr auto __bitsel
+ = __generate_from_n_evaluations<8, __vector_type_t<_I, 8>>(
+ [&](auto __i) {
+ return static_cast<_I>(__i < _Np ? 1 << __i : 0);
+ });
+ __asint &= __bitsel;
+#ifdef __aarch64__
+ return vpaddq_s16(vpaddq_s16(vpaddq_s16(__asint, __zero), __zero),
+ __zero)[0];
+#else
+ return vpadd_s16(
+ vpadd_s16(vpadd_s16(__lo64(__asint), __hi64(__asint)), __zero),
+ __zero)[0];
+#endif
+ }
+ else if constexpr (sizeof(_Tp) == 4)
+ {
+ constexpr auto __bitsel
+ = __generate_from_n_evaluations<4, __vector_type_t<_I, 4>>(
+ [&](auto __i) {
+ return static_cast<_I>(__i < _Np ? 1 << __i : 0);
+ });
+ __asint &= __bitsel;
+#ifdef __aarch64__
+ return vpaddq_s32(vpaddq_s32(__asint, __zero), __zero)[0];
+#else
+ return vpadd_s32(vpadd_s32(__lo64(__asint), __hi64(__asint)),
+ __zero)[0];
+#endif
+ }
+ else if constexpr (sizeof(_Tp) == 8)
+ return (__asint[0] & 1) | (__asint[1] & 2);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (sizeof(__x) == 8)
+ {
+ auto __asint = __vector_bitcast<_I>(__x);
+ [[maybe_unused]] constexpr auto __zero = decltype(__asint)();
+ if constexpr (sizeof(_Tp) == 1)
+ {
+ constexpr auto __bitsel
+ = __generate_from_n_evaluations<8, __vector_type_t<_I, 8>>(
+ [&](auto __i) {
+ return static_cast<_I>(__i < _Np ? 1 << __i : 0);
+ });
+ __asint &= __bitsel;
+ return vpadd_s8(vpadd_s8(vpadd_s8(__asint, __zero), __zero),
+ __zero)[0];
+ }
+ else if constexpr (sizeof(_Tp) == 2)
+ {
+ constexpr auto __bitsel
+ = __generate_from_n_evaluations<4, __vector_type_t<_I, 4>>(
+ [&](auto __i) {
+ return static_cast<_I>(__i < _Np ? 1 << __i : 0);
+ });
+ __asint &= __bitsel;
+ return vpadd_s16(vpadd_s16(__asint, __zero), __zero)[0];
+ }
+ else if constexpr (sizeof(_Tp) == 4)
+ {
+ __asint &= __make_vector<_I>(0x1, 0x2);
+ return vpadd_s32(__asint, __zero)[0];
+ }
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return _Base::__to_bits(__x);
+ }
+};
+
+// }}}
+// _MaskImplNeon {{{
+template <typename _Abi>
+struct _MaskImplNeon : _MaskImplNeonMixin, _MaskImplBuiltin<_Abi>
+{
+ using _MaskImplBuiltinMixin::__to_maskvector;
+ using _MaskImplNeonMixin::__to_bits;
+ using _Base = _MaskImplBuiltin<_Abi>;
+ using _Base::__convert;
+
+ // __all_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __all_of(simd_mask<_Tp, _Abi> __k)
+ {
+ const auto __kk
+ = __vector_bitcast<char>(__k._M_data)
+ | ~__vector_bitcast<char>(_Abi::template __implicit_mask<_Tp>());
+ if constexpr (sizeof(__k) == 16)
+ {
+ const auto __x = __vector_bitcast<long long>(__kk);
+ return __x[0] + __x[1] == -2;
+ }
+ else if constexpr (sizeof(__k) <= 8)
+ return __bit_cast<__int_for_sizeof_t<decltype(__kk)>>(__kk) == -1;
+ else
+ __assert_unreachable<_Tp>();
+ }
+
+ // }}}
+ // __any_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __any_of(simd_mask<_Tp, _Abi> __k)
+ {
+ const auto __kk
+ = __vector_bitcast<char>(__k._M_data)
+ & __vector_bitcast<char>(_Abi::template __implicit_mask<_Tp>());
+ if constexpr (sizeof(__k) == 16)
+ {
+ const auto __x = __vector_bitcast<long long>(__kk);
+ return (__x[0] | __x[1]) != 0;
+ }
+ else if constexpr (sizeof(__k) <= 8)
+ return __bit_cast<__int_for_sizeof_t<decltype(__kk)>>(__kk) != 0;
+ else
+ __assert_unreachable<_Tp>();
+ }
+
+ // }}}
+ // __none_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __none_of(simd_mask<_Tp, _Abi> __k)
+ {
+ const auto __kk
+ = __vector_bitcast<char>(__k._M_data)
+ & __vector_bitcast<char>(_Abi::template __implicit_mask<_Tp>());
+ if constexpr (sizeof(__k) == 16)
+ {
+ const auto __x = __vector_bitcast<long long>(__kk);
+ return (__x[0] | __x[1]) == 0;
+ }
+ else if constexpr (sizeof(__k) <= 8)
+ return __bit_cast<__int_for_sizeof_t<decltype(__kk)>>(__kk) == 0;
+ else
+ __assert_unreachable<_Tp>();
+ }
+
+ // }}}
+ // __some_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __some_of(simd_mask<_Tp, _Abi> __k)
+ {
+ if constexpr (sizeof(__k) <= 8)
+ {
+ const auto __valid
+ = __vector_bitcast<char>(_Abi::template __implicit_mask<_Tp>());
+ const auto __kk = __vector_bitcast<char>(__k._M_data) & __valid;
+ using _Up = std::make_unsigned_t<__int_for_sizeof_t<decltype(__kk)>>;
+ return __bit_cast<_Up>(__kk) != 0 && __bit_cast<_Up>(__kk ^ __valid) != 0;
+ }
+ else
+ return _Base::__some_of(__k);
+ }
+
+ // }}}
+ // __popcount {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static int __popcount(simd_mask<_Tp, _Abi> __k)
+ {
+ if constexpr (sizeof(_Tp) == 1)
+ {
+ const auto __s8 = __vector_bitcast<_SChar>(__k._M_data);
+ int8x8_t __tmp = __lo64(__s8) + __hi64z(__s8);
+ return -vpadd_s8(vpadd_s8(vpadd_s8(__tmp, int8x8_t()), int8x8_t()),
+ int8x8_t())[0];
+ }
+ else if constexpr (sizeof(_Tp) == 2)
+ {
+ const auto __s16 = __vector_bitcast<short>(__k._M_data);
+ int16x4_t __tmp = __lo64(__s16) + __hi64z(__s16);
+ return -vpadd_s16(vpadd_s16(__tmp, int16x4_t()), int16x4_t())[0];
+ }
+ else if constexpr (sizeof(_Tp) == 4)
+ {
+ const auto __s32 = __vector_bitcast<int>(__k._M_data);
+ int32x2_t __tmp = __lo64(__s32) + __hi64z(__s32);
+ return -vpadd_s32(__tmp, int32x2_t())[0];
+ }
+ else if constexpr (sizeof(_Tp) == 8)
+ {
+ static_assert(sizeof(__k) == 16);
+ const auto __s64 = __vector_bitcast<long long>(__k._M_data);
+ return -(__s64[0] + __s64[1]);
+ }
+ }
+
+ // }}}
+ // __find_first_set {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static int __find_first_set(simd_mask<_Tp, _Abi> __k)
+ {
+ // TODO: the _Base implementation is not optimal for NEON
+ return _Base::__find_first_set(__k);
+ }
+
+ // }}}
+ // __find_last_set {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static int __find_last_set(simd_mask<_Tp, _Abi> __k)
+ {
+ // TODO: the _Base implementation is not optimal for NEON
+ return _Base::__find_last_set(__k);
+ }
+
+ // }}}
+}; // }}}
+
+_GLIBCXX_SIMD_END_NAMESPACE
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_NEON_H_
+// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
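
Note on __to_bits and __popcount in the NEON hunks above: both rely on every
mask lane being either 0 or all-ones (-1). ANDing lane i with 1 << i and then
summing the lanes yields a bitmask, and negating the plain lane sum yields the
number of true lanes. The sketch below illustrates this with portable code. It
makes several assumptions: std::array stands in for a NEON vector, the nested
loop emulates repeated vpadd calls with a zero second operand, and the names
Lanes, pairwise_sum, mask_to_bits and mask_popcount are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a vector mask: each lane is 0 (false) or -1 (true).
template <std::size_t N>
using Lanes = std::array<std::int32_t, N>;

// Emulates a chain of pairwise adds: each pass adds adjacent lanes and
// halves the number of live lanes until the total sits in lane 0.
template <std::size_t N>
std::int32_t pairwise_sum(Lanes<N> v)
{
  static_assert(N != 0 && (N & (N - 1)) == 0, "lane count must be a power of 2");
  for (std::size_t width = N; width > 1; width /= 2)
    for (std::size_t j = 0; j < width / 2; ++j)
      v[j] = v[2 * j] + v[2 * j + 1];
  return v[0];
}

// Converts a vector mask to a bitmask: bit i is set iff lane i is true.
template <std::size_t N>
unsigned mask_to_bits(Lanes<N> k)
{
  static_assert(N <= 16, "sketch only covers small lane counts");
  for (std::size_t i = 0; i < N; ++i)
    k[i] &= std::int32_t(1) << i; // a true lane keeps only its own bit
  return static_cast<unsigned>(pairwise_sum(k));
}

// Counts the true lanes: every true lane contributes -1 to the sum.
template <std::size_t N>
int mask_popcount(const Lanes<N>& k)
{
  return -pairwise_sum(k);
}
```

With the mask {true, false, true, true}, written as Lanes<4>{-1, 0, -1, -1},
mask_to_bits yields 0b1101 and mask_popcount yields 3.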
diff --git a/libstdc++-v3/include/experimental/bits/simd_scalar.h b/libstdc++-v3/include/experimental/bits/simd_scalar.h
new file mode 100644
index 00000000000..fc4ffe12298
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_scalar.h
@@ -0,0 +1,877 @@
+// Simd scalar ABI specific implementations -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_SCALAR_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_SCALAR_H_
+#if __cplusplus >= 201703L
+
+#include <cmath>
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+
+// __promote_preserving_unsigned{{{
+// Work around the surprising semantics of unsigned integers of lower rank
+// than int: before an operator is applied, the operands are promoted to int,
+// so over- or underflow is UB even though the operand types were unsigned.
+template <typename _Tp>
+_GLIBCXX_SIMD_INTRINSIC constexpr decltype(auto)
+__promote_preserving_unsigned(const _Tp& __x)
+{
+ if constexpr (std::is_signed_v<decltype(+__x)> && std::is_unsigned_v<_Tp>)
+ return static_cast<unsigned int>(__x);
+ else
+ return __x;
+}
+
+// }}}
+
+struct _CommonImplScalar;
+struct _CommonImplBuiltin;
+struct _SimdImplScalar;
+struct _MaskImplScalar;
+// simd_abi::_Scalar {{{
+struct simd_abi::_Scalar
+{
+ template <typename _Tp> static constexpr size_t size = 1;
+ template <typename _Tp> static constexpr size_t _S_full_size = 1;
+ static constexpr bool _S_is_partial = false;
+ struct _IsValidAbiTag : true_type
+ {
+ };
+ template <typename _Tp> struct _IsValidSizeFor : true_type
+ {
+ };
+ template <typename _Tp> struct _IsValid : __is_vectorizable<_Tp>
+ {
+ };
+ template <typename _Tp>
+ static constexpr bool _S_is_valid_v = _IsValid<_Tp>::value;
+
+ _GLIBCXX_SIMD_INTRINSIC static constexpr bool __masked(bool __x)
+ {
+ return __x;
+ }
+
+ using _CommonImpl = _CommonImplScalar;
+ using _SimdImpl = _SimdImplScalar;
+ using _MaskImpl = _MaskImplScalar;
+
+ template <typename _Tp, bool = _S_is_valid_v<_Tp>>
+ struct __traits : _InvalidTraits
+ {
+ };
+
+ template <typename _Tp> struct __traits<_Tp, true>
+ {
+ using _IsValid = true_type;
+ using _SimdImpl = _SimdImplScalar;
+ using _MaskImpl = _MaskImplScalar;
+ using _SimdMember = _Tp;
+ using _MaskMember = bool;
+ static constexpr size_t _S_simd_align = alignof(_SimdMember);
+ static constexpr size_t _S_mask_align = alignof(_MaskMember);
+
+ // nothing the user can spell converts to/from simd/simd_mask
+ struct _SimdCastType
+ {
+ _SimdCastType() = delete;
+ };
+ struct _MaskCastType
+ {
+ _MaskCastType() = delete;
+ };
+ struct _SimdBase
+ {
+ };
+ struct _MaskBase
+ {
+ };
+ };
+};
+// }}}
+// _CommonImplScalar {{{
+struct _CommonImplScalar
+{
+ // __store {{{
+ template <typename _Flags, typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static void __store(_Tp __x, void* __addr, _Flags)
+ {
+ __builtin_memcpy(__addr, &__x, sizeof(_Tp));
+ }
+
+ // }}}
+ // __store_bool_array(_BitMask) {{{
+ template <size_t _Np, typename _Flags, bool _Sanitized>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr void
+ __store_bool_array(_BitMask<_Np, _Sanitized> __x, bool* __mem, _Flags)
+ {
+ __make_dependent_t<_Flags, _CommonImplBuiltin>::__store_bool_array(__x, __mem,
+ _Flags());
+ }
+
+ // }}}
+};
+
+// }}}
+// _SimdImplScalar {{{
+struct _SimdImplScalar
+{
+ // member types {{{2
+ using abi_type = simd_abi::scalar;
+ template <typename _Tp> using _TypeTag = _Tp*;
+
+ // broadcast {{{2
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp __broadcast(_Tp __x) noexcept
+ {
+ return __x;
+ }
+
+ // __generator {{{2
+ template <typename _Fp, typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp __generator(_Fp&& __gen,
+ _TypeTag<_Tp>)
+ {
+ return __gen(_SizeConstant<0>());
+ }
+
+ // __load {{{2
+ template <typename _Tp, typename _Up, typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __load(const _Up* __mem, _Fp,
+ _TypeTag<_Tp>) noexcept
+ {
+ return static_cast<_Tp>(__mem[0]);
+ }
+
+ // __masked_load {{{2
+ template <typename _Tp, typename _Up, typename _Fp>
+ static inline _Tp __masked_load(_Tp __merge, bool __k, const _Up* __mem,
+ _Fp) noexcept
+ {
+ if (__k)
+ __merge = static_cast<_Tp>(__mem[0]);
+ return __merge;
+ }
+
+ // __store {{{2
+ template <typename _Tp, typename _Up, typename _Fp>
+ static inline void __store(_Tp __v, _Up* __mem, _Fp, _TypeTag<_Tp>) noexcept
+ {
+ __mem[0] = static_cast<_Up>(__v);
+ }
+
+ // __masked_store {{{2
+ template <typename _Tp, typename _Up, typename _Fp>
+ static inline void __masked_store(const _Tp __v, _Up* __mem, _Fp,
+ const bool __k) noexcept
+ {
+ if (__k)
+ __mem[0] = __v;
+ }
+
+ // __negate {{{2
+ template <typename _Tp>
+ static constexpr inline bool __negate(_Tp __x) noexcept
+ {
+ return !__x;
+ }
+
+ // __reduce {{{2
+ template <typename _Tp, typename _BinaryOperation>
+ static constexpr inline _Tp __reduce(const simd<_Tp, simd_abi::scalar>& __x,
+ _BinaryOperation&)
+ {
+ return __x._M_data;
+ }
+
+ // __min, __max {{{2
+ template <typename _Tp>
+ static constexpr inline _Tp __min(const _Tp __a, const _Tp __b)
+ {
+ return std::min(__a, __b);
+ }
+
+ template <typename _Tp>
+ static constexpr inline _Tp __max(const _Tp __a, const _Tp __b)
+ {
+ return std::max(__a, __b);
+ }
+
+ // __complement {{{2
+ template <typename _Tp>
+ static constexpr inline _Tp __complement(_Tp __x) noexcept
+ {
+ return static_cast<_Tp>(~__x);
+ }
+
+ // __unary_minus {{{2
+ template <typename _Tp>
+ static constexpr inline _Tp __unary_minus(_Tp __x) noexcept
+ {
+ return static_cast<_Tp>(-__x);
+ }
+
+ // arithmetic operators {{{2
+ template <typename _Tp> static constexpr inline _Tp __plus(_Tp __x, _Tp __y)
+ {
+ return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+ + __promote_preserving_unsigned(__y));
+ }
+
+ template <typename _Tp> static constexpr inline _Tp __minus(_Tp __x, _Tp __y)
+ {
+ return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+ - __promote_preserving_unsigned(__y));
+ }
+
+ template <typename _Tp>
+ static constexpr inline _Tp __multiplies(_Tp __x, _Tp __y)
+ {
+ return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+ * __promote_preserving_unsigned(__y));
+ }
+
+ template <typename _Tp>
+ static constexpr inline _Tp __divides(_Tp __x, _Tp __y)
+ {
+ return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+ / __promote_preserving_unsigned(__y));
+ }
+
+ template <typename _Tp>
+ static constexpr inline _Tp __modulus(_Tp __x, _Tp __y)
+ {
+ return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+ % __promote_preserving_unsigned(__y));
+ }
+
+ template <typename _Tp>
+ static constexpr inline _Tp __bit_and(_Tp __x, _Tp __y)
+ {
+ if constexpr (is_floating_point_v<_Tp>)
+ {
+ using _I = __int_for_sizeof_t<_Tp>;
+ const _I __r = reinterpret_cast<const __may_alias<_I>&>(__x)
+ & reinterpret_cast<const __may_alias<_I>&>(__y);
+ return reinterpret_cast<const __may_alias<_Tp>&>(__r);
+ }
+ else
+ return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+ & __promote_preserving_unsigned(__y));
+ }
+
+ template <typename _Tp> static constexpr inline _Tp __bit_or(_Tp __x, _Tp __y)
+ {
+ if constexpr (is_floating_point_v<_Tp>)
+ {
+ using _I = __int_for_sizeof_t<_Tp>;
+ const _I __r = reinterpret_cast<const __may_alias<_I>&>(__x)
+ | reinterpret_cast<const __may_alias<_I>&>(__y);
+ return reinterpret_cast<const __may_alias<_Tp>&>(__r);
+ }
+ else
+ return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+ | __promote_preserving_unsigned(__y));
+ }
+
+ template <typename _Tp>
+ static constexpr inline _Tp __bit_xor(_Tp __x, _Tp __y)
+ {
+ if constexpr (is_floating_point_v<_Tp>)
+ {
+ using _I = __int_for_sizeof_t<_Tp>;
+ const _I __r = reinterpret_cast<const __may_alias<_I>&>(__x)
+ ^ reinterpret_cast<const __may_alias<_I>&>(__y);
+ return reinterpret_cast<const __may_alias<_Tp>&>(__r);
+ }
+ else
+ return static_cast<_Tp>(__promote_preserving_unsigned(__x)
+ ^ __promote_preserving_unsigned(__y));
+ }
+
+ template <typename _Tp>
+ static constexpr inline _Tp __bit_shift_left(_Tp __x, int __y)
+ {
+ return static_cast<_Tp>(__promote_preserving_unsigned(__x) << __y);
+ }
+
+ template <typename _Tp>
+ static constexpr inline _Tp __bit_shift_right(_Tp __x, int __y)
+ {
+ return static_cast<_Tp>(__promote_preserving_unsigned(__x) >> __y);
+ }
+
+ // math {{{2
+ // frexp, modf and copysign implemented in simd_math.h
+ template <typename _Tp> using _ST = _SimdTuple<_Tp, simd_abi::scalar>;
+
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __acos(_Tp __x)
+ {
+ return std::acos(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __asin(_Tp __x)
+ {
+ return std::asin(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __atan(_Tp __x)
+ {
+ return std::atan(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __cos(_Tp __x)
+ {
+ return std::cos(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __sin(_Tp __x)
+ {
+ return std::sin(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __tan(_Tp __x)
+ {
+ return std::tan(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __acosh(_Tp __x)
+ {
+ return std::acosh(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __asinh(_Tp __x)
+ {
+ return std::asinh(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __atanh(_Tp __x)
+ {
+ return std::atanh(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __cosh(_Tp __x)
+ {
+ return std::cosh(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __sinh(_Tp __x)
+ {
+ return std::sinh(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __tanh(_Tp __x)
+ {
+ return std::tanh(__x);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __atan2(_Tp __x, _Tp __y)
+ {
+ return std::atan2(__x, __y);
+ }
+
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __exp(_Tp __x)
+ {
+ return std::exp(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __exp2(_Tp __x)
+ {
+ return std::exp2(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __expm1(_Tp __x)
+ {
+ return std::expm1(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __log(_Tp __x)
+ {
+ return std::log(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __log10(_Tp __x)
+ {
+ return std::log10(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __log1p(_Tp __x)
+ {
+ return std::log1p(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __log2(_Tp __x)
+ {
+ return std::log2(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __logb(_Tp __x)
+ {
+ return std::logb(__x);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _ST<int> __ilogb(_Tp __x)
+ {
+ return {std::ilogb(__x)};
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __pow(_Tp __x, _Tp __y)
+ {
+ return std::pow(__x, __y);
+ }
+
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __abs(_Tp __x)
+ {
+ return std::abs(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __fabs(_Tp __x)
+ {
+ return std::fabs(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __sqrt(_Tp __x)
+ {
+ return std::sqrt(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __cbrt(_Tp __x)
+ {
+ return std::cbrt(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __erf(_Tp __x)
+ {
+ return std::erf(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __erfc(_Tp __x)
+ {
+ return std::erfc(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __lgamma(_Tp __x)
+ {
+ return std::lgamma(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __tgamma(_Tp __x)
+ {
+ return std::tgamma(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __trunc(_Tp __x)
+ {
+ return std::trunc(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __floor(_Tp __x)
+ {
+ return std::floor(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __ceil(_Tp __x)
+ {
+ return std::ceil(__x);
+ }
+
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __nearbyint(_Tp __x)
+ {
+ return std::nearbyint(__x);
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __rint(_Tp __x)
+ {
+ return std::rint(__x);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _ST<long> __lrint(_Tp __x)
+ {
+ return {std::lrint(__x)};
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _ST<long long> __llrint(_Tp __x)
+ {
+ return {std::llrint(__x)};
+ }
+ template <typename _Tp> _GLIBCXX_SIMD_INTRINSIC static _Tp __round(_Tp __x)
+ {
+ return std::round(__x);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _ST<long> __lround(_Tp __x)
+ {
+ return {std::lround(__x)};
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _ST<long long> __llround(_Tp __x)
+ {
+ return {std::llround(__x)};
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __ldexp(_Tp __x, _ST<int> __y)
+ {
+ return std::ldexp(__x, __y.first);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __scalbn(_Tp __x, _ST<int> __y)
+ {
+ return std::scalbn(__x, __y.first);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __scalbln(_Tp __x, _ST<long> __y)
+ {
+ return std::scalbln(__x, __y.first);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __fmod(_Tp __x, _Tp __y)
+ {
+ return std::fmod(__x, __y);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __remainder(_Tp __x, _Tp __y)
+ {
+ return std::remainder(__x, __y);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __nextafter(_Tp __x, _Tp __y)
+ {
+ return std::nextafter(__x, __y);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __fdim(_Tp __x, _Tp __y)
+ {
+ return std::fdim(__x, __y);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __fmax(_Tp __x, _Tp __y)
+ {
+ return std::fmax(__x, __y);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __fmin(_Tp __x, _Tp __y)
+ {
+ return std::fmin(__x, __y);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __fma(_Tp __x, _Tp __y, _Tp __z)
+ {
+ return std::fma(__x, __y, __z);
+ }
+
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __remquo(_Tp __x, _Tp __y, _ST<int>* __z)
+ {
+ return std::remquo(__x, __y, &__z->first);
+ }
+ template <typename _Tp>
+ [[deprecated]] _GLIBCXX_SIMD_INTRINSIC static _Tp __remquo(_Tp __x, _Tp __y,
+ int* __z)
+ {
+ return std::remquo(__x, __y, __z);
+ }
+
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static _ST<int> __fpclassify(_Tp __x)
+ {
+ return {std::fpclassify(__x)};
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isfinite(_Tp __x)
+ {
+ return std::isfinite(__x);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isinf(_Tp __x)
+ {
+ return std::isinf(__x);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isnan(_Tp __x)
+ {
+ return std::isnan(__x);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isnormal(_Tp __x)
+ {
+ return std::isnormal(__x);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __signbit(_Tp __x)
+ {
+ return std::signbit(__x);
+ }
+
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isgreater(_Tp __x, _Tp __y)
+ {
+ return std::isgreater(__x, __y);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isgreaterequal(_Tp __x, _Tp __y)
+ {
+ return std::isgreaterequal(__x, __y);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isless(_Tp __x, _Tp __y)
+ {
+ return std::isless(__x, __y);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __islessequal(_Tp __x, _Tp __y)
+ {
+ return std::islessequal(__x, __y);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __islessgreater(_Tp __x, _Tp __y)
+ {
+ return std::islessgreater(__x, __y);
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __isunordered(_Tp __x, _Tp __y)
+ {
+ return std::isunordered(__x, __y);
+ }
+
+ // __increment & __decrement {{{2
+ template <typename _Tp> constexpr static inline void __increment(_Tp& __x)
+ {
+ ++__x;
+ }
+ template <typename _Tp> constexpr static inline void __decrement(_Tp& __x)
+ {
+ --__x;
+ }
+
+ // compares {{{2
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __equal_to(_Tp __x, _Tp __y)
+ {
+ return __x == __y;
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __not_equal_to(_Tp __x, _Tp __y)
+ {
+ return __x != __y;
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __less(_Tp __x, _Tp __y)
+ {
+ return __x < __y;
+ }
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __less_equal(_Tp __x, _Tp __y)
+ {
+ return __x <= __y;
+ }
+
+ // smart_reference access {{{2
+ template <typename _Tp, typename _Up>
+ constexpr static void __set(_Tp& __v, [[maybe_unused]] int __i,
+ _Up&& __x) noexcept
+ {
+ _GLIBCXX_DEBUG_ASSERT(__i == 0);
+ __v = static_cast<_Up&&>(__x);
+ }
+
+ // __masked_assign {{{2
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static void
+ __masked_assign(bool __k, _Tp& __lhs, _Tp __rhs)
+ {
+ if (__k)
+ __lhs = __rhs;
+ }
+
+ // __masked_cassign {{{2
+ template <typename _Op, typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static void
+ __masked_cassign(const bool __k, _Tp& __lhs, const _Tp __rhs, _Op __op)
+ {
+ if (__k)
+ __lhs = __op(_SimdImplScalar{}, __lhs, __rhs);
+ }
+
+ // __masked_unary {{{2
+ template <template <typename> class _Op, typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static _Tp __masked_unary(const bool __k,
+ const _Tp __v)
+ {
+ return static_cast<_Tp>(__k ? _Op<_Tp>{}(__v) : __v);
+ }
+
+ // }}}2
+};
+
+// }}}
+// _MaskImplScalar {{{
+struct _MaskImplScalar
+{
+ // member types {{{
+ template <typename _Tp> using _TypeTag = _Tp*;
+
+ // }}}
+ // __broadcast {{{
+ template <typename>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr bool __broadcast(bool __x)
+ {
+ return __x;
+ }
+
+ // }}}
+ // __load {{{
+ template <typename, typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr bool __load(const bool* __mem)
+ {
+ return __mem[0];
+ }
+
+ // }}}
+ // __to_bits {{{
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SanitizedBitMask<1>
+ __to_bits(bool __x)
+ {
+ return __x;
+ }
+
+ // }}}
+ // __convert {{{
+ template <typename _Tp, bool _Sanitized>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr bool
+ __convert(_BitMask<1, _Sanitized> __x)
+ {
+ return __x[0];
+ }
+
+ template <typename _Tp, typename _Up, typename _UAbi>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr bool
+ __convert(simd_mask<_Up, _UAbi> __x)
+ {
+ return __x[0];
+ }
+
+ // }}}
+ // __from_bitmask {{{2
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+ __from_bitmask(_SanitizedBitMask<1> __bits, _TypeTag<_Tp>) noexcept
+ {
+ return __bits[0];
+ }
+
+ // __masked_load {{{2
+ template <typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+ __masked_load(bool __merge, bool __mask, const bool* __mem, _Fp) noexcept
+ {
+ if (__mask)
+ __merge = __mem[0];
+ return __merge;
+ }
+
+ // __store {{{2
+ template <typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static void __store(bool __v, bool* __mem,
+ _Fp) noexcept
+ {
+ __mem[0] = __v;
+ }
+
+ // __masked_store {{{2
+ template <typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_store(const bool __v, bool* __mem, _Fp, const bool __k) noexcept
+ {
+ if (__k)
+ __mem[0] = __v;
+ }
+
+ // logical and bitwise operators {{{2
+ static constexpr bool __logical_and(bool __x, bool __y) { return __x && __y; }
+ static constexpr bool __logical_or(bool __x, bool __y) { return __x || __y; }
+ static constexpr bool __bit_not(bool __x) { return !__x; }
+ static constexpr bool __bit_and(bool __x, bool __y) { return __x && __y; }
+ static constexpr bool __bit_or(bool __x, bool __y) { return __x || __y; }
+ static constexpr bool __bit_xor(bool __x, bool __y) { return __x != __y; }
+
+ // smart_reference access {{{2
+ constexpr static void __set(bool& __k, [[maybe_unused]] int __i,
+ bool __x) noexcept
+ {
+ _GLIBCXX_DEBUG_ASSERT(__i == 0);
+ __k = __x;
+ }
+
+ // __masked_assign {{{2
+ _GLIBCXX_SIMD_INTRINSIC static void __masked_assign(bool __k, bool& __lhs,
+ bool __rhs)
+ {
+ if (__k)
+ __lhs = __rhs;
+ }
+
+ // }}}2
+ // __all_of {{{
+ template <typename _Tp, typename _Abi>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+ __all_of(simd_mask<_Tp, _Abi> __k)
+ {
+ return __k._M_data;
+ }
+
+ // }}}
+ // __any_of {{{
+ template <typename _Tp, typename _Abi>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+ __any_of(simd_mask<_Tp, _Abi> __k)
+ {
+ return __k._M_data;
+ }
+
+ // }}}
+ // __none_of {{{
+ template <typename _Tp, typename _Abi>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+ __none_of(simd_mask<_Tp, _Abi> __k)
+ {
+ return !__k._M_data;
+ }
+
+ // }}}
+ // __some_of {{{
+ template <typename _Tp, typename _Abi>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static bool __some_of(simd_mask<_Tp, _Abi>)
+ {
+ return false; // a single-element mask is never partially set
+ }
+
+ // }}}
+ // __popcount {{{
+ template <typename _Tp, typename _Abi>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static int
+ __popcount(simd_mask<_Tp, _Abi> __k)
+ {
+ return __k._M_data;
+ }
+
+ // }}}
+ // __find_first_set {{{
+ template <typename _Tp, typename _Abi>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static int
+ __find_first_set(simd_mask<_Tp, _Abi>)
+ {
+ return 0;
+ }
+
+ // }}}
+ // __find_last_set {{{
+ template <typename _Tp, typename _Abi>
+ _GLIBCXX_SIMD_INTRINSIC constexpr static int
+ __find_last_set(simd_mask<_Tp, _Abi>)
+ {
+ return 0;
+ }
+
+ // }}}
+};
+
+// }}}
+
+_GLIBCXX_SIMD_END_NAMESPACE
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_SCALAR_H_
+
+// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
diff --git a/libstdc++-v3/include/experimental/bits/simd_x86.h b/libstdc++-v3/include/experimental/bits/simd_x86.h
new file mode 100644
index 00000000000..4e15aac8b62
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_x86.h
@@ -0,0 +1,5037 @@
+// Simd x86 specific implementations -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_X86_H_
+#define _GLIBCXX_EXPERIMENTAL_SIMD_X86_H_
+
+#if __cplusplus >= 201703L
+
+#if !_GLIBCXX_SIMD_X86INTRIN
+#error \
+ "simd_x86.h may only be included when MMX or SSE on x86(_64) are available"
+#endif
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+
+// __interleave128_lo {{{
+template <typename _Ap, typename _B, typename _Tp = std::common_type_t<_Ap, _B>,
+ typename _Trait = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
+__interleave128_lo(const _Ap& __av, const _B& __bv)
+{
+ const _Tp __a(__av);
+ const _Tp __b(__bv);
+ if constexpr (sizeof(_Tp) == 16 && _Trait::_S_width == 2)
+ return _Tp{__a[0], __b[0]};
+ else if constexpr (sizeof(_Tp) == 16 && _Trait::_S_width == 4)
+ return _Tp{__a[0], __b[0], __a[1], __b[1]};
+ else if constexpr (sizeof(_Tp) == 16 && _Trait::_S_width == 8)
+ return _Tp{__a[0], __b[0], __a[1], __b[1], __a[2], __b[2], __a[3], __b[3]};
+ else if constexpr (sizeof(_Tp) == 16 && _Trait::_S_width == 16)
+ return _Tp{__a[0], __b[0], __a[1], __b[1], __a[2], __b[2], __a[3], __b[3],
+ __a[4], __b[4], __a[5], __b[5], __a[6], __b[6], __a[7], __b[7]};
+ else if constexpr (sizeof(_Tp) == 32 && _Trait::_S_width == 4)
+ return _Tp{__a[0], __b[0], __a[2], __b[2]};
+ else if constexpr (sizeof(_Tp) == 32 && _Trait::_S_width == 8)
+ return _Tp{__a[0], __b[0], __a[1], __b[1], __a[4], __b[4], __a[5], __b[5]};
+ else if constexpr (sizeof(_Tp) == 32 && _Trait::_S_width == 16)
+ return _Tp{__a[0], __b[0], __a[1], __b[1], __a[2], __b[2],
+ __a[3], __b[3], __a[8], __b[8], __a[9], __b[9],
+ __a[10], __b[10], __a[11], __b[11]};
+ else if constexpr (sizeof(_Tp) == 32 && _Trait::_S_width == 32)
+ return _Tp{__a[0], __b[0], __a[1], __b[1], __a[2], __b[2], __a[3],
+ __b[3], __a[4], __b[4], __a[5], __b[5], __a[6], __b[6],
+ __a[7], __b[7], __a[16], __b[16], __a[17], __b[17], __a[18],
+ __b[18], __a[19], __b[19], __a[20], __b[20], __a[21], __b[21],
+ __a[22], __b[22], __a[23], __b[23]};
+ else if constexpr (sizeof(_Tp) == 64 && _Trait::_S_width == 8)
+ return _Tp{__a[0], __b[0], __a[2], __b[2], __a[4], __b[4], __a[6], __b[6]};
+ else if constexpr (sizeof(_Tp) == 64 && _Trait::_S_width == 16)
+ return _Tp{__a[0], __b[0], __a[1], __b[1], __a[4], __b[4],
+ __a[5], __b[5], __a[8], __b[8], __a[9], __b[9],
+ __a[12], __b[12], __a[13], __b[13]};
+ else if constexpr (sizeof(_Tp) == 64 && _Trait::_S_width == 32)
+ return _Tp{__a[0], __b[0], __a[1], __b[1], __a[2], __b[2], __a[3],
+ __b[3], __a[8], __b[8], __a[9], __b[9], __a[10], __b[10],
+ __a[11], __b[11], __a[16], __b[16], __a[17], __b[17], __a[18],
+ __b[18], __a[19], __b[19], __a[24], __b[24], __a[25], __b[25],
+ __a[26], __b[26], __a[27], __b[27]};
+ else if constexpr (sizeof(_Tp) == 64 && _Trait::_S_width == 64)
+ return _Tp{__a[0], __b[0], __a[1], __b[1], __a[2], __b[2], __a[3],
+ __b[3], __a[4], __b[4], __a[5], __b[5], __a[6], __b[6],
+ __a[7], __b[7], __a[16], __b[16], __a[17], __b[17], __a[18],
+ __b[18], __a[19], __b[19], __a[20], __b[20], __a[21], __b[21],
+ __a[22], __b[22], __a[23], __b[23], __a[32], __b[32], __a[33],
+ __b[33], __a[34], __b[34], __a[35], __b[35], __a[36], __b[36],
+ __a[37], __b[37], __a[38], __b[38], __a[39], __b[39], __a[48],
+ __b[48], __a[49], __b[49], __a[50], __b[50], __a[51], __b[51],
+ __a[52], __b[52], __a[53], __b[53], __a[54], __b[54], __a[55],
+ __b[55]};
+ else
+ __assert_unreachable<_Tp>();
+}
+
+// }}}
+// __is_zero {{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC constexpr bool
+__is_zero(_Tp __a)
+{
+ if (!__builtin_is_constant_evaluated())
+ {
+ if constexpr (__have_avx)
+ {
+ if constexpr (_TVT::template __is<float, 8>)
+ return _mm256_testz_ps(__a, __a);
+ else if constexpr (_TVT::template __is<double, 4>)
+ return _mm256_testz_pd(__a, __a);
+ else if constexpr (sizeof(_Tp) == 32)
+ return _mm256_testz_si256(__to_intrin(__a), __to_intrin(__a));
+ else if constexpr (_TVT::template __is<float>)
+ return _mm_testz_ps(__to_intrin(__a), __to_intrin(__a));
+ else if constexpr (_TVT::template __is<double, 2>)
+ return _mm_testz_pd(__a, __a);
+ else
+ return _mm_testz_si128(__to_intrin(__a), __to_intrin(__a));
+ }
+ else if constexpr (__have_sse4_1)
+ return _mm_testz_si128(__intrin_bitcast<__m128i>(__a),
+ __intrin_bitcast<__m128i>(__a));
+ }
+ else if constexpr (sizeof(_Tp) <= 8)
+ return reinterpret_cast<__int_for_sizeof_t<_Tp>>(__a) == 0;
+ else
+ {
+ const auto __b = __vector_bitcast<_LLong>(__a);
+ if constexpr (sizeof(__b) == 16)
+ return (__b[0] | __b[1]) == 0;
+ else if constexpr (sizeof(__b) == 32)
+ return __is_zero(__lo128(__b) | __hi128(__b));
+ else if constexpr (sizeof(__b) == 64)
+ return __is_zero(__lo256(__b) | __hi256(__b));
+ else
+ __assert_unreachable<_Tp>();
+ }
+}
+// }}}
+// __movemask {{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST int
+__movemask(_Tp __a)
+{
+ if constexpr (sizeof(_Tp) == 32)
+ {
+ if constexpr (_TVT::template __is<float>)
+ return _mm256_movemask_ps(__to_intrin(__a));
+ else if constexpr (_TVT::template __is<double>)
+ return _mm256_movemask_pd(__to_intrin(__a));
+ else
+ return _mm256_movemask_epi8(__to_intrin(__a));
+ }
+ else if constexpr (_TVT::template __is<float>)
+ return _mm_movemask_ps(__to_intrin(__a));
+ else if constexpr (_TVT::template __is<double>)
+ return _mm_movemask_pd(__to_intrin(__a));
+ else
+ return _mm_movemask_epi8(__to_intrin(__a));
+}
+
+// }}}
+// __testz {{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST constexpr int
+__testz(_Tp __a, typename _TVT::type __b)
+{
+ if (!__builtin_is_constant_evaluated())
+ {
+ if constexpr (sizeof(_Tp) == 32)
+ {
+ if constexpr (_TVT::template __is<float>)
+ return _mm256_testz_ps(__to_intrin(__a), __to_intrin(__b));
+ else if constexpr (_TVT::template __is<double>)
+ return _mm256_testz_pd(__to_intrin(__a), __to_intrin(__b));
+ else
+ return _mm256_testz_si256(__to_intrin(__a), __to_intrin(__b));
+ }
+ else if constexpr (_TVT::template __is<float> && __have_avx)
+ return _mm_testz_ps(__to_intrin(__a), __to_intrin(__b));
+ else if constexpr (_TVT::template __is<double> && __have_avx)
+ return _mm_testz_pd(__to_intrin(__a), __to_intrin(__b));
+ else if constexpr (__have_sse4_1)
+ return _mm_testz_si128(__intrin_bitcast<__m128i>(__to_intrin(__a)),
+ __intrin_bitcast<__m128i>(__to_intrin(__b)));
+ else
+ return __movemask(0 == __and(__a, __b)) != 0;
+ }
+ else
+ return __is_zero(__and(__a, __b));
+}
+
+// }}}
+// __testc {{{
+// requires SSE4.1 or above
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST constexpr int
+__testc(_Tp __a, typename _TVT::type __b)
+{
+ if (__builtin_is_constant_evaluated())
+ return __is_zero(__andnot(__a, __b));
+
+ if constexpr (sizeof(_Tp) == 32)
+ {
+ if constexpr (_TVT::template __is<float>)
+ return _mm256_testc_ps(__a, __b);
+ else if constexpr (_TVT::template __is<double>)
+ return _mm256_testc_pd(__a, __b);
+ else
+ return _mm256_testc_si256(__to_intrin(__a), __to_intrin(__b));
+ }
+ else if constexpr (_TVT::template __is<float> && __have_avx)
+ return _mm_testc_ps(__to_intrin(__a), __to_intrin(__b));
+ else if constexpr (_TVT::template __is<double> && __have_avx)
+ return _mm_testc_pd(__to_intrin(__a), __to_intrin(__b));
+ else
+ {
+ static_assert(is_same_v<_Tp, _Tp> && __have_sse4_1);
+ return _mm_testc_si128(__intrin_bitcast<__m128i>(__to_intrin(__a)),
+ __intrin_bitcast<__m128i>(__to_intrin(__b)));
+ }
+}
+
+// }}}
+// __testnzc {{{
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST constexpr int
+__testnzc(_Tp __a, typename _TVT::type __b)
+{
+ if (!__builtin_is_constant_evaluated())
+ {
+ if constexpr (sizeof(_Tp) == 32)
+ {
+ if constexpr (_TVT::template __is<float>)
+ return _mm256_testnzc_ps(__a, __b);
+ else if constexpr (_TVT::template __is<double>)
+ return _mm256_testnzc_pd(__a, __b);
+ else
+ return _mm256_testnzc_si256(__to_intrin(__a), __to_intrin(__b));
+ }
+ else if constexpr (_TVT::template __is<float> && __have_avx)
+ return _mm_testnzc_ps(__to_intrin(__a), __to_intrin(__b));
+ else if constexpr (_TVT::template __is<double> && __have_avx)
+ return _mm_testnzc_pd(__to_intrin(__a), __to_intrin(__b));
+ else if constexpr (__have_sse4_1)
+ return _mm_testnzc_si128(__intrin_bitcast<__m128i>(__to_intrin(__a)),
+ __intrin_bitcast<__m128i>(__to_intrin(__b)));
+ else
+ return __movemask(0 == __and(__a, __b)) == 0
+ && __movemask(0 == __andnot(__a, __b)) == 0;
+ }
+ else
+ return !(__is_zero(__and(__a, __b)) || __is_zero(__andnot(__a, __b)));
+}
+
+// }}}
+// __xzyw {{{
+// Shuffles the complete vector, swapping the inner two quarters. Often useful
+// on AVX to fix up the result of a shuffle.
+template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+_GLIBCXX_SIMD_INTRINSIC _Tp
+__xzyw(_Tp __a)
+{
+ if constexpr (sizeof(_Tp) == 16)
+ {
+ const auto __x = __vector_bitcast<conditional_t<
+ is_floating_point_v<typename _TVT::value_type>, float, int>>(__a);
+ return reinterpret_cast<_Tp>(
+ decltype(__x){__x[0], __x[2], __x[1], __x[3]});
+ }
+ else if constexpr (sizeof(_Tp) == 32)
+ {
+ const auto __x = __vector_bitcast<conditional_t<
+ is_floating_point_v<typename _TVT::value_type>, double, _LLong>>(__a);
+ return reinterpret_cast<_Tp>(
+ decltype(__x){__x[0], __x[2], __x[1], __x[3]});
+ }
+ else if constexpr (sizeof(_Tp) == 64)
+ {
+ const auto __x = __vector_bitcast<conditional_t<
+ is_floating_point_v<typename _TVT::value_type>, double, _LLong>>(__a);
+ return reinterpret_cast<_Tp>(decltype(
+ __x){__x[0], __x[1], __x[4], __x[5], __x[2], __x[3], __x[6], __x[7]});
+ }
+ else
+ __assert_unreachable<_Tp>();
+}
+
+// }}}
+
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR85048
+#include "simd_x86_conversions.h"
+#endif
+
+// ISA & type detection {{{
+template <typename _Tp, size_t _Np>
+constexpr bool
+__is_sse_ps()
+{
+ return __have_sse
+ && std::is_same_v<_Tp,
+ float> && sizeof(__intrinsic_type_t<_Tp, _Np>) == 16;
+}
+template <typename _Tp, size_t _Np>
+constexpr bool
+__is_sse_pd()
+{
+ return __have_sse2
+ && std::is_same_v<
+ _Tp, double> && sizeof(__intrinsic_type_t<_Tp, _Np>) == 16;
+}
+template <typename _Tp, size_t _Np>
+constexpr bool
+__is_avx_ps()
+{
+ return __have_avx
+ && std::is_same_v<_Tp,
+ float> && sizeof(__intrinsic_type_t<_Tp, _Np>) == 32;
+}
+template <typename _Tp, size_t _Np>
+constexpr bool
+__is_avx_pd()
+{
+ return __have_avx
+ && std::is_same_v<
+ _Tp, double> && sizeof(__intrinsic_type_t<_Tp, _Np>) == 32;
+}
+template <typename _Tp, size_t _Np>
+constexpr bool
+__is_avx512_ps()
+{
+ return __have_avx512f
+ && std::is_same_v<_Tp,
+ float> && sizeof(__intrinsic_type_t<_Tp, _Np>) == 64;
+}
+template <typename _Tp, size_t _Np>
+constexpr bool
+__is_avx512_pd()
+{
+ return __have_avx512f
+ && std::is_same_v<
+ _Tp, double> && sizeof(__intrinsic_type_t<_Tp, _Np>) == 64;
+}
+
+// }}}
+struct _MaskImplX86Mixin;
+// _CommonImplX86 {{{
+struct _CommonImplX86 : _CommonImplBuiltin
+{
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR85048
+ // __converts_via_decomposition {{{
+ template <typename _From, typename _To, size_t _ToSize>
+ static constexpr bool __converts_via_decomposition()
+ {
+ if constexpr (is_integral_v<
+ _From> && is_integral_v<_To> && sizeof(_From) == 8
+ && _ToSize == 16)
+ return (sizeof(_To) == 2 && !__have_ssse3)
+ || (sizeof(_To) == 1 && !__have_avx512f);
+ else if constexpr (is_floating_point_v<_From> && is_integral_v<_To>)
+ return ((sizeof(_From) == 4 || sizeof(_From) == 8) && sizeof(_To) == 8
+ && !__have_avx512dq)
+ || (sizeof(_From) == 8 && sizeof(_To) == 4 && !__have_sse4_1
+ && _ToSize == 16);
+ else if constexpr (
+ is_integral_v<_From> && is_floating_point_v<_To> && sizeof(_From) == 8
+ && !__have_avx512dq)
+ return (sizeof(_To) == 4 && _ToSize == 16)
+ || (sizeof(_To) == 8 && _ToSize < 64);
+ else
+ return false;
+ }
+
+ template <typename _From, typename _To, size_t _ToSize>
+ static inline constexpr bool __converts_via_decomposition_v
+ = __converts_via_decomposition<_From, _To, _ToSize>();
+
+ // }}}
+#endif
+ // __store {{{
+ using _CommonImplBuiltin::__store;
+
+ template <typename _Flags, typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static void __store(_SimdWrapper<_Tp, _Np> __x,
+ void* __addr, _Flags)
+ {
+ constexpr size_t _Bytes = _Np * sizeof(_Tp);
+
+ if constexpr ((_Bytes & (_Bytes - 1)) != 0 && __have_avx512bw_vl)
+ {
+ const auto __v = __to_intrin(__x);
+ if constexpr (std::is_same_v<_Flags, vector_aligned_tag>)
+ __addr
+ = __builtin_assume_aligned(__addr, alignof(_SimdWrapper<_Tp, _Np>));
+ else if constexpr (!std::is_same_v<_Flags, element_aligned_tag>)
+ __addr = __builtin_assume_aligned(__addr, _Flags::_S_alignment);
+
+ if constexpr (_Bytes & 1)
+ {
+ if constexpr (_Bytes < 16)
+ _mm_mask_storeu_epi8(__addr, 0xffffu >> (16 - _Bytes),
+ __intrin_bitcast<__m128i>(__v));
+ else if constexpr (_Bytes < 32)
+ _mm256_mask_storeu_epi8(__addr, 0xffffffffu >> (32 - _Bytes),
+ __intrin_bitcast<__m256i>(__v));
+ else
+ _mm512_mask_storeu_epi8(__addr,
+ 0xffffffffffffffffull >> (64 - _Bytes),
+ __intrin_bitcast<__m512i>(__v));
+ }
+ else if constexpr (_Bytes & 2)
+ {
+ if constexpr (_Bytes < 16)
+ _mm_mask_storeu_epi16(__addr, 0xffu >> (8 - _Bytes / 2),
+ __intrin_bitcast<__m128i>(__v));
+ else if constexpr (_Bytes < 32)
+ _mm256_mask_storeu_epi16(__addr, 0xffffu >> (16 - _Bytes / 2),
+ __intrin_bitcast<__m256i>(__v));
+ else
+ _mm512_mask_storeu_epi16(__addr,
+ 0xffffffffull >> (32 - _Bytes / 2),
+ __intrin_bitcast<__m512i>(__v));
+ }
+ else if constexpr (_Bytes & 4)
+ {
+ if constexpr (_Bytes < 16)
+ _mm_mask_storeu_epi32(__addr, 0xfu >> (4 - _Bytes / 4),
+ __intrin_bitcast<__m128i>(__v));
+ else if constexpr (_Bytes < 32)
+ _mm256_mask_storeu_epi32(__addr, 0xffu >> (8 - _Bytes / 4),
+ __intrin_bitcast<__m256i>(__v));
+ else
+ _mm512_mask_storeu_epi32(__addr, 0xffffull >> (16 - _Bytes / 4),
+ __intrin_bitcast<__m512i>(__v));
+ }
+ else
+ {
+ static_assert(
+ _Bytes > 16,
+ "_Bytes < 16 && (_Bytes & 7) == 0 && (_Bytes & (_Bytes "
+ "- 1)) != 0 is impossible");
+ if constexpr (_Bytes < 32)
+ _mm256_mask_storeu_epi64(__addr, 0xfu >> (4 - _Bytes / 8),
+ __intrin_bitcast<__m256i>(__v));
+ else
+ _mm512_mask_storeu_epi64(__addr, 0xffull >> (8 - _Bytes / 8),
+ __intrin_bitcast<__m512i>(__v));
+ }
+ }
+ else
+ _CommonImplBuiltin::__store(__x, __addr, _Flags());
+ }
+
+ // }}}
+ // __store_bool_array(_BitMask) {{{
+ template <size_t _Np, typename _Flags, bool _Sanitized>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr void
+ __store_bool_array(const _BitMask<_Np, _Sanitized> __x, bool* __mem, _Flags)
+ {
+ if constexpr (__have_avx512bw_vl) // don't care for BW w/o VL
+ __store<_Np>(1 & __vector_bitcast<_UChar, _Np>([=]() constexpr {
+ if constexpr (_Np <= 16)
+ return _mm_movm_epi8(__x._M_to_bits());
+ else if constexpr (_Np <= 32)
+ return _mm256_movm_epi8(__x._M_to_bits());
+ else if constexpr (_Np <= 64)
+ return _mm512_movm_epi8(__x._M_to_bits());
+ else
+ __assert_unreachable<_SizeConstant<_Np>>();
+ }()),
+ __mem, _Flags());
+ else if constexpr (__have_bmi2)
+ {
+ if constexpr (_Np <= 4)
+ __store<_Np>(_pdep_u32(__x._M_to_bits(), 0x01010101U), __mem,
+ _Flags());
+ else
+ __execute_n_times<__div_roundup(_Np, sizeof(size_t))>([&](auto __i) {
+ constexpr size_t __offset = __i * sizeof(size_t);
+ constexpr int __todo = std::min(sizeof(size_t), _Np - __offset);
+ if constexpr (__todo == 1)
+ __mem[__offset] = __x[__offset];
+ else
+ {
+ const auto __bools =
+#ifdef __x86_64__
+ _pdep_u64(__x.template _M_extract<__offset>().to_ullong(),
+ 0x0101010101010101ULL);
+#else // __x86_64__
+ _pdep_u32(__x.template _M_extract<__offset>()._M_to_bits(),
+ 0x01010101U);
+#endif // __x86_64__
+ __store<__todo>(__bools, __mem + __offset, _Flags());
+ }
+ });
+ }
+ else if constexpr (__have_sse2 && _Np > 7)
+ __execute_n_times<__div_roundup(_Np, 16)>([&](auto __i) {
+ constexpr int __offset = __i * 16;
+ constexpr int __todo = std::min(16, int(_Np) - __offset);
+ const int __bits = __x.template _M_extract<__offset>()._M_to_bits();
+ __vector_type16_t<_UChar> __bools;
+ if constexpr (__have_avx512f)
+ {
+ auto __as32bits
+ = _mm512_maskz_mov_epi32(__bits,
+ __to_intrin(__vector_broadcast<16>(1)));
+ auto __as16bits = __xzyw(
+ _mm256_packs_epi32(__lo256(__as32bits),
+ __todo > 8 ? __hi256(__as32bits) : __m256i()));
+ __bools = __vector_bitcast<_UChar>(
+ _mm_packs_epi16(__lo128(__as16bits), __hi128(__as16bits)));
+ }
+ else
+ {
+ using _V = __vector_type_t<_UChar, 16>;
+ auto __tmp = _mm_cvtsi32_si128(__bits);
+ __tmp = _mm_unpacklo_epi8(__tmp, __tmp);
+ __tmp = _mm_unpacklo_epi16(__tmp, __tmp);
+ __tmp = _mm_unpacklo_epi32(__tmp, __tmp);
+ _V __tmp2 = reinterpret_cast<_V>(__tmp);
+ __tmp2 &= _V{1, 2, 4, 8, 16, 32, 64, 128,
+ 1, 2, 4, 8, 16, 32, 64, 128}; // mask bit index
+ __bools = (__tmp2 == 0) + 1; // 0xff -> 0x00 | 0x00 -> 0x01
+ }
+ __store<__todo>(__bools, __mem + __offset, _Flags());
+ });
+ else
+ _CommonImplBuiltin::__store_bool_array(__x, __mem, _Flags());
+ }
+
+ // }}}
+ // _S_blend_avx512 {{{
+ // Returns: __k ? __b : __a
+ // TODO: reverse __a and __b to match COND_EXPR
+ // Requires: _TV to be a __vector_type_t whose value_type matches the
+ // bitmask __k
+ template <typename _Kp, typename _TV>
+ _GLIBCXX_SIMD_INTRINSIC static _TV
+ _S_blend_avx512(const _Kp __k, const _TV __a, const _TV __b) noexcept
+ {
+ static_assert(__is_vector_type_v<_TV>);
+ using _Tp = typename _VectorTraits<_TV>::value_type;
+ static_assert(sizeof(_TV) >= 16);
+ static_assert(sizeof(_Tp) <= 8);
+ using _IntT = conditional_t<(sizeof(_Tp) > 2),
+ conditional_t<sizeof(_Tp) == 4, int, long long>,
+ conditional_t<sizeof(_Tp) == 1, char, short>>;
+ [[maybe_unused]] const auto __aa = __vector_bitcast<_IntT>(__a);
+ [[maybe_unused]] const auto __bb = __vector_bitcast<_IntT>(__b);
+ if constexpr (sizeof(_TV) == 64)
+ {
+ if constexpr (sizeof(_Tp) == 1)
+ return reinterpret_cast<_TV>(
+ __builtin_ia32_blendmb_512_mask(__aa, __bb, __k));
+ else if constexpr (sizeof(_Tp) == 2)
+ return reinterpret_cast<_TV>(
+ __builtin_ia32_blendmw_512_mask(__aa, __bb, __k));
+ else if constexpr (sizeof(_Tp) == 4 && is_floating_point_v<_Tp>)
+ return __builtin_ia32_blendmps_512_mask(__a, __b, __k);
+ else if constexpr (sizeof(_Tp) == 4)
+ return reinterpret_cast<_TV>(
+ __builtin_ia32_blendmd_512_mask(__aa, __bb, __k));
+ else if constexpr (sizeof(_Tp) == 8 && is_floating_point_v<_Tp>)
+ return __builtin_ia32_blendmpd_512_mask(__a, __b, __k);
+ else if constexpr (sizeof(_Tp) == 8)
+ return reinterpret_cast<_TV>(
+ __builtin_ia32_blendmq_512_mask(__aa, __bb, __k));
+ }
+ else if constexpr (sizeof(_TV) == 32)
+ {
+ if constexpr (sizeof(_Tp) == 1)
+ return reinterpret_cast<_TV>(
+ __builtin_ia32_blendmb_256_mask(__aa, __bb, __k));
+ else if constexpr (sizeof(_Tp) == 2)
+ return reinterpret_cast<_TV>(
+ __builtin_ia32_blendmw_256_mask(__aa, __bb, __k));
+ else if constexpr (sizeof(_Tp) == 4 && is_floating_point_v<_Tp>)
+ return __builtin_ia32_blendmps_256_mask(__a, __b, __k);
+ else if constexpr (sizeof(_Tp) == 4)
+ return reinterpret_cast<_TV>(
+ __builtin_ia32_blendmd_256_mask(__aa, __bb, __k));
+ else if constexpr (sizeof(_Tp) == 8 && is_floating_point_v<_Tp>)
+ return __builtin_ia32_blendmpd_256_mask(__a, __b, __k);
+ else if constexpr (sizeof(_Tp) == 8)
+ return reinterpret_cast<_TV>(
+ __builtin_ia32_blendmq_256_mask(__aa, __bb, __k));
+ }
+ else if constexpr (sizeof(_TV) == 16)
+ {
+ if constexpr (sizeof(_Tp) == 1)
+ return reinterpret_cast<_TV>(
+ __builtin_ia32_blendmb_128_mask(__aa, __bb, __k));
+ else if constexpr (sizeof(_Tp) == 2)
+ return reinterpret_cast<_TV>(
+ __builtin_ia32_blendmw_128_mask(__aa, __bb, __k));
+ else if constexpr (sizeof(_Tp) == 4 && is_floating_point_v<_Tp>)
+ return __builtin_ia32_blendmps_128_mask(__a, __b, __k);
+ else if constexpr (sizeof(_Tp) == 4)
+ return reinterpret_cast<_TV>(
+ __builtin_ia32_blendmd_128_mask(__aa, __bb, __k));
+ else if constexpr (sizeof(_Tp) == 8 && is_floating_point_v<_Tp>)
+ return __builtin_ia32_blendmpd_128_mask(__a, __b, __k);
+ else if constexpr (sizeof(_Tp) == 8)
+ return reinterpret_cast<_TV>(
+ __builtin_ia32_blendmq_128_mask(__aa, __bb, __k));
+ }
+ }
+
+ // }}}
+ // _S_blend_intrin {{{
+ // Returns: __k ? __b : __a
+ // TODO: reverse __a and __b to match COND_EXPR
+ // Requires: _Tp to be an intrinsic type (integers blend per byte) and 16/32
+ // Bytes wide
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp _S_blend_intrin(_Tp __k, _Tp __a,
+ _Tp __b) noexcept
+ {
+ static_assert(is_same_v<decltype(__to_intrin(__a)), _Tp>);
+ constexpr struct
+ {
+ _GLIBCXX_SIMD_INTRINSIC __m128 operator()(__m128 __a, __m128 __b,
+ __m128 __k) const noexcept
+ {
+ return __builtin_ia32_blendvps(__a, __b, __k);
+ }
+ _GLIBCXX_SIMD_INTRINSIC __m128d operator()(__m128d __a, __m128d __b,
+ __m128d __k) const noexcept
+ {
+ return __builtin_ia32_blendvpd(__a, __b, __k);
+ }
+ _GLIBCXX_SIMD_INTRINSIC __m128i operator()(__m128i __a, __m128i __b,
+ __m128i __k) const noexcept
+ {
+ return reinterpret_cast<__m128i>(
+ __builtin_ia32_pblendvb128(reinterpret_cast<__v16qi>(__a),
+ reinterpret_cast<__v16qi>(__b),
+ reinterpret_cast<__v16qi>(__k)));
+ }
+ _GLIBCXX_SIMD_INTRINSIC __m256 operator()(__m256 __a, __m256 __b,
+ __m256 __k) const noexcept
+ {
+ return __builtin_ia32_blendvps256(__a, __b, __k);
+ }
+ _GLIBCXX_SIMD_INTRINSIC __m256d operator()(__m256d __a, __m256d __b,
+ __m256d __k) const noexcept
+ {
+ return __builtin_ia32_blendvpd256(__a, __b, __k);
+ }
+ _GLIBCXX_SIMD_INTRINSIC __m256i operator()(__m256i __a, __m256i __b,
+ __m256i __k) const noexcept
+ {
+ return reinterpret_cast<__m256i>(
+ __builtin_ia32_pblendvb256(reinterpret_cast<__v32qi>(__a),
+ reinterpret_cast<__v32qi>(__b),
+ reinterpret_cast<__v32qi>(__k)));
+ }
+ } __eval;
+ return __eval(__a, __b, __k);
+ }
+
+ // }}}
+ // _S_blend {{{
+ // Returns: __k ? __at1 : __at0
+ // TODO: reverse __at0 and __at1 to match COND_EXPR
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ _S_blend(_SimdWrapper<bool, _Np> __k, _SimdWrapper<_Tp, _Np> __at0,
+ _SimdWrapper<_Tp, _Np> __at1)
+ {
+ static_assert(is_same_v<_Tp, _Tp> && __have_avx512f);
+ if (__k._M_is_constprop() && __at0._M_is_constprop()
+ && __at1._M_is_constprop())
+ return __generate_from_n_evaluations<_Np, __vector_type_t<_Tp, _Np>>([&](
+ auto __i) constexpr { return __k[__i] ? __at1[__i] : __at0[__i]; });
+ else if constexpr (sizeof(__at0) == 64
+ || (__have_avx512vl && sizeof(__at0) >= 16))
+ return _S_blend_avx512(__k._M_data, __at0._M_data, __at1._M_data);
+ else
+ {
+ static_assert((__have_avx512vl && sizeof(__at0) < 16)
+ || !__have_avx512vl);
+ constexpr size_t __size = (__have_avx512vl ? 16 : 64) / sizeof(_Tp);
+ return __vector_bitcast<_Tp, _Np>(
+ _S_blend_avx512(__k._M_data, __vector_bitcast<_Tp, __size>(__at0),
+ __vector_bitcast<_Tp, __size>(__at1)));
+ }
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ _S_blend(_SimdWrapper<_Tp, _Np> __k, _SimdWrapper<_Tp, _Np> __at0,
+ _SimdWrapper<_Tp, _Np> __at1)
+ {
+ if (__builtin_is_constant_evaluated()
+ || (__k._M_is_constprop() && __at0._M_is_constprop()
+ && __at1._M_is_constprop()))
+ {
+ auto __r = __or(__andnot(__k, __at0), __and(__k, __at1));
+ if (__r._M_is_constprop())
+ return __r;
+ }
+ if constexpr (((__have_avx512f && sizeof(__at0) == 64)
+ || __have_avx512vl)
+ && (sizeof(_Tp) >= 4 || __have_avx512bw))
+ // convert to bitmask and call overload above
+ return _S_blend(_SimdWrapper<bool, _Np>(
+ __make_dependent_t<_Tp, _MaskImplX86Mixin>::__to_bits(
+ __k)
+ ._M_to_bits()),
+ __at0, __at1);
+ else
+ {
+ // GCC does not assume __k to be a mask, so using the builtin
+ // conditional operator introduces an extra compare against 0 before
+ // blending. Call the intrinsic directly instead.
+ if constexpr (__have_sse4_1)
+ return _S_blend_intrin(__to_intrin(__k), __to_intrin(__at0),
+ __to_intrin(__at1));
+ else
+ return __or(__andnot(__k, __at0), __and(__k, __at1));
+ }
+ }
+
+ // }}}
+};
+
+// }}}
+// _SimdImplX86 {{{
+template <typename _Abi> struct _SimdImplX86 : _SimdImplBuiltin<_Abi>
+{
+ using _Base = _SimdImplBuiltin<_Abi>;
+ template <typename _Tp>
+ using _MaskMember = typename _Base::template _MaskMember<_Tp>;
+ template <typename _Tp>
+ static constexpr size_t _S_full_size = _Abi::template _S_full_size<_Tp>;
+ template <typename _Tp>
+ static constexpr size_t size = _Abi::template size<_Tp>;
+ template <typename _Tp>
+ static constexpr size_t _S_max_store_size
+ = (sizeof(_Tp) >= 4 && __have_avx512f) || __have_avx512bw
+ ? 64
+ : (std::is_floating_point_v<_Tp> && __have_avx) || __have_avx2 ? 32 : 16;
+ using _MaskImpl = typename _Abi::_MaskImpl;
+
+ // __masked_load {{{
+ template <typename _Tp, size_t _Np, typename _Up, typename _Fp>
+ static inline _SimdWrapper<_Tp, _Np>
+ __masked_load(_SimdWrapper<_Tp, _Np> __merge, _MaskMember<_Tp> __k,
+ const _Up* __mem, _Fp) noexcept
+ {
+ static_assert(_Np == size<_Tp>);
+ // Either no conversion is needed, or the conversion is a pure bit
+ // reinterpretation (same size, both integral or both non-integral).
+ if constexpr (std::is_same_v<_Tp, _Up>
+ || (sizeof(_Tp) == sizeof(_Up)
+ && std::is_integral_v<_Tp> == std::is_integral_v<_Up>))
+ {
+ [[maybe_unused]] const auto __intrin = __to_intrin(__merge);
+ if constexpr ((__is_avx512_abi<_Abi>() || __have_avx512bw_vl)
+ && sizeof(_Tp) == 1)
+ {
+ const auto __kk = _MaskImpl::__to_bits(__k)._M_to_bits();
+ if constexpr (sizeof(__intrin) == 16)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm_mask_loadu_epi8(__intrin, __kk, __mem));
+ else if constexpr (sizeof(__merge) == 32)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm256_mask_loadu_epi8(__intrin, __kk, __mem));
+ else if constexpr (sizeof(__merge) == 64)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm512_mask_loadu_epi8(__intrin, __kk, __mem));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr ((__is_avx512_abi<_Abi>() || __have_avx512bw_vl)
+ && sizeof(_Tp) == 2)
+ {
+ const auto __kk = _MaskImpl::__to_bits(__k)._M_to_bits();
+ if constexpr (sizeof(__intrin) == 16)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm_mask_loadu_epi16(__intrin, __kk, __mem));
+ else if constexpr (sizeof(__intrin) == 32)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm256_mask_loadu_epi16(__intrin, __kk, __mem));
+ else if constexpr (sizeof(__intrin) == 64)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm512_mask_loadu_epi16(__intrin, __kk, __mem));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr ((__is_avx512_abi<_Abi>() || __have_avx512vl)
+ && sizeof(_Tp) == 4 && std::is_integral_v<_Up>)
+ {
+ const auto __kk = _MaskImpl::__to_bits(__k)._M_to_bits();
+ if constexpr (sizeof(__intrin) == 16)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm_mask_loadu_epi32(__intrin, __kk, __mem));
+ else if constexpr (sizeof(__intrin) == 32)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm256_mask_loadu_epi32(__intrin, __kk, __mem));
+ else if constexpr (sizeof(__intrin) == 64)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm512_mask_loadu_epi32(__intrin, __kk, __mem));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr ((__is_avx512_abi<_Abi>() || __have_avx512vl)
+ && sizeof(_Tp) == 4 && std::is_floating_point_v<_Up>)
+ {
+ const auto __kk = _MaskImpl::__to_bits(__k)._M_to_bits();
+ if constexpr (sizeof(__intrin) == 16)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm_mask_loadu_ps(__intrin, __kk, __mem));
+ else if constexpr (sizeof(__intrin) == 32)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm256_mask_loadu_ps(__intrin, __kk, __mem));
+ else if constexpr (sizeof(__intrin) == 64)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm512_mask_loadu_ps(__intrin, __kk, __mem));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_avx2 && sizeof(_Tp) == 4
+ && std::is_integral_v<_Up>)
+ {
+ if constexpr (sizeof(__intrin) == 16)
+ __merge
+ = __or(__andnot(__k._M_data, __merge._M_data),
+ __vector_bitcast<_Tp, _Np>(
+ _mm_maskload_epi32(reinterpret_cast<const int*>(__mem),
+ __to_intrin(__k))));
+ else if constexpr (sizeof(__intrin) == 32)
+ __merge
+ = (~__k._M_data & __merge._M_data)
+ | __vector_bitcast<_Tp, _Np>(
+ _mm256_maskload_epi32(reinterpret_cast<const int*>(__mem),
+ __to_intrin(__k)));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_avx && sizeof(_Tp) == 4)
+ {
+ if constexpr (sizeof(__intrin) == 16)
+ __merge = __or(__andnot(__k._M_data, __merge._M_data),
+ __vector_bitcast<_Tp, _Np>(_mm_maskload_ps(
+ reinterpret_cast<const float*>(__mem),
+ __intrin_bitcast<__m128i>(__as_vector(__k)))));
+ else if constexpr (sizeof(__intrin) == 32)
+ __merge
+ = __or(__andnot(__k._M_data, __merge._M_data),
+ _mm256_maskload_ps(reinterpret_cast<const float*>(__mem),
+ __vector_bitcast<_LLong>(__k)));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr ((__is_avx512_abi<_Abi>() || __have_avx512vl)
+ && sizeof(_Tp) == 8 && std::is_integral_v<_Up>)
+ {
+ const auto __kk = _MaskImpl::__to_bits(__k)._M_to_bits();
+ if constexpr (sizeof(__intrin) == 16)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm_mask_loadu_epi64(__intrin, __kk, __mem));
+ else if constexpr (sizeof(__intrin) == 32)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm256_mask_loadu_epi64(__intrin, __kk, __mem));
+ else if constexpr (sizeof(__intrin) == 64)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm512_mask_loadu_epi64(__intrin, __kk, __mem));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr ((__is_avx512_abi<_Abi>() || __have_avx512vl)
+ && sizeof(_Tp) == 8 && std::is_floating_point_v<_Up>)
+ {
+ const auto __kk = _MaskImpl::__to_bits(__k)._M_to_bits();
+ if constexpr (sizeof(__intrin) == 16)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm_mask_loadu_pd(__intrin, __kk, __mem));
+ else if constexpr (sizeof(__intrin) == 32)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm256_mask_loadu_pd(__intrin, __kk, __mem));
+ else if constexpr (sizeof(__intrin) == 64)
+ __merge = __vector_bitcast<_Tp, _Np>(
+ _mm512_mask_loadu_pd(__intrin, __kk, __mem));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_avx2 && sizeof(_Tp) == 8
+ && std::is_integral_v<_Up>)
+ {
+ if constexpr (sizeof(__intrin) == 16)
+ __merge = __or(__andnot(__k._M_data, __merge._M_data),
+ __vector_bitcast<_Tp, _Np>(_mm_maskload_epi64(
+ reinterpret_cast<const _LLong*>(__mem),
+ __to_intrin(__k))));
+ else if constexpr (sizeof(__intrin) == 32)
+ __merge
+ = (~__k._M_data & __merge._M_data)
+ | __vector_bitcast<_Tp>(_mm256_maskload_epi64(
+ reinterpret_cast<const _LLong*>(__mem), __to_intrin(__k)));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_avx && sizeof(_Tp) == 8)
+ {
+ if constexpr (sizeof(__intrin) == 16)
+ __merge
+ = __or(__andnot(__k._M_data, __merge._M_data),
+ __vector_bitcast<_Tp, _Np>(
+ _mm_maskload_pd(reinterpret_cast<const double*>(__mem),
+ __vector_bitcast<_LLong>(__k))));
+ else if constexpr (sizeof(__intrin) == 32)
+ __merge = __or(__andnot(__k._M_data, __merge._M_data),
+ _mm256_maskload_pd(reinterpret_cast<const double*>(
+ __mem),
+ __vector_bitcast<_LLong>(__k)));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ _BitOps::__bit_iteration(_MaskImpl::__to_bits(__k), [&](auto __i) {
+ __merge.__set(__i, static_cast<_Tp>(__mem[__i]));
+ });
+ }
+ /* It is very uncertain that the following improves anything. It needs
+ * benchmarking before it is activated.
+ else if constexpr (sizeof(_Up) <= 8 && // no long double
+ !__converts_via_decomposition_v<
+ _Up, _Tp,
+ sizeof(__merge)> // conversion via decomposition
+ // is better handled via the
+ // bit_iteration fallback below
+ )
+ {
+ // TODO: copy pattern from __masked_store, which doesn't resort to
+ // fixed_size
+ using _Ap = simd_abi::deduce_t<_Up, _Np>;
+ using _ATraits = _SimdTraits<_Up, _Ap>;
+ using _AImpl = typename _ATraits::_SimdImpl;
+ typename _ATraits::_SimdMember __uncvted{};
+ typename _ATraits::_MaskMember __kk = _Ap::_MaskImpl::template
+ __convert<_Up>(__k);
+ __uncvted = _AImpl::__masked_load(__uncvted, __kk, __mem, _Fp());
+ _SimdConverter<_Up, _Ap, _Tp, _Abi> __converter;
+ _Base::__masked_assign(__k, __merge, __converter(__uncvted));
+ }
+ */
+ else
+ __merge = _Base::__masked_load(__merge, __k, __mem, _Fp());
+ return __merge;
+ }
+
+ // }}}
+ // __masked_store_nocvt {{{
+ template <typename _Tp, std::size_t _Np, typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_store_nocvt(_SimdWrapper<_Tp, _Np> __v, _Tp* __mem, _Fp,
+ _SimdWrapper<bool, _Np> __k)
+ {
+ [[maybe_unused]] const auto __vi = __to_intrin(__v);
+ if constexpr (sizeof(__vi) == 64)
+ {
+ static_assert(sizeof(__v) == 64 && __have_avx512f);
+ if constexpr (__have_avx512bw && sizeof(_Tp) == 1)
+ _mm512_mask_storeu_epi8(__mem, __k, __vi);
+ else if constexpr (__have_avx512bw && sizeof(_Tp) == 2)
+ _mm512_mask_storeu_epi16(__mem, __k, __vi);
+ else if constexpr (__have_avx512f && sizeof(_Tp) == 4)
+ {
+ if constexpr (__is_aligned_v<_Fp, 64> && std::is_integral_v<_Tp>)
+ _mm512_mask_store_epi32(__mem, __k, __vi);
+ else if constexpr (__is_aligned_v<_Fp, 64>
+ && std::is_floating_point_v<_Tp>)
+ _mm512_mask_store_ps(__mem, __k, __vi);
+ else if constexpr (std::is_integral_v<_Tp>)
+ _mm512_mask_storeu_epi32(__mem, __k, __vi);
+ else
+ _mm512_mask_storeu_ps(__mem, __k, __vi);
+ }
+ else if constexpr (__have_avx512f && sizeof(_Tp) == 8)
+ {
+ if constexpr (__is_aligned_v<_Fp, 64> && std::is_integral_v<_Tp>)
+ _mm512_mask_store_epi64(__mem, __k, __vi);
+ else if constexpr (__is_aligned_v<_Fp, 64>
+ && std::is_floating_point_v<_Tp>)
+ _mm512_mask_store_pd(__mem, __k, __vi);
+ else if constexpr (std::is_integral_v<_Tp>)
+ _mm512_mask_storeu_epi64(__mem, __k, __vi);
+ else
+ _mm512_mask_storeu_pd(__mem, __k, __vi);
+ }
+#if 0 // with KNL either sizeof(_Tp) >= 4 or sizeof(__vi) <= 32
+ // with Skylake-AVX512, __have_avx512bw is true
+ else if constexpr (__have_sse2)
+ {
+ using _M = __vector_type_t<_Tp, _Np>;
+ using _MVT = _VectorTraits<_M>;
+ _mm_maskmoveu_si128(__auto_bitcast(__extract<0, 4>(__v._M_data)),
+ __auto_bitcast(_MaskImpl::template __convert<_Tp, _Np>(__k._M_data)),
+ reinterpret_cast<char*>(__mem));
+ _mm_maskmoveu_si128(__auto_bitcast(__extract<1, 4>(__v._M_data)),
+ __auto_bitcast(_MaskImpl::template __convert<_Tp, _Np>(
+ __k._M_data >> 1 * _MVT::_S_width)),
+ reinterpret_cast<char*>(__mem) + 1 * 16);
+ _mm_maskmoveu_si128(__auto_bitcast(__extract<2, 4>(__v._M_data)),
+ __auto_bitcast(_MaskImpl::template __convert<_Tp, _Np>(
+ __k._M_data >> 2 * _MVT::_S_width)),
+ reinterpret_cast<char*>(__mem) + 2 * 16);
+ if constexpr (_Np > 48 / sizeof(_Tp))
+ _mm_maskmoveu_si128(
+ __auto_bitcast(__extract<3, 4>(__v._M_data)),
+ __auto_bitcast(_MaskImpl::template __convert<_Tp, _Np>(
+ __k._M_data >> 3 * _MVT::_S_width)),
+ reinterpret_cast<char*>(__mem) + 3 * 16);
+ }
+#endif
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (sizeof(__vi) == 32)
+ {
+ if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 1)
+ _mm256_mask_storeu_epi8(__mem, __k, __vi);
+ else if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 2)
+ _mm256_mask_storeu_epi16(__mem, __k, __vi);
+ else if constexpr (__have_avx512vl && sizeof(_Tp) == 4)
+ {
+ if constexpr (__is_aligned_v<_Fp, 32> && std::is_integral_v<_Tp>)
+ _mm256_mask_store_epi32(__mem, __k, __vi);
+ else if constexpr (__is_aligned_v<_Fp, 32>
+ && std::is_floating_point_v<_Tp>)
+ _mm256_mask_store_ps(__mem, __k, __vi);
+ else if constexpr (std::is_integral_v<_Tp>)
+ _mm256_mask_storeu_epi32(__mem, __k, __vi);
+ else
+ _mm256_mask_storeu_ps(__mem, __k, __vi);
+ }
+ else if constexpr (__have_avx512vl && sizeof(_Tp) == 8)
+ {
+ if constexpr (__is_aligned_v<_Fp, 32> && std::is_integral_v<_Tp>)
+ _mm256_mask_store_epi64(__mem, __k, __vi);
+ else if constexpr (__is_aligned_v<_Fp, 32>
+ && std::is_floating_point_v<_Tp>)
+ _mm256_mask_store_pd(__mem, __k, __vi);
+ else if constexpr (std::is_integral_v<_Tp>)
+ _mm256_mask_storeu_epi64(__mem, __k, __vi);
+ else
+ _mm256_mask_storeu_pd(__mem, __k, __vi);
+ }
+ else if constexpr (__have_avx512f
+ && (sizeof(_Tp) >= 4 || __have_avx512bw))
+ {
+ // use a 512-bit maskstore, using zero-extension of the bitmask
+ __masked_store_nocvt(
+ _SimdWrapper64<_Tp>(
+ __intrin_bitcast<__vector_type64_t<_Tp>>(__v._M_data)),
+ __mem,
+ // careful, vector_aligned has a stricter meaning in the
+ // 512-bit maskstore:
+ std::conditional_t<std::is_same_v<_Fp, vector_aligned_tag>,
+ overaligned_tag<32>, _Fp>(),
+ _SimdWrapper<bool, 64 / sizeof(_Tp)>(__k._M_data));
+ }
+ else
+ __masked_store_nocvt(
+ __v, __mem, _Fp(),
+ _MaskImpl::template __to_maskvector<_Tp, 32 / sizeof(_Tp)>(__k));
+ }
+ else if constexpr (sizeof(__vi) == 16)
+ {
+ // the store is aligned if _Fp is overaligned_tag<16> (or higher) or _Fp
+ // is vector_aligned_tag while __v is actually a 16-Byte vector (could
+ // be 2/4/8 as well)
+ [[maybe_unused]] constexpr bool __aligned
+ = __is_aligned_v<_Fp, 16>
+ && (sizeof(__v) == 16
+ || !std::is_same_v<_Fp, vector_aligned_tag>);
+ if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 1)
+ _mm_mask_storeu_epi8(__mem, __k, __vi);
+ else if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 2)
+ _mm_mask_storeu_epi16(__mem, __k, __vi);
+ else if constexpr (__have_avx512vl && sizeof(_Tp) == 4)
+ {
+ if constexpr (__aligned && std::is_integral_v<_Tp>)
+ _mm_mask_store_epi32(__mem, __k, __vi);
+ else if constexpr (__aligned && std::is_floating_point_v<_Tp>)
+ _mm_mask_store_ps(__mem, __k, __vi);
+ else if constexpr (std::is_integral_v<_Tp>)
+ _mm_mask_storeu_epi32(__mem, __k, __vi);
+ else
+ _mm_mask_storeu_ps(__mem, __k, __vi);
+ }
+ else if constexpr (__have_avx512vl && sizeof(_Tp) == 8)
+ {
+ if constexpr (__aligned && std::is_integral_v<_Tp>)
+ _mm_mask_store_epi64(__mem, __k, __vi);
+ else if constexpr (__aligned && std::is_floating_point_v<_Tp>)
+ _mm_mask_store_pd(__mem, __k, __vi);
+ else if constexpr (std::is_integral_v<_Tp>)
+ _mm_mask_storeu_epi64(__mem, __k, __vi);
+ else
+ _mm_mask_storeu_pd(__mem, __k, __vi);
+ }
+ else if constexpr (__have_avx512f
+ && (sizeof(_Tp) >= 4 || __have_avx512bw))
+ {
+ // use a 512-bit maskstore, using zero-extension of the bitmask
+ __masked_store_nocvt(
+ _SimdWrapper64<_Tp>(
+ __intrin_bitcast<__intrinsic_type64_t<_Tp>>(__v._M_data)),
+ __mem,
+ // careful, vector_aligned has a stricter meaning in the 512-bit
+ // maskstore:
+ std::conditional_t<std::is_same_v<_Fp, vector_aligned_tag>,
+ overaligned_tag<sizeof(__v)>, _Fp>(),
+ _SimdWrapper<bool, 64 / sizeof(_Tp)>(__k._M_data));
+ }
+ else
+ __masked_store_nocvt(
+ __v, __mem, _Fp(),
+ _MaskImpl::template __to_maskvector<_Tp, 16 / sizeof(_Tp)>(__k));
+ }
+ else
+ __assert_unreachable<_Tp>();
+ }
+
+ template <typename _TW,
+ typename _Tp = typename _VectorTraits<_TW>::value_type,
+ typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static void __masked_store_nocvt(_TW __v, _Tp* __mem,
+ _Fp, _TW __k)
+ {
+ if constexpr (sizeof(_TW) <= 16)
+ {
+ [[maybe_unused]] const auto __vi
+ = __intrin_bitcast<__m128i>(__as_vector(__v));
+ [[maybe_unused]] const auto __ki
+ = __intrin_bitcast<__m128i>(__as_vector(__k));
+ if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 1)
+ _mm_mask_storeu_epi8(__mem, _mm_movepi8_mask(__ki), __vi);
+ else if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 2)
+ _mm_mask_storeu_epi16(__mem, _mm_movepi16_mask(__ki), __vi);
+ else if constexpr (__have_avx2 && sizeof(_Tp) == 4
+ && std::is_integral_v<_Tp>)
+ _mm_maskstore_epi32(reinterpret_cast<int*>(__mem), __ki, __vi);
+ else if constexpr (__have_avx && sizeof(_Tp) == 4)
+ _mm_maskstore_ps(reinterpret_cast<float*>(__mem), __ki,
+ __vector_bitcast<float>(__vi));
+ else if constexpr (__have_avx2 && sizeof(_Tp) == 8
+ && std::is_integral_v<_Tp>)
+ _mm_maskstore_epi64(reinterpret_cast<_LLong*>(__mem), __ki, __vi);
+ else if constexpr (__have_avx && sizeof(_Tp) == 8)
+ _mm_maskstore_pd(reinterpret_cast<double*>(__mem), __ki,
+ __vector_bitcast<double>(__vi));
+ else if constexpr (__have_sse2)
+ _mm_maskmoveu_si128(__vi, __ki, reinterpret_cast<char*>(__mem));
+ }
+ else if constexpr (sizeof(_TW) == 32)
+ {
+ [[maybe_unused]] const auto __vi
+ = __intrin_bitcast<__m256i>(__as_vector(__v));
+ [[maybe_unused]] const auto __ki
+ = __intrin_bitcast<__m256i>(__as_vector(__k));
+ if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 1)
+ _mm256_mask_storeu_epi8(__mem, _mm256_movepi8_mask(__ki), __vi);
+ else if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 2)
+ _mm256_mask_storeu_epi16(__mem, _mm256_movepi16_mask(__ki), __vi);
+ else if constexpr (__have_avx2 && sizeof(_Tp) == 4
+ && std::is_integral_v<_Tp>)
+ _mm256_maskstore_epi32(reinterpret_cast<int*>(__mem), __ki, __vi);
+ else if constexpr (sizeof(_Tp) == 4)
+ _mm256_maskstore_ps(reinterpret_cast<float*>(__mem), __ki,
+ __vector_bitcast<float>(__v));
+ else if constexpr (__have_avx2 && sizeof(_Tp) == 8
+ && std::is_integral_v<_Tp>)
+ _mm256_maskstore_epi64(reinterpret_cast<_LLong*>(__mem), __ki, __vi);
+ else if constexpr (__have_avx && sizeof(_Tp) == 8)
+ _mm256_maskstore_pd(reinterpret_cast<double*>(__mem), __ki,
+ __vector_bitcast<double>(__v));
+ else if constexpr (__have_sse2)
+ {
+ _mm_maskmoveu_si128(__lo128(__vi), __lo128(__ki),
+ reinterpret_cast<char*>(__mem));
+ _mm_maskmoveu_si128(__hi128(__vi), __hi128(__ki),
+ reinterpret_cast<char*>(__mem) + 16);
+ }
+ }
+ else
+ __assert_unreachable<_Tp>();
+ }
+
+ // }}}
+ // __masked_store {{{
+ template <typename _Tp, size_t _Np, typename _Up, typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_store(const _SimdWrapper<_Tp, _Np> __v, _Up* __mem, _Fp,
+ const _MaskMember<_Tp> __k) noexcept
+ {
+ if constexpr (std::is_integral_v<_Tp> && std::is_integral_v<_Up>
+ && sizeof(_Tp) > sizeof(_Up)
+ && __have_avx512f && (sizeof(_Tp) >= 4 || __have_avx512bw)
+ && (sizeof(__v) == 64 || __have_avx512vl))
+ { // truncating store
+ const auto __vi = __to_intrin(__v);
+ const auto __kk = _MaskImpl::__to_bits(__k)._M_to_bits();
+ if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 4
+ && sizeof(__vi) == 64)
+ _mm512_mask_cvtepi64_storeu_epi32(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 4
+ && sizeof(__vi) == 32)
+ _mm256_mask_cvtepi64_storeu_epi32(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 4
+ && sizeof(__vi) == 16)
+ _mm_mask_cvtepi64_storeu_epi32(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 2
+ && sizeof(__vi) == 64)
+ _mm512_mask_cvtepi64_storeu_epi16(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 2
+ && sizeof(__vi) == 32)
+ _mm256_mask_cvtepi64_storeu_epi16(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 2
+ && sizeof(__vi) == 16)
+ _mm_mask_cvtepi64_storeu_epi16(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 1
+ && sizeof(__vi) == 64)
+ _mm512_mask_cvtepi64_storeu_epi8(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 1
+ && sizeof(__vi) == 32)
+ _mm256_mask_cvtepi64_storeu_epi8(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 8 && sizeof(_Up) == 1
+ && sizeof(__vi) == 16)
+ _mm_mask_cvtepi64_storeu_epi8(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 2
+ && sizeof(__vi) == 64)
+ _mm512_mask_cvtepi32_storeu_epi16(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 2
+ && sizeof(__vi) == 32)
+ _mm256_mask_cvtepi32_storeu_epi16(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 2
+ && sizeof(__vi) == 16)
+ _mm_mask_cvtepi32_storeu_epi16(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 1
+ && sizeof(__vi) == 64)
+ _mm512_mask_cvtepi32_storeu_epi8(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 1
+ && sizeof(__vi) == 32)
+ _mm256_mask_cvtepi32_storeu_epi8(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 4 && sizeof(_Up) == 1
+ && sizeof(__vi) == 16)
+ _mm_mask_cvtepi32_storeu_epi8(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 2 && sizeof(_Up) == 1
+ && sizeof(__vi) == 64)
+ _mm512_mask_cvtepi16_storeu_epi8(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 2 && sizeof(_Up) == 1
+ && sizeof(__vi) == 32)
+ _mm256_mask_cvtepi16_storeu_epi8(__mem, __kk, __vi);
+ else if constexpr (sizeof(_Tp) == 2 && sizeof(_Up) == 1
+ && sizeof(__vi) == 16)
+ _mm_mask_cvtepi16_storeu_epi8(__mem, __kk, __vi);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ _Base::__masked_store(__v, __mem, _Fp(), __k);
+ }
+
+ // }}}
+ // __multiplies {{{
+ template <typename _V, typename _VVT = _VectorTraits<_V>>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _V __multiplies(_V __x, _V __y)
+ {
+ using _Tp = typename _VVT::value_type;
+ if (__builtin_is_constant_evaluated() || __x._M_is_constprop()
+ || __y._M_is_constprop())
+ return __as_vector(__x) * __as_vector(__y);
+ else if constexpr (sizeof(_Tp) == 1)
+ {
+ if constexpr (sizeof(_V) == 2)
+ {
+ const auto __xs = reinterpret_cast<short>(__x._M_data);
+ const auto __ys = reinterpret_cast<short>(__y._M_data);
+ return reinterpret_cast<__vector_type_t<_Tp, 2>>(
+ short(((__xs * __ys) & 0xff) | ((__xs >> 8) * (__ys & 0xff00))));
+ }
+ else if constexpr (sizeof(_V) == 4 && _VVT::_S_partial_width == 3)
+ {
+ const auto __xi = reinterpret_cast<int>(__x._M_data);
+ const auto __yi = reinterpret_cast<int>(__y._M_data);
+ return reinterpret_cast<__vector_type_t<_Tp, 3>>(
+ ((__xi * __yi) & 0xff)
+ | (((__xi >> 8) * (__yi & 0xff00)) & 0xff00)
+ | ((__xi >> 16) * (__yi & 0xff0000)));
+ }
+ else if constexpr (sizeof(_V) == 4)
+ {
+ const auto __xi = reinterpret_cast<int>(__x._M_data);
+ const auto __yi = reinterpret_cast<int>(__y._M_data);
+ return reinterpret_cast<__vector_type_t<_Tp, 4>>(
+ ((__xi * __yi) & 0xff)
+ | (((__xi >> 8) * (__yi & 0xff00)) & 0xff00)
+ | (((__xi >> 16) * (__yi & 0xff0000)) & 0xff0000)
+ | ((__xi >> 24) * (__yi & 0xff000000u)));
+ }
+ else if constexpr (sizeof(_V) == 8 && __have_avx2
+ && std::is_signed_v<_Tp>)
+ return __convert<typename _VVT::type>(
+ __vector_bitcast<short>(_mm_cvtepi8_epi16(__to_intrin(__x)))
+ * __vector_bitcast<short>(_mm_cvtepi8_epi16(__to_intrin(__y))));
+ else if constexpr (sizeof(_V) == 8 && __have_avx2
+ && std::is_unsigned_v<_Tp>)
+ return __convert<typename _VVT::type>(
+ __vector_bitcast<short>(_mm_cvtepu8_epi16(__to_intrin(__x)))
+ * __vector_bitcast<short>(_mm_cvtepu8_epi16(__to_intrin(__y))));
+ else
+ {
+ // codegen of `x*y` is suboptimal (as of GCC 9.0.1)
+ constexpr size_t __full_size = _VVT::_S_width;
+ constexpr int _Np = sizeof(_V) >= 16 ? __full_size / 2 : 8;
+ using _ShortW = _SimdWrapper<short, _Np>;
+ const _ShortW __even = __vector_bitcast<short, _Np>(__x)
+ * __vector_bitcast<short, _Np>(__y);
+ _ShortW __high_byte = _ShortW()._M_data - 256;
+ //[&]() { asm("" : "+x"(__high_byte._M_data)); }();
+ const _ShortW __odd
+ = (__vector_bitcast<short, _Np>(__x) >> 8)
+ * (__vector_bitcast<short, _Np>(__y) & __high_byte._M_data);
+ if constexpr (__have_avx512bw && sizeof(_V) > 2)
+ return _CommonImplX86::_S_blend_avx512(
+ 0xaaaa'aaaa'aaaa'aaaaLL, __vector_bitcast<_Tp>(__even),
+ __vector_bitcast<_Tp>(__odd));
+ else if constexpr (__have_sse4_1 && sizeof(_V) > 2)
+ return _CommonImplX86::_S_blend_intrin(__to_intrin(__high_byte),
+ __to_intrin(__even),
+ __to_intrin(__odd));
+ else
+ return __to_intrin(__or(__andnot(__high_byte, __even), __odd));
+ }
+ }
+ else
+ return _Base::__multiplies(__x, __y);
+ }
+
+ // }}}
+ // __divides {{{
+#ifdef _GLIBCXX_SIMD_WORKAROUND_PR90993
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __divides(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ if (!__builtin_is_constant_evaluated()
+ && !__builtin_constant_p(__y._M_data))
+ if constexpr (is_integral_v<_Tp> && sizeof(_Tp) <= 4)
+ { // use divps - codegen of `x/y` is suboptimal (as of GCC 9.0.1)
+ // Note that using floating-point division is likely to raise the
+ // *Inexact* exception flag and thus appears like an invalid "as-if"
+ // transformation. However, C++ doesn't specify how the fpenv can be
+ // observed and points to C. C says that function calls are assumed to
+ // potentially raise fp exceptions, unless documented otherwise.
+ // Consequently, operator/, which is a function call, may raise fp
+ // exceptions.
+ /*const struct _CsrGuard
+ {
+ const unsigned _M_data = _mm_getcsr();
+ _CsrGuard()
+ {
+ _mm_setcsr(0x9f80); // mask all FP exceptions and enable flush-to-zero
+ }
+ ~_CsrGuard() { _mm_setcsr(_M_data); }
+ } __csr;*/
+ using _Float = conditional_t<sizeof(_Tp) == 4, double, float>;
+ constexpr size_t __n_intermediate
+ = std::min(_Np, (__have_avx512f ? 64 : __have_avx ? 32 : 16)
+ / sizeof(_Float));
+ using _FloatV = __vector_type_t<_Float, __n_intermediate>;
+ constexpr size_t __n_floatv = __div_roundup(_Np, __n_intermediate);
+ using _R = __vector_type_t<_Tp, _Np>;
+ const auto __xf = __convert_all<_FloatV, __n_floatv>(__x);
+ const auto __yf = __convert_all<_FloatV, __n_floatv>(
+ _Abi::__make_padding_nonzero(__as_vector(__y)));
+ return __call_with_n_evaluations<__n_floatv>(
+ [](auto... __quotients) {
+ return __vector_convert<_R>(__quotients...);
+ },
+ [&__xf, &__yf](auto __i) { return __xf[__i] / __yf[__i]; });
+ }
+ /* 64-bit int division is potentially optimizable via double division if
+ * the value in __x is small enough and the conversion between
+ * int<->double is efficient enough:
+ else if constexpr (is_integral_v<_Tp> && is_unsigned_v<_Tp> &&
+ sizeof(_Tp) == 8)
+ {
+ if constexpr (__have_sse4_1 && sizeof(__x) == 16)
+ {
+ if (_mm_test_all_zeros(__x, __m128i{0xffe0'0000'0000'0000ull,
+ 0xffe0'0000'0000'0000ull}))
+ {
+ __x._M_data | 0x __vector_convert<__m128d>(__x._M_data)
+ }
+ }
+ }
+ */
+ return _Base::__divides(__x, __y);
+ }
+#endif // _GLIBCXX_SIMD_WORKAROUND_PR90993
+
+ // }}}
+ // __modulus {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __modulus(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ if (__builtin_is_constant_evaluated() || __builtin_constant_p(__y._M_data)
+ || sizeof(_Tp) >= 8)
+ return _Base::__modulus(__x, __y);
+ else
+ return _Base::__minus(__x, __multiplies(__y, __divides(__x, __y)));
+ }
+
+ // }}}
+ // __bit_shift_left {{{
+ // Notes on UB. C++2a [expr.shift] says:
+ // -1- [...] The operands shall be of integral or unscoped enumeration type
+ // and integral promotions are performed. The type of the result is that
+ // of the promoted left operand. The behavior is undefined if the right
+ // operand is negative, or greater than or equal to the width of the
+ // promoted left operand.
+ // -2- The value of E1 << E2 is the unique value congruent to E1×2^E2 modulo
+ // 2^N, where N is the width of the type of the result.
+ //
+ // C++17 [expr.shift] says:
+ // -2- The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated
+ // bits are zero-filled. If E1 has an unsigned type, the value of the
+ // result is E1 × 2^E2 , reduced modulo one more than the maximum value
+ // representable in the result type. Otherwise, if E1 has a signed type
+ // and non-negative value, and E1 × 2^E2 is representable in the
+ // corresponding unsigned type of the result type, then that value,
+ // converted to the result type, is the resulting value; otherwise, the
+ // behavior is undefined.
+ //
+ // Consequences:
+ // With C++2a signed and unsigned types have the same UB
+ // characteristics:
+ // - left shift is not UB for 0 <= RHS < max(32, #bits(T))
+ //
+ // With C++17 there's little room for optimizations because the standard
+ // requires all shifts to happen on promoted integrals (i.e. int). Thus,
+ // short and char shifts must assume shifts affect bits of neighboring
+ // values.
+#ifndef _GLIBCXX_SIMD_NO_SHIFT_OPT
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ inline _GLIBCXX_CONST static typename _TVT::type __bit_shift_left(_Tp __xx,
+ int __y)
+ {
+ using _V = typename _TVT::type;
+ using _Up = typename _TVT::value_type;
+ _V __x = __xx;
+ [[maybe_unused]] const auto __ix = __to_intrin(__x);
+ if (__builtin_is_constant_evaluated())
+ return __x << __y;
+#if __cplusplus > 201703
+ // after C++17, signed shifts have no UB, and behave just like unsigned
+ // shifts
+ else if constexpr (sizeof(_Up) == 1 && is_signed_v<_Up>)
+ return __vector_bitcast<_Up>(
+ __bit_shift_left(__vector_bitcast<make_unsigned_t<_Up>>(__x), __y));
+#endif
+ else if constexpr (sizeof(_Up) == 1)
+ {
+ // (cf. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83894)
+ if (__builtin_constant_p(__y))
+ {
+ if (__y == 0)
+ return __x;
+ else if (__y == 1)
+ return __x + __x;
+ else if (__y == 2)
+ {
+ __x = __x + __x;
+ return __x + __x;
+ }
+ else if (__y > 2 && __y < 8)
+ {
+ if constexpr (sizeof(__x) > sizeof(unsigned))
+ {
+ const _UChar __mask = 0xff << __y; // precomputed vector
+ return __vector_bitcast<_Up>(
+ __vector_bitcast<_UChar>(__vector_bitcast<unsigned>(__x)
+ << __y)
+ & __mask);
+ }
+ else
+ {
+ const unsigned __mask
+ = (0xff & (0xff << __y)) * 0x01010101u;
+ return reinterpret_cast<_V>(
+ static_cast<__int_for_sizeof_t<_V>>(
+ unsigned(reinterpret_cast<__int_for_sizeof_t<_V>>(__x)
+ << __y)
+ & __mask));
+ }
+ }
+ else if (__y >= 8 && __y < 32)
+ return _V();
+ else
+ __builtin_unreachable();
+ }
+ // general strategy in the following: use an sllv instead of sll
+ // instruction, because it's 2 to 4 times faster:
+ else if constexpr (__have_avx512bw_vl && sizeof(__x) == 16)
+ return __vector_bitcast<_Up>(
+ _mm256_cvtepi16_epi8(_mm256_sllv_epi16(_mm256_cvtepi8_epi16(__ix),
+ _mm256_set1_epi16(__y))));
+ else if constexpr (__have_avx512bw && sizeof(__x) == 32)
+ return __vector_bitcast<_Up>(
+ _mm512_cvtepi16_epi8(_mm512_sllv_epi16(_mm512_cvtepi8_epi16(__ix),
+ _mm512_set1_epi16(__y))));
+ else if constexpr (__have_avx512bw && sizeof(__x) == 64)
+ {
+ const auto __shift = _mm512_set1_epi16(__y);
+ return __vector_bitcast<_Up>(
+ __concat(_mm512_cvtepi16_epi8(_mm512_sllv_epi16(
+ _mm512_cvtepi8_epi16(__lo256(__ix)), __shift)),
+ _mm512_cvtepi16_epi8(_mm512_sllv_epi16(
+ _mm512_cvtepi8_epi16(__hi256(__ix)), __shift))));
+ }
+ else if constexpr (__have_avx2 && sizeof(__x) == 32)
+ {
+#if 1
+ const auto __shift = _mm_cvtsi32_si128(__y);
+ auto __k
+ = _mm256_sll_epi16(_mm256_slli_epi16(~__m256i(), 8), __shift);
+ __k |= _mm256_srli_epi16(__k, 8);
+ return __vector_bitcast<_Up>(_mm256_sll_epi32(__ix, __shift) & __k);
+#else
+ const _Up __k = 0xff << __y;
+ return __vector_bitcast<_Up>(__vector_bitcast<int>(__x) << __y)
+ & __k;
+#endif
+ }
+ else
+ {
+ const auto __shift = _mm_cvtsi32_si128(__y);
+ auto __k = _mm_sll_epi16(_mm_slli_epi16(~__m128i(), 8), __shift);
+ __k |= _mm_srli_epi16(__k, 8);
+ return __intrin_bitcast<_V>(_mm_sll_epi16(__ix, __shift) & __k);
+ }
+ }
+ return __x << __y;
+ }
+
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ inline _GLIBCXX_CONST static typename _TVT::type
+ __bit_shift_left(_Tp __xx, typename _TVT::type __y)
+ {
+ using _V = typename _TVT::type;
+ using _Up = typename _TVT::value_type;
+ _V __x = __xx;
+ [[maybe_unused]] const auto __ix = __to_intrin(__x);
+ [[maybe_unused]] const auto __iy = __to_intrin(__y);
+ if (__builtin_is_constant_evaluated())
+ return __x << __y;
+#if __cplusplus > 201703
+ // after C++17, signed shifts have no UB, and behave just like unsigned
+ // shifts
+ else if constexpr (is_signed_v<_Up>)
+ return __vector_bitcast<_Up>(
+ __bit_shift_left(__vector_bitcast<make_unsigned_t<_Up>>(__x),
+ __vector_bitcast<make_unsigned_t<_Up>>(__y)));
+#endif
+ else if constexpr (sizeof(_Up) == 1)
+ {
+ if constexpr (sizeof __ix == 64 && __have_avx512bw)
+ return __vector_bitcast<_Up>(
+ __concat(_mm512_cvtepi16_epi8(
+ _mm512_sllv_epi16(_mm512_cvtepu8_epi16(__lo256(__ix)),
+ _mm512_cvtepu8_epi16(__lo256(__iy)))),
+ _mm512_cvtepi16_epi8(_mm512_sllv_epi16(
+ _mm512_cvtepu8_epi16(__hi256(__ix)),
+ _mm512_cvtepu8_epi16(__hi256(__iy))))));
+ else if constexpr (sizeof __ix == 32 && __have_avx512bw)
+ return __vector_bitcast<_Up>(_mm512_cvtepi16_epi8(
+ _mm512_sllv_epi16(_mm512_cvtepu8_epi16(__ix),
+ _mm512_cvtepu8_epi16(__iy))));
+ else if constexpr (sizeof __x <= 8 && __have_avx512bw_vl)
+ return __intrin_bitcast<_V>(_mm_cvtepi16_epi8(
+ _mm_sllv_epi16(_mm_cvtepu8_epi16(__ix), _mm_cvtepu8_epi16(__iy))));
+ else if constexpr (sizeof __ix == 16 && __have_avx512bw_vl)
+ return __intrin_bitcast<_V>(_mm256_cvtepi16_epi8(
+ _mm256_sllv_epi16(_mm256_cvtepu8_epi16(__ix),
+ _mm256_cvtepu8_epi16(__iy))));
+ else if constexpr (sizeof __ix == 16 && __have_avx512bw)
+ return __intrin_bitcast<_V>(
+ __lo128(_mm512_cvtepi16_epi8(_mm512_sllv_epi16(
+ _mm512_cvtepu8_epi16(_mm256_castsi128_si256(__ix)),
+ _mm512_cvtepu8_epi16(_mm256_castsi128_si256(__iy))))));
+ else if constexpr (__have_sse4_1 && sizeof(__x) == 16)
+ {
+ auto __mask
+ = __vector_bitcast<_Up>(__vector_bitcast<short>(__y) << 5);
+ auto __x4
+ = __vector_bitcast<_Up>(__vector_bitcast<short>(__x) << 4);
+ __x4 &= char(0xf0);
+ __x = reinterpret_cast<_V>(_CommonImplX86::_S_blend_intrin(
+ __to_intrin(__mask), __to_intrin(__x), __to_intrin(__x4)));
+ __mask += __mask;
+ auto __x2
+ = __vector_bitcast<_Up>(__vector_bitcast<short>(__x) << 2);
+ __x2 &= char(0xfc);
+ __x = reinterpret_cast<_V>(_CommonImplX86::_S_blend_intrin(
+ __to_intrin(__mask), __to_intrin(__x), __to_intrin(__x2)));
+ __mask += __mask;
+ auto __x1 = __x + __x;
+ __x = reinterpret_cast<_V>(_CommonImplX86::_S_blend_intrin(
+ __to_intrin(__mask), __to_intrin(__x), __to_intrin(__x1)));
+ return __x & ((__y & char(0xf8)) == 0); // y > 7 nulls the result
+ }
+ else if constexpr (sizeof(__x) == 16)
+ {
+ auto __mask
+ = __vector_bitcast<_UChar>(__vector_bitcast<short>(__y) << 5);
+ auto __x4
+ = __vector_bitcast<_Up>(__vector_bitcast<short>(__x) << 4);
+ __x4 &= char(0xf0);
+ __x = __vector_bitcast<_SChar>(__mask) < 0 ? __x4 : __x;
+ __mask += __mask;
+ auto __x2
+ = __vector_bitcast<_Up>(__vector_bitcast<short>(__x) << 2);
+ __x2 &= char(0xfc);
+ __x = __vector_bitcast<_SChar>(__mask) < 0 ? __x2 : __x;
+ __mask += __mask;
+ auto __x1 = __x + __x;
+ __x = __vector_bitcast<_SChar>(__mask) < 0 ? __x1 : __x;
+ return __x & ((__y & char(0xf8)) == 0); // y > 7 nulls the result
+ }
+ else
+ return __x << __y;
+ }
+ else if constexpr (sizeof(_Up) == 2)
+ {
+ if constexpr (sizeof __ix == 64 && __have_avx512bw)
+ return __vector_bitcast<_Up>(_mm512_sllv_epi16(__ix, __iy));
+ else if constexpr (sizeof __ix == 32 && __have_avx512bw_vl)
+ return __vector_bitcast<_Up>(_mm256_sllv_epi16(__ix, __iy));
+ else if constexpr (sizeof __ix == 32 && __have_avx512bw)
+ return __vector_bitcast<_Up>(
+ __lo256(_mm512_sllv_epi16(_mm512_castsi256_si512(__ix),
+ _mm512_castsi256_si512(__iy))));
+ else if constexpr (sizeof __ix == 32 && __have_avx2)
+ {
+ const auto __ux = __vector_bitcast<unsigned>(__x);
+ const auto __uy = __vector_bitcast<unsigned>(__y);
+ return __vector_bitcast<_Up>(_mm256_blend_epi16(
+ __auto_bitcast(__ux << (__uy & 0x0000ffffu)),
+ __auto_bitcast((__ux & 0xffff0000u) << (__uy >> 16)), 0xaa));
+ }
+ else if constexpr (sizeof __ix == 16 && __have_avx512bw_vl)
+ return __intrin_bitcast<_V>(_mm_sllv_epi16(__ix, __iy));
+ else if constexpr (sizeof __ix == 16 && __have_avx512bw)
+ return __intrin_bitcast<_V>(
+ __lo128(_mm512_sllv_epi16(_mm512_castsi128_si512(__ix),
+ _mm512_castsi128_si512(__iy))));
+ else if constexpr (sizeof __ix == 16 && __have_avx2)
+ {
+ const auto __ux = __vector_bitcast<unsigned>(__ix);
+ const auto __uy = __vector_bitcast<unsigned>(__iy);
+ return __intrin_bitcast<_V>(_mm_blend_epi16(
+ __auto_bitcast(__ux << (__uy & 0x0000ffffu)),
+ __auto_bitcast((__ux & 0xffff0000u) << (__uy >> 16)), 0xaa));
+ }
+ else if constexpr (sizeof __ix == 16)
+ {
+ __y += 0x3f8 >> 3;
+ return __x
+ * __intrin_bitcast<_V>(
+ __vector_convert<__vector_type16_t<int>>(
+ __vector_bitcast<float>(
+ __vector_bitcast<unsigned>(__to_intrin(__y)) << 23))
+ | (__vector_convert<__vector_type16_t<int>>(
+ __vector_bitcast<float>(
+ (__vector_bitcast<unsigned>(__to_intrin(__y)) >> 16)
+ << 23))
+ << 16));
+ }
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (sizeof(_Up) == 4 && sizeof __ix == 16 && !__have_avx2)
+ // latency is suboptimal, but throughput is at full speedup
+ return __intrin_bitcast<_V>(
+ __vector_bitcast<unsigned>(__ix)
+ * __vector_convert<__vector_type16_t<int>>(__vector_bitcast<float>(
+ (__vector_bitcast<unsigned, 4>(__y) << 23) + 0x3f80'0000)));
+ else if constexpr (sizeof(_Up) == 8 && sizeof __ix == 16 && !__have_avx2)
+ {
+ const auto __lo = _mm_sll_epi64(__ix, __iy);
+ const auto __hi = _mm_sll_epi64(__ix, _mm_unpackhi_epi64(__iy, __iy));
+ if constexpr (__have_sse4_1)
+ return __vector_bitcast<_Up>(_mm_blend_epi16(__lo, __hi, 0xf0));
+ else
+ return __vector_bitcast<_Up>(
+ _mm_move_sd(__vector_bitcast<double>(__hi),
+ __vector_bitcast<double>(__lo)));
+ }
+ else
+ return __x << __y;
+ }
+#endif // _GLIBCXX_SIMD_NO_SHIFT_OPT
+
+ // }}}
+ // __bit_shift_right {{{
+#ifndef _GLIBCXX_SIMD_NO_SHIFT_OPT
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ inline _GLIBCXX_CONST static typename _TVT::type __bit_shift_right(_Tp __xx,
+ int __y)
+ {
+ using _V = typename _TVT::type;
+ using _Up = typename _TVT::value_type;
+ _V __x = __xx;
+ [[maybe_unused]] const auto __ix = __to_intrin(__x);
+ if (__builtin_is_constant_evaluated())
+ return __x >> __y;
+ else if (__builtin_constant_p(__y)
+ && std::is_unsigned_v<_Up> && __y >= int(sizeof(_Up) * CHAR_BIT))
+ return _V();
+ else if constexpr (sizeof(_Up) == 1 && is_unsigned_v<_Up>) //{{{
+ return __intrin_bitcast<_V>(__vector_bitcast<_UShort>(__ix) >> __y)
+ & _Up(0xff >> __y);
+ //}}}
+ else if constexpr (sizeof(_Up) == 1 && is_signed_v<_Up>) //{{{
+ return __intrin_bitcast<_V>(
+ (__vector_bitcast<_UShort>(__vector_bitcast<short>(__ix) >> (__y + 8))
+ << 8)
+ | (__vector_bitcast<_UShort>(
+ __vector_bitcast<short>(__vector_bitcast<_UShort>(__ix) << 8)
+ >> __y)
+ >> 8));
+ //}}}
+ // GCC optimizes sizeof == 2, 4, and unsigned 8 as expected
+ else if constexpr (sizeof(_Up) == 8 && is_signed_v<_Up>) //{{{
+ {
+ if (__y > 32)
+ return (__intrin_bitcast<_V>(__vector_bitcast<int>(__ix) >> 32)
+ & _Up(0xffff'ffff'0000'0000ull))
+ | __vector_bitcast<_Up>(
+ __vector_bitcast<int>(__vector_bitcast<_ULLong>(__ix) >> 32)
+ >> (__y - 32));
+ else
+ return __intrin_bitcast<_V>(__vector_bitcast<_ULLong>(__ix) >> __y)
+ | __vector_bitcast<_Up>(
+ __vector_bitcast<int>(__ix & -0x8000'0000'0000'0000ll)
+ >> __y);
+ }
+ //}}}
+ else
+ return __x >> __y;
+ }
+
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ inline _GLIBCXX_CONST static typename _TVT::type
+ __bit_shift_right(_Tp __xx, typename _TVT::type __y)
+ {
+ using _V = typename _TVT::type;
+ using _Up = typename _TVT::value_type;
+ _V __x = __xx;
+ [[maybe_unused]] const auto __ix = __to_intrin(__x);
+ [[maybe_unused]] const auto __iy = __to_intrin(__y);
+ if (__builtin_is_constant_evaluated()
+ || (__builtin_constant_p(__x) && __builtin_constant_p(__y)))
+ return __x >> __y;
+ else if constexpr (sizeof(_Up) == 1) //{{{
+ {
+ if constexpr (sizeof(__x) <= 8 && __have_avx512bw_vl)
+ return __intrin_bitcast<_V>(_mm_cvtepi16_epi8(
+ is_signed_v<_Up>
+ ? _mm_srav_epi16(_mm_cvtepi8_epi16(__ix), _mm_cvtepi8_epi16(__iy))
+ : _mm_srlv_epi16(_mm_cvtepu8_epi16(__ix),
+ _mm_cvtepu8_epi16(__iy))));
+ else if constexpr (sizeof(__x) == 16 && __have_avx512bw_vl)
+ return __intrin_bitcast<_V>(_mm256_cvtepi16_epi8(
+ is_signed_v<_Up> ? _mm256_srav_epi16(_mm256_cvtepi8_epi16(__ix),
+ _mm256_cvtepi8_epi16(__iy))
+ : _mm256_srlv_epi16(_mm256_cvtepu8_epi16(__ix),
+ _mm256_cvtepu8_epi16(__iy))));
+ else if constexpr (sizeof(__x) == 32 && __have_avx512bw)
+ return __vector_bitcast<_Up>(_mm512_cvtepi16_epi8(
+ is_signed_v<_Up> ? _mm512_srav_epi16(_mm512_cvtepi8_epi16(__ix),
+ _mm512_cvtepi8_epi16(__iy))
+ : _mm512_srlv_epi16(_mm512_cvtepu8_epi16(__ix),
+ _mm512_cvtepu8_epi16(__iy))));
+ else if constexpr (sizeof(__x) == 64 && is_signed_v<_Up>)
+ return __vector_bitcast<_Up>(_mm512_mask_mov_epi8(
+ _mm512_srav_epi16(__ix, _mm512_srli_epi16(__iy, 8)),
+ 0x5555'5555'5555'5555ull,
+ _mm512_srav_epi16(_mm512_slli_epi16(__ix, 8),
+ _mm512_maskz_add_epi8(0x5555'5555'5555'5555ull,
+ __iy,
+ _mm512_set1_epi16(8)))));
+ else if constexpr (sizeof(__x) == 64 && is_unsigned_v<_Up>)
+ return __vector_bitcast<_Up>(_mm512_mask_mov_epi8(
+ _mm512_srlv_epi16(__ix, _mm512_srli_epi16(__iy, 8)),
+ 0x5555'5555'5555'5555ull,
+ _mm512_srlv_epi16(
+ _mm512_maskz_mov_epi8(0x5555'5555'5555'5555ull, __ix),
+ _mm512_maskz_mov_epi8(0x5555'5555'5555'5555ull, __iy))));
+ /* This has better throughput but higher latency than the impl below
+ else if constexpr (__have_avx2 && sizeof(__x) == 16 &&
+ is_unsigned_v<_Up>)
+ {
+ const auto __shorts = __to_intrin(__bit_shift_right(
+ __vector_bitcast<_UShort>(_mm256_cvtepu8_epi16(__ix)),
+ __vector_bitcast<_UShort>(_mm256_cvtepu8_epi16(__iy))));
+ return __vector_bitcast<_Up>(
+ _mm_packus_epi16(__lo128(__shorts), __hi128(__shorts)));
+ }
+ */
+ else if constexpr (__have_avx2 && sizeof(__x) > 8)
+ // the following uses vpsr[al]vd, which requires AVX2
+ if constexpr (is_signed_v<_Up>)
+ {
+ const auto r3 = __vector_bitcast<_UInt>(
+ (__vector_bitcast<int>(__x)
+ >> (__vector_bitcast<_UInt>(__y) >> 24)))
+ & 0xff000000u;
+ const auto r2 = __vector_bitcast<_UInt>((
+ (__vector_bitcast<int>(__x) << 8)
+ >> ((__vector_bitcast<_UInt>(__y) << 8) >> 24)))
+ & 0xff000000u;
+ const auto r1
+ = __vector_bitcast<_UInt>(
+ ((__vector_bitcast<int>(__x) << 16)
+ >> ((__vector_bitcast<_UInt>(__y) << 16) >> 24)))
+ & 0xff000000u;
+ const auto r0 = __vector_bitcast<_UInt>(
+ (__vector_bitcast<int>(__x) << 24)
+ >> ((__vector_bitcast<_UInt>(__y) << 24) >> 24));
+ return __vector_bitcast<_Up>(r3 | (r2 >> 8) | (r1 >> 16)
+ | (r0 >> 24));
+ }
+ else
+ {
+ const auto r3 = (__vector_bitcast<_UInt>(__x)
+ >> (__vector_bitcast<_UInt>(__y) >> 24))
+ & 0xff000000u;
+ const auto r2 = ((__vector_bitcast<_UInt>(__x) << 8)
+ >> ((__vector_bitcast<_UInt>(__y) << 8) >> 24))
+ & 0xff000000u;
+ const auto r1 = ((__vector_bitcast<_UInt>(__x) << 16)
+ >> ((__vector_bitcast<_UInt>(__y) << 16) >> 24))
+ & 0xff000000u;
+ const auto r0 = (__vector_bitcast<_UInt>(__x) << 24)
+ >> ((__vector_bitcast<_UInt>(__y) << 24) >> 24);
+ return __vector_bitcast<_Up>(r3 | (r2 >> 8) | (r1 >> 16)
+ | (r0 >> 24));
+ }
+ else if constexpr (__have_sse4_1
+ && is_unsigned_v<_Up> && sizeof(__x) > 2)
+ {
+ auto __x128 = __vector_bitcast<_Up>(__ix);
+ auto __mask
+ = __vector_bitcast<_Up>(__vector_bitcast<_UShort>(__iy) << 5);
+ auto __x4 = __vector_bitcast<_Up>(
+ (__vector_bitcast<_UShort>(__x128) >> 4) & _UShort(0xff0f));
+ __x128 = __vector_bitcast<_Up>(_CommonImplX86::_S_blend_intrin(
+ __to_intrin(__mask), __to_intrin(__x128), __to_intrin(__x4)));
+ __mask += __mask;
+ auto __x2 = __vector_bitcast<_Up>(
+ (__vector_bitcast<_UShort>(__x128) >> 2) & _UShort(0xff3f));
+ __x128 = __vector_bitcast<_Up>(_CommonImplX86::_S_blend_intrin(
+ __to_intrin(__mask), __to_intrin(__x128), __to_intrin(__x2)));
+ __mask += __mask;
+ auto __x1 = __vector_bitcast<_Up>(
+ (__vector_bitcast<_UShort>(__x128) >> 1) & _UShort(0xff7f));
+ __x128 = __vector_bitcast<_Up>(_CommonImplX86::_S_blend_intrin(
+ __to_intrin(__mask), __to_intrin(__x128), __to_intrin(__x1)));
+ return __intrin_bitcast<_V>(
+ __x128
+ & ((__vector_bitcast<_Up>(__iy) & char(0xf8))
+ == 0)); // y > 7 nulls the result
+ }
+ else if constexpr (__have_sse4_1 && is_signed_v<_Up> && sizeof(__x) > 2)
+ {
+ auto __mask
+ = __vector_bitcast<_UChar>(__vector_bitcast<_UShort>(__iy) << 5);
+ auto __maskl = [&]() {
+ return __to_intrin(__vector_bitcast<_UShort>(__mask) << 8);
+ };
+ auto __xh = __vector_bitcast<short>(__ix);
+ auto __xl = __vector_bitcast<short>(__ix) << 8;
+ auto __xh4 = __xh >> 4;
+ auto __xl4 = __xl >> 4;
+ __xh = __vector_bitcast<short>(_CommonImplX86::_S_blend_intrin(
+ __to_intrin(__mask), __to_intrin(__xh), __to_intrin(__xh4)));
+ __xl = __vector_bitcast<short>(
+ _CommonImplX86::_S_blend_intrin(__maskl(), __to_intrin(__xl),
+ __to_intrin(__xl4)));
+ __mask += __mask;
+ auto __xh2 = __xh >> 2;
+ auto __xl2 = __xl >> 2;
+ __xh = __vector_bitcast<short>(_CommonImplX86::_S_blend_intrin(
+ __to_intrin(__mask), __to_intrin(__xh), __to_intrin(__xh2)));
+ __xl = __vector_bitcast<short>(
+ _CommonImplX86::_S_blend_intrin(__maskl(), __to_intrin(__xl),
+ __to_intrin(__xl2)));
+ __mask += __mask;
+ auto __xh1 = __xh >> 1;
+ auto __xl1 = __xl >> 1;
+ __xh = __vector_bitcast<short>(_CommonImplX86::_S_blend_intrin(
+ __to_intrin(__mask), __to_intrin(__xh), __to_intrin(__xh1)));
+ __xl = __vector_bitcast<short>(
+ _CommonImplX86::_S_blend_intrin(__maskl(), __to_intrin(__xl),
+ __to_intrin(__xl1)));
+ return __intrin_bitcast<_V>(
+ (__vector_bitcast<_Up>((__xh & short(0xff00)))
+ | __vector_bitcast<_Up>(__vector_bitcast<_UShort>(__xl) >> 8))
+ & ((__vector_bitcast<_Up>(__iy) & char(0xf8))
+ == 0)); // y > 7 nulls the result
+ }
+ else if constexpr (is_unsigned_v<_Up> && sizeof(__x) > 2) // SSE2
+ {
+ auto __mask
+ = __vector_bitcast<_Up>(__vector_bitcast<_UShort>(__y) << 5);
+ auto __x4 = __vector_bitcast<_Up>(
+ (__vector_bitcast<_UShort>(__x) >> 4) & _UShort(0xff0f));
+ __x = __mask > 0x7f ? __x4 : __x;
+ __mask += __mask;
+ auto __x2 = __vector_bitcast<_Up>(
+ (__vector_bitcast<_UShort>(__x) >> 2) & _UShort(0xff3f));
+ __x = __mask > 0x7f ? __x2 : __x;
+ __mask += __mask;
+ auto __x1 = __vector_bitcast<_Up>(
+ (__vector_bitcast<_UShort>(__x) >> 1) & _UShort(0xff7f));
+ __x = __mask > 0x7f ? __x1 : __x;
+ return __x & ((__y & char(0xf8)) == 0); // y > 7 nulls the result
+ }
+ else if constexpr (sizeof(__x) > 2) // signed SSE2
+ {
+ static_assert(is_signed_v<_Up>);
+ auto __maskh = __vector_bitcast<_UShort>(__y) << 5;
+ auto __maskl = __vector_bitcast<_UShort>(__y) << (5 + 8);
+ auto __xh = __vector_bitcast<short>(__x);
+ auto __xl = __vector_bitcast<short>(__x) << 8;
+ auto __xh4 = __xh >> 4;
+ auto __xl4 = __xl >> 4;
+ __xh = __maskh > 0x7fff ? __xh4 : __xh;
+ __xl = __maskl > 0x7fff ? __xl4 : __xl;
+ __maskh += __maskh;
+ __maskl += __maskl;
+ auto __xh2 = __xh >> 2;
+ auto __xl2 = __xl >> 2;
+ __xh = __maskh > 0x7fff ? __xh2 : __xh;
+ __xl = __maskl > 0x7fff ? __xl2 : __xl;
+ __maskh += __maskh;
+ __maskl += __maskl;
+ auto __xh1 = __xh >> 1;
+ auto __xl1 = __xl >> 1;
+ __xh = __maskh > 0x7fff ? __xh1 : __xh;
+ __xl = __maskl > 0x7fff ? __xl1 : __xl;
+ __x = __vector_bitcast<_Up>((__xh & short(0xff00)))
+ | __vector_bitcast<_Up>(__vector_bitcast<_UShort>(__xl) >> 8);
+ return __x & ((__y & char(0xf8)) == 0); // y > 7 nulls the result
+ }
+ else
+ return __x >> __y;
+ } //}}}
+ else if constexpr (sizeof(_Up) == 2 && sizeof(__x) >= 4) //{{{
+ {
+ [[maybe_unused]] auto __blend_0xaa = [](auto __a, auto __b) {
+ if constexpr (sizeof(__a) == 16)
+ return _mm_blend_epi16(__to_intrin(__a), __to_intrin(__b), 0xaa);
+ else if constexpr (sizeof(__a) == 32)
+ return _mm256_blend_epi16(__to_intrin(__a), __to_intrin(__b), 0xaa);
+ else if constexpr (sizeof(__a) == 64)
+ return _mm512_mask_blend_epi16(0xaaaa'aaaaU, __to_intrin(__a),
+ __to_intrin(__b));
+ else
+ __assert_unreachable<decltype(__a)>();
+ };
+ if constexpr (__have_avx512bw_vl && sizeof(_Tp) <= 16)
+ return __intrin_bitcast<_V>(is_signed_v<_Up>
+ ? _mm_srav_epi16(__ix, __iy)
+ : _mm_srlv_epi16(__ix, __iy));
+ else if constexpr (__have_avx512bw_vl && sizeof(_Tp) == 32)
+ return __vector_bitcast<_Up>(is_signed_v<_Up>
+ ? _mm256_srav_epi16(__ix, __iy)
+ : _mm256_srlv_epi16(__ix, __iy));
+ else if constexpr (__have_avx512bw && sizeof(_Tp) == 64)
+ return __vector_bitcast<_Up>(is_signed_v<_Up>
+ ? _mm512_srav_epi16(__ix, __iy)
+ : _mm512_srlv_epi16(__ix, __iy));
+ else if constexpr (__have_avx2 && is_signed_v<_Up>)
+ return __intrin_bitcast<_V>(
+ __blend_0xaa(((__vector_bitcast<int>(__ix) << 16)
+ >> (__vector_bitcast<int>(__iy) & 0xffffu))
+ >> 16,
+ __vector_bitcast<int>(__ix)
+ >> (__vector_bitcast<int>(__iy) >> 16)));
+ else if constexpr (__have_avx2 && is_unsigned_v<_Up>)
+ return __intrin_bitcast<_V>(
+ __blend_0xaa((__vector_bitcast<_UInt>(__ix) & 0xffffu)
+ >> (__vector_bitcast<_UInt>(__iy) & 0xffffu),
+ __vector_bitcast<_UInt>(__ix)
+ >> (__vector_bitcast<_UInt>(__iy) >> 16)));
+ else if constexpr (__have_sse4_1)
+ {
+ auto __mask = __vector_bitcast<_UShort>(__iy);
+ auto __x128 = __vector_bitcast<_Up>(__ix);
+ //__mask *= 0x0808;
+ __mask = (__mask << 3) | (__mask << 11);
+ // do __x128 = 0 where __y[4] is set
+ __x128 = __vector_bitcast<_Up>(
+ _mm_blendv_epi8(__to_intrin(__x128), __m128i(),
+ __to_intrin(__mask)));
+ // do __x128 >>= 8 where __y[3] is set
+ __x128 = __vector_bitcast<_Up>(
+ _mm_blendv_epi8(__to_intrin(__x128), __to_intrin(__x128 >> 8),
+ __to_intrin(__mask += __mask)));
+ // do __x128 >>= 4 where __y[2] is set
+ __x128 = __vector_bitcast<_Up>(
+ _mm_blendv_epi8(__to_intrin(__x128), __to_intrin(__x128 >> 4),
+ __to_intrin(__mask += __mask)));
+ // do __x128 >>= 2 where __y[1] is set
+ __x128 = __vector_bitcast<_Up>(
+ _mm_blendv_epi8(__to_intrin(__x128), __to_intrin(__x128 >> 2),
+ __to_intrin(__mask += __mask)));
+ // do __x128 >>= 1 where __y[0] is set
+ return __intrin_bitcast<_V>(
+ _mm_blendv_epi8(__to_intrin(__x128), __to_intrin(__x128 >> 1),
+ __to_intrin(__mask + __mask)));
+ }
+ else
+ {
+ auto __k = __vector_bitcast<_UShort>(__iy) << 11;
+ auto __x128 = __vector_bitcast<_Up>(__ix);
+ auto __mask = [](__vector_type16_t<_UShort> __kk) {
+ return __vector_bitcast<short>(__kk) < 0;
+ };
+ // do __x128 = 0 where __y[4] is set
+ __x128 = __mask(__k) ? decltype(__x128)() : __x128;
+ // do __x128 >>= 8 where __y[3] is set
+ __x128 = __mask(__k += __k) ? __x128 >> 8 : __x128;
+ // do __x128 >>= 4 where __y[2] is set
+ __x128 = __mask(__k += __k) ? __x128 >> 4 : __x128;
+ // do __x128 >>= 2 where __y[1] is set
+ __x128 = __mask(__k += __k) ? __x128 >> 2 : __x128;
+ // do __x128 >>= 1 where __y[0] is set
+ return __intrin_bitcast<_V>(__mask(__k + __k) ? __x128 >> 1
+ : __x128);
+ }
+ } //}}}
+ else if constexpr (sizeof(_Up) == 4 && !__have_avx2) //{{{
+ {
+ if constexpr (is_unsigned_v<_Up>)
+ {
+ // x >> y == x * 2^-y == (x * 2^(31-y)) >> 31
+ const __m128 __factor_f = reinterpret_cast<__m128>(
+ 0x4f00'0000u - (__vector_bitcast<unsigned, 4>(__y) << 23));
+ const __m128i __factor
+ = __builtin_constant_p(__factor_f) ? __to_intrin(
+ __make_vector<unsigned>(__factor_f[0], __factor_f[1],
+ __factor_f[2], __factor_f[3]))
+ : _mm_cvttps_epi32(__factor_f);
+ const auto __r02
+ = _mm_srli_epi64(_mm_mul_epu32(__ix, __factor), 31);
+ const auto __r13 = _mm_mul_epu32(_mm_srli_si128(__ix, 4),
+ _mm_srli_si128(__factor, 4));
+ if constexpr (__have_sse4_1)
+ return __intrin_bitcast<_V>(
+ _mm_blend_epi16(_mm_slli_epi64(__r13, 1), __r02, 0x33));
+ else
+ return __intrin_bitcast<_V>(
+ __r02 | _mm_slli_si128(_mm_srli_epi64(__r13, 31), 4));
+ }
+ else
+ {
+ auto __shift = [](auto __a, auto __b) {
+ if constexpr (is_signed_v<_Up>)
+ return _mm_sra_epi32(__a, __b);
+ else
+ return _mm_srl_epi32(__a, __b);
+ };
+ const auto __r0
+ = __shift(__ix, _mm_unpacklo_epi32(__iy, __m128i()));
+ const auto __r1 = __shift(__ix, _mm_srli_epi64(__iy, 32));
+ const auto __r2
+ = __shift(__ix, _mm_unpackhi_epi32(__iy, __m128i()));
+ const auto __r3 = __shift(__ix, _mm_srli_si128(__iy, 12));
+ if constexpr (__have_sse4_1)
+ return __intrin_bitcast<_V>(
+ _mm_blend_epi16(_mm_blend_epi16(__r1, __r0, 0x3),
+ _mm_blend_epi16(__r3, __r2, 0x30), 0xf0));
+ else
+ return __intrin_bitcast<_V>(_mm_unpacklo_epi64(
+ _mm_unpacklo_epi32(__r0, _mm_srli_si128(__r1, 4)),
+ _mm_unpackhi_epi32(__r2, _mm_srli_si128(__r3, 4))));
+ }
+ } //}}}
+ else
+ return __x >> __y;
+ }
+#endif // _GLIBCXX_SIMD_NO_SHIFT_OPT
+
+ // }}}
+ // compares {{{
+ // __equal_to {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+ __equal_to(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ if constexpr (__is_avx512_abi<_Abi>()) // {{{
+ {
+ if (__builtin_is_constant_evaluated()
+ || (__x._M_is_constprop() && __y._M_is_constprop()))
+ return _MaskImpl::__to_bits(_SimdWrapper<_Tp, _Np>(
+ __vector_bitcast<_Tp>(__x._M_data == __y._M_data)));
+
+ constexpr auto __k1 = _Abi::template __implicit_mask<_Tp>();
+ [[maybe_unused]] const auto __xi = __to_intrin(__x);
+ [[maybe_unused]] const auto __yi = __to_intrin(__y);
+ if constexpr (std::is_floating_point_v<_Tp>)
+ {
+ if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+ return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_EQ_OQ);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+ return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_EQ_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_EQ_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_EQ_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_EQ_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_EQ_OQ);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+ return _mm512_mask_cmpeq_epi64_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+ return _mm512_mask_cmpeq_epi32_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 2)
+ return _mm512_mask_cmpeq_epi16_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 1)
+ return _mm512_mask_cmpeq_epi8_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_mask_cmpeq_epi64_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_mask_cmpeq_epi32_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 2)
+ return _mm256_mask_cmpeq_epi16_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 1)
+ return _mm256_mask_cmpeq_epi8_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_mask_cmpeq_epi64_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return _mm_mask_cmpeq_epi32_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 2)
+ return _mm_mask_cmpeq_epi16_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 1)
+ return _mm_mask_cmpeq_epi8_mask(__k1, __xi, __yi);
+ else
+ __assert_unreachable<_Tp>();
+ } // }}}
+ else if constexpr (!__builtin_is_constant_evaluated() && sizeof(__x) == 8) // {{{
+ {
+ const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
+ == __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
+ _MaskMember<_Tp> __r64;
+ __builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
+ return __r64;
+ } // }}}
+ else
+ return _Base::__equal_to(__x, __y);
+ }
+
+ // }}}
+ // __not_equal_to {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+ __not_equal_to(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ if constexpr (__is_avx512_abi<_Abi>()) // {{{
+ {
+ if (__builtin_is_constant_evaluated()
+ || (__x._M_is_constprop() && __y._M_is_constprop()))
+ return _MaskImpl::__to_bits(_SimdWrapper<_Tp, _Np>(
+ __vector_bitcast<_Tp>(__x._M_data != __y._M_data)));
+
+ constexpr auto __k1 = _Abi::template __implicit_mask<_Tp>();
+ [[maybe_unused]] const auto __xi = __to_intrin(__x);
+ [[maybe_unused]] const auto __yi = __to_intrin(__y);
+ if constexpr (std::is_floating_point_v<_Tp>)
+ {
+ if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+ return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_NEQ_UQ);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+ return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_NEQ_UQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_NEQ_UQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_NEQ_UQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_NEQ_UQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_NEQ_UQ);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+ return _mm512_mask_cmpneq_epi64_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+ return _mm512_mask_cmpneq_epi32_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 2)
+ return _mm512_mask_cmpneq_epi16_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 1)
+ return _mm512_mask_cmpneq_epi8_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_mask_cmpneq_epi64_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_mask_cmpneq_epi32_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 2)
+ return _mm256_mask_cmpneq_epi16_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 1)
+ return _mm256_mask_cmpneq_epi8_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_mask_cmpneq_epi64_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return _mm_mask_cmpneq_epi32_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 2)
+ return _mm_mask_cmpneq_epi16_mask(__k1, __xi, __yi);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 1)
+ return _mm_mask_cmpneq_epi8_mask(__k1, __xi, __yi);
+ else
+ __assert_unreachable<_Tp>();
+ } // }}}
+ else if constexpr (!__builtin_is_constant_evaluated() && sizeof(__x) == 8) // {{{
+ {
+ const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
+ != __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
+ _MaskMember<_Tp> __r64;
+ __builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
+ return __r64;
+ } // }}}
+ else
+ return _Base::__not_equal_to(__x, __y);
+ }
+
+ // }}}
+ // __less {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+ __less(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ if constexpr (__is_avx512_abi<_Abi>()) // {{{
+ {
+ if (__builtin_is_constant_evaluated()
+ || (__x._M_is_constprop() && __y._M_is_constprop()))
+ return _MaskImpl::__to_bits(_SimdWrapper<_Tp, _Np>(
+ __vector_bitcast<_Tp>(__x._M_data < __y._M_data)));
+
+ constexpr auto __k1 = _Abi::template __implicit_mask<_Tp>();
+ [[maybe_unused]] const auto __xi = __to_intrin(__x);
+ [[maybe_unused]] const auto __yi = __to_intrin(__y);
+ if constexpr (sizeof(__xi) == 64)
+ {
+ if constexpr (std::is_same_v<_Tp, float>)
+ return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LT_OS);
+ else if constexpr (std::is_same_v<_Tp, double>)
+ return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LT_OS);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 1)
+ return _mm512_mask_cmplt_epi8_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 2)
+ return _mm512_mask_cmplt_epi16_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 4)
+ return _mm512_mask_cmplt_epi32_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 8)
+ return _mm512_mask_cmplt_epi64_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 1)
+ return _mm512_mask_cmplt_epu8_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 2)
+ return _mm512_mask_cmplt_epu16_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 4)
+ return _mm512_mask_cmplt_epu32_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 8)
+ return _mm512_mask_cmplt_epu64_mask(__k1, __xi, __yi);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (sizeof(__xi) == 32)
+ {
+ if constexpr (std::is_same_v<_Tp, float>)
+ return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LT_OS);
+ else if constexpr (std::is_same_v<_Tp, double>)
+ return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LT_OS);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 1)
+ return _mm256_mask_cmplt_epi8_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 2)
+ return _mm256_mask_cmplt_epi16_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 4)
+ return _mm256_mask_cmplt_epi32_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 8)
+ return _mm256_mask_cmplt_epi64_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 1)
+ return _mm256_mask_cmplt_epu8_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 2)
+ return _mm256_mask_cmplt_epu16_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 4)
+ return _mm256_mask_cmplt_epu32_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 8)
+ return _mm256_mask_cmplt_epu64_mask(__k1, __xi, __yi);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (sizeof(__xi) == 16)
+ {
+ if constexpr (std::is_same_v<_Tp, float>)
+ return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LT_OS);
+ else if constexpr (std::is_same_v<_Tp, double>)
+ return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LT_OS);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 1)
+ return _mm_mask_cmplt_epi8_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 2)
+ return _mm_mask_cmplt_epi16_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 4)
+ return _mm_mask_cmplt_epi32_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 8)
+ return _mm_mask_cmplt_epi64_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 1)
+ return _mm_mask_cmplt_epu8_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 2)
+ return _mm_mask_cmplt_epu16_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 4)
+ return _mm_mask_cmplt_epu32_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 8)
+ return _mm_mask_cmplt_epu64_mask(__k1, __xi, __yi);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ __assert_unreachable<_Tp>();
+ } // }}}
+ else if constexpr (!__builtin_is_constant_evaluated() && sizeof(__x) == 8) // {{{
+ {
+ const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
+ < __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
+ _MaskMember<_Tp> __r64;
+ __builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
+ return __r64;
+ } // }}}
+ else
+ return _Base::__less(__x, __y);
+ }
+
+ // }}}
+ // __less_equal {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+ __less_equal(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+ if constexpr (__is_avx512_abi<_Abi>()) // {{{
+ {
+ if (__builtin_is_constant_evaluated()
+ || (__x._M_is_constprop() && __y._M_is_constprop()))
+ return _MaskImpl::__to_bits(_SimdWrapper<_Tp, _Np>(
+ __vector_bitcast<_Tp>(__x._M_data <= __y._M_data)));
+
+ constexpr auto __k1 = _Abi::template __implicit_mask<_Tp>();
+ [[maybe_unused]] const auto __xi = __to_intrin(__x);
+ [[maybe_unused]] const auto __yi = __to_intrin(__y);
+ if constexpr (sizeof(__xi) == 64)
+ {
+ if constexpr (std::is_same_v<_Tp, float>)
+ return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LE_OS);
+ else if constexpr (std::is_same_v<_Tp, double>)
+ return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LE_OS);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 1)
+ return _mm512_mask_cmple_epi8_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 2)
+ return _mm512_mask_cmple_epi16_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 4)
+ return _mm512_mask_cmple_epi32_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 8)
+ return _mm512_mask_cmple_epi64_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 1)
+ return _mm512_mask_cmple_epu8_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 2)
+ return _mm512_mask_cmple_epu16_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 4)
+ return _mm512_mask_cmple_epu32_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 8)
+ return _mm512_mask_cmple_epu64_mask(__k1, __xi, __yi);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (sizeof(__xi) == 32)
+ {
+ if constexpr (std::is_same_v<_Tp, float>)
+ return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LE_OS);
+ else if constexpr (std::is_same_v<_Tp, double>)
+ return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LE_OS);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 1)
+ return _mm256_mask_cmple_epi8_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 2)
+ return _mm256_mask_cmple_epi16_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 4)
+ return _mm256_mask_cmple_epi32_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 8)
+ return _mm256_mask_cmple_epi64_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 1)
+ return _mm256_mask_cmple_epu8_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 2)
+ return _mm256_mask_cmple_epu16_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 4)
+ return _mm256_mask_cmple_epu32_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 8)
+ return _mm256_mask_cmple_epu64_mask(__k1, __xi, __yi);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (sizeof(__xi) == 16)
+ {
+ if constexpr (std::is_same_v<_Tp, float>)
+ return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LE_OS);
+ else if constexpr (std::is_same_v<_Tp, double>)
+ return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LE_OS);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 1)
+ return _mm_mask_cmple_epi8_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 2)
+ return _mm_mask_cmple_epi16_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 4)
+ return _mm_mask_cmple_epi32_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_signed_v<_Tp> && sizeof(_Tp) == 8)
+ return _mm_mask_cmple_epi64_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 1)
+ return _mm_mask_cmple_epu8_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 2)
+ return _mm_mask_cmple_epu16_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 4)
+ return _mm_mask_cmple_epu32_mask(__k1, __xi, __yi);
+ else if constexpr (std::is_unsigned_v<_Tp> && sizeof(_Tp) == 8)
+ return _mm_mask_cmple_epu64_mask(__k1, __xi, __yi);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ __assert_unreachable<_Tp>();
+ } // }}}
+ else if constexpr (!__builtin_is_constant_evaluated() && sizeof(__x) == 8) // {{{
+ {
+ const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
+ <= __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
+ _MaskMember<_Tp> __r64;
+ __builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
+ return __r64;
+ } // }}}
+ else
+ return _Base::__less_equal(__x, __y);
+ }
+
+ // }}}
+ // }}}
+ // negation {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+ __negate(_SimdWrapper<_Tp, _Np> __x) noexcept
+ {
+ if constexpr (__is_avx512_abi<_Abi>())
+ return __equal_to(__x, _SimdWrapper<_Tp, _Np>());
+ else
+ return _Base::__negate(__x);
+ }
+
+ // }}}
+ // math {{{
+ using _Base::__abs;
+ // __sqrt {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+ __sqrt(_SimdWrapper<_Tp, _Np> __x)
+ {
+ if constexpr (__is_sse_ps<_Tp, _Np>())
+ return __auto_bitcast(_mm_sqrt_ps(__to_intrin(__x)));
+ else if constexpr (__is_sse_pd<_Tp, _Np>())
+ return _mm_sqrt_pd(__x);
+ else if constexpr (__is_avx_ps<_Tp, _Np>())
+ return _mm256_sqrt_ps(__x);
+ else if constexpr (__is_avx_pd<_Tp, _Np>())
+ return _mm256_sqrt_pd(__x);
+ else if constexpr (__is_avx512_ps<_Tp, _Np>())
+ return _mm512_sqrt_ps(__x);
+ else if constexpr (__is_avx512_pd<_Tp, _Np>())
+ return _mm512_sqrt_pd(__x);
+ else
+ __assert_unreachable<_Tp>();
+ }
+
+ // }}}
+ // __ldexp {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+ __ldexp(_SimdWrapper<_Tp, _Np> __x, __fixed_size_storage_t<int, _Np> __exp)
+ {
+ if constexpr (__is_avx512_abi<_Abi>())
+ {
+ const auto __xi = __to_intrin(__x);
+ constexpr _SimdConverter<int, simd_abi::fixed_size<_Np>, _Tp, _Abi>
+ __cvt;
+ const auto __expi = __to_intrin(__cvt(__exp));
+ constexpr auto __k1 = _Abi::template __implicit_mask<_Tp>();
+ if constexpr (sizeof(__xi) == 16)
+ {
+ if constexpr (sizeof(_Tp) == 8)
+ return _mm_maskz_scalef_pd(__k1, __xi, __expi);
+ else
+ return _mm_maskz_scalef_ps(__k1, __xi, __expi);
+ }
+ else if constexpr (sizeof(__xi) == 32)
+ {
+ if constexpr (sizeof(_Tp) == 8)
+ return _mm256_maskz_scalef_pd(__k1, __xi, __expi);
+ else
+ return _mm256_maskz_scalef_ps(__k1, __xi, __expi);
+ }
+ else
+ {
+ static_assert(sizeof(__xi) == 64);
+ if constexpr (sizeof(_Tp) == 8)
+ return _mm512_maskz_scalef_pd(__k1, __xi, __expi);
+ else
+ return _mm512_maskz_scalef_ps(__k1, __xi, __expi);
+ }
+ }
+ else
+ return _Base::__ldexp(__x, __exp);
+ }
+
+ // }}}
+ // __trunc {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+ __trunc(_SimdWrapper<_Tp, _Np> __x)
+ {
+ if constexpr (__is_avx512_ps<_Tp, _Np>())
+ return _mm512_roundscale_ps(__x, 0x0b);
+ else if constexpr (__is_avx512_pd<_Tp, _Np>())
+ return _mm512_roundscale_pd(__x, 0x0b);
+ else if constexpr (__is_avx_ps<_Tp, _Np>())
+ return _mm256_round_ps(__x, 0x3);
+ else if constexpr (__is_avx_pd<_Tp, _Np>())
+ return _mm256_round_pd(__x, 0x3);
+ else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>())
+ return __auto_bitcast(_mm_round_ps(__to_intrin(__x), 0x3));
+ else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>())
+ return _mm_round_pd(__x, 0x3);
+ else if constexpr (__is_sse_ps<_Tp, _Np>())
+ {
+ auto __truncated = _mm_cvtepi32_ps(_mm_cvttps_epi32(__to_intrin(__x)));
+ const auto __may_have_fraction
+ = __vector_bitcast<int>(__vector_bitcast<_UInt>(__to_intrin(__x))
+ & 0x7f800000u)
+ < 0x4b000000; // at or above this exponent no mantissa bits
+ // signify fractional values (0x3f8 + 23*8 =
+ // 0x4b0)
+ return __may_have_fraction ? __truncated : __to_intrin(__x);
+ }
+ else
+ return _Base::__trunc(__x);
+ }
+
+ // }}}
+ // __round {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+ __round(_SimdWrapper<_Tp, _Np> __x)
+ {
+ using _V = __vector_type_t<_Tp, _Np>;
+ _V __truncated;
+ if constexpr (__is_avx512_ps<_Tp, _Np>())
+ __truncated = _mm512_roundscale_ps(__x._M_data, 0x0b);
+ else if constexpr (__is_avx512_pd<_Tp, _Np>())
+ __truncated = _mm512_roundscale_pd(__x._M_data, 0x0b);
+ else if constexpr (__is_avx_ps<_Tp, _Np>())
+ __truncated
+ = _mm256_round_ps(__x._M_data, _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC);
+ else if constexpr (__is_avx_pd<_Tp, _Np>())
+ __truncated
+ = _mm256_round_pd(__x._M_data, _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC);
+ else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>())
+ __truncated = __auto_bitcast(
+ _mm_round_ps(__to_intrin(__x), _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC));
+ else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>())
+ __truncated
+ = _mm_round_pd(__x._M_data, _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC);
+ else if constexpr (__is_sse_ps<_Tp, _Np>())
+ __truncated
+ = __auto_bitcast(_mm_cvtepi32_ps(_mm_cvttps_epi32(__to_intrin(__x))));
+ else
+ return _Base::__round(__x);
+
+ // x < 0 => truncated <= 0 && truncated >= x => x - truncated <= 0
+ // x > 0 => truncated >= 0 && truncated <= x => x - truncated >= 0
+
+ const _V __rounded
+ = __truncated
+ + (__and(_S_absmask<_V>, __x._M_data - __truncated) >= _Tp(.5)
+ ? __or(__and(_S_signmask<_V>, __x._M_data), _V() + 1)
+ : _V());
+ if constexpr (__have_sse4_1)
+ return __rounded;
+ else
+ return __and(_S_absmask<_V>, __x._M_data) < 0x1p23f ? __rounded
+ : __x._M_data;
+ }
+
+ // }}}
+ // __nearbyint {{{
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __nearbyint(_Tp __x) noexcept
+ {
+ if constexpr (_TVT::template __is<float, 16>)
+ return _mm512_roundscale_ps(__x, 0x0c);
+ else if constexpr (_TVT::template __is<double, 8>)
+ return _mm512_roundscale_pd(__x, 0x0c);
+ else if constexpr (_TVT::template __is<float, 8>)
+ return _mm256_round_ps(__x, _MM_FROUND_CUR_DIRECTION | _MM_FROUND_NO_EXC);
+ else if constexpr (_TVT::template __is<double, 4>)
+ return _mm256_round_pd(__x, _MM_FROUND_CUR_DIRECTION | _MM_FROUND_NO_EXC);
+ else if constexpr (__have_sse4_1 && _TVT::template __is<float, 4>)
+ return _mm_round_ps(__x, _MM_FROUND_CUR_DIRECTION | _MM_FROUND_NO_EXC);
+ else if constexpr (__have_sse4_1 && _TVT::template __is<double, 2>)
+ return _mm_round_pd(__x, _MM_FROUND_CUR_DIRECTION | _MM_FROUND_NO_EXC);
+ else
+ return _Base::__nearbyint(__x);
+ }
+
+ // }}}
+ // __rint {{{
+ template <typename _Tp, typename _TVT = _VectorTraits<_Tp>>
+ _GLIBCXX_SIMD_INTRINSIC static _Tp __rint(_Tp __x) noexcept
+ {
+ if constexpr (_TVT::template __is<float, 16>)
+ return _mm512_roundscale_ps(__x, 0x04);
+ else if constexpr (_TVT::template __is<double, 8>)
+ return _mm512_roundscale_pd(__x, 0x04);
+ else if constexpr (_TVT::template __is<float, 8>)
+ return _mm256_round_ps(__x, _MM_FROUND_CUR_DIRECTION);
+ else if constexpr (_TVT::template __is<double, 4>)
+ return _mm256_round_pd(__x, _MM_FROUND_CUR_DIRECTION);
+ else if constexpr (__have_sse4_1 && _TVT::template __is<float, 4>)
+ return _mm_round_ps(__x, _MM_FROUND_CUR_DIRECTION);
+ else if constexpr (__have_sse4_1 && _TVT::template __is<double, 2>)
+ return _mm_round_pd(__x, _MM_FROUND_CUR_DIRECTION);
+ else
+ return _Base::__rint(__x);
+ }
+
+ // }}}
+ // __floor {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+ __floor(_SimdWrapper<_Tp, _Np> __x)
+ {
+ if constexpr (__is_avx512_ps<_Tp, _Np>())
+ return _mm512_roundscale_ps(__x, 0x09);
+ else if constexpr (__is_avx512_pd<_Tp, _Np>())
+ return _mm512_roundscale_pd(__x, 0x09);
+ else if constexpr (__is_avx_ps<_Tp, _Np>())
+ return _mm256_round_ps(__x, 0x1);
+ else if constexpr (__is_avx_pd<_Tp, _Np>())
+ return _mm256_round_pd(__x, 0x1);
+ else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>())
+ return __auto_bitcast(_mm_floor_ps(__to_intrin(__x)));
+ else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>())
+ return _mm_floor_pd(__x);
+ else
+ return _Base::__floor(__x);
+ }
+
+ // }}}
+ // __ceil {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+ __ceil(_SimdWrapper<_Tp, _Np> __x)
+ {
+ if constexpr (__is_avx512_ps<_Tp, _Np>())
+ return _mm512_roundscale_ps(__x, 0x0a);
+ else if constexpr (__is_avx512_pd<_Tp, _Np>())
+ return _mm512_roundscale_pd(__x, 0x0a);
+ else if constexpr (__is_avx_ps<_Tp, _Np>())
+ return _mm256_round_ps(__x, 0x2);
+ else if constexpr (__is_avx_pd<_Tp, _Np>())
+ return _mm256_round_pd(__x, 0x2);
+ else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>())
+ return __auto_bitcast(_mm_ceil_ps(__to_intrin(__x)));
+ else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>())
+ return _mm_ceil_pd(__x);
+ else
+ return _Base::__ceil(__x);
+ }
+
+ // }}}
+ // __signbit {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+ __signbit(_SimdWrapper<_Tp, _Np> __x)
+ {
+ if constexpr (__is_avx512_abi<_Abi>() && __have_avx512dq)
+ {
+ if constexpr (sizeof(__x) == 64 && sizeof(_Tp) == 4)
+ return _mm512_movepi32_mask(__intrin_bitcast<__m512i>(__x._M_data));
+ else if constexpr (sizeof(__x) == 64 && sizeof(_Tp) == 8)
+ return _mm512_movepi64_mask(__intrin_bitcast<__m512i>(__x._M_data));
+ else if constexpr (sizeof(__x) == 32 && sizeof(_Tp) == 4)
+ return _mm256_movepi32_mask(__intrin_bitcast<__m256i>(__x._M_data));
+ else if constexpr (sizeof(__x) == 32 && sizeof(_Tp) == 8)
+ return _mm256_movepi64_mask(__intrin_bitcast<__m256i>(__x._M_data));
+ else if constexpr (sizeof(__x) <= 16 && sizeof(_Tp) == 4)
+ return _mm_movepi32_mask(__intrin_bitcast<__m128i>(__x._M_data));
+ else if constexpr (sizeof(__x) <= 16 && sizeof(_Tp) == 8)
+ return _mm_movepi64_mask(__intrin_bitcast<__m128i>(__x._M_data));
+ }
+ else if constexpr (__is_avx512_abi<_Abi>())
+ {
+ const auto __xi = __to_intrin(__x);
+ [[maybe_unused]] constexpr auto __k1
+ = _Abi::template __implicit_mask<_Tp>();
+ if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return _mm_movemask_ps(__xi);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_movemask_pd(__xi);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_movemask_ps(__xi);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_movemask_pd(__xi);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+ return _mm512_mask_cmplt_epi32_mask(__k1,
+ __intrin_bitcast<__m512i>(__xi),
+ __m512i());
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+ return _mm512_mask_cmplt_epi64_mask(__k1,
+ __intrin_bitcast<__m512i>(__xi),
+ __m512i());
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return _Base::__signbit(__x);
+ /*{
+ using _I = __int_for_sizeof_t<_Tp>;
+ if constexpr (sizeof(__x) == 64)
+ return __less(__vector_bitcast<_I>(__x), _I());
+ else
+ {
+ const auto __xx = __vector_bitcast<_I>(__x._M_data);
+ [[maybe_unused]] constexpr _I __signmask =
+ std::numeric_limits<_I>::min();
+ if constexpr ((sizeof(_Tp) == 4 &&
+ (__have_avx2 || sizeof(__x) == 16)) ||
+ __have_avx512vl)
+ {
+ return __vector_bitcast<_Tp>(__xx >>
+ std::numeric_limits<_I>::digits);
+ }
+ else if constexpr ((__have_avx2 ||
+ (__have_ssse3 && sizeof(__x) == 16)))
+ {
+ return __vector_bitcast<_Tp>((__xx & __signmask) ==
+ __signmask);
+ }
+ else
+ { // SSE2/3 or AVX (w/o AVX2)
+ constexpr auto __one = __vector_broadcast<_Np, _Tp>(1);
+ return __vector_bitcast<_Tp>(
+ __vector_bitcast<_Tp>(
+ (__xx & __signmask) |
+ __vector_bitcast<_I>(__one)) // -1 or 1
+ != __one);
+ }
+ }
+ }*/
+ }
+
+ // }}}
+ // __isnonzerovalue_mask {{{
+ // (isnormal | issubnormal == !isinf & !isnan & !iszero)
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static auto __isnonzerovalue_mask(_Tp __x)
+ {
+ using _Traits = _VectorTraits<_Tp>;
+ if constexpr (__have_avx512dq_vl)
+ {
+ if constexpr (_Traits::template __is<
+ float, 2> || _Traits::template __is<float, 4>)
+ return _knot_mask8(_mm_fpclass_ps_mask(__to_intrin(__x), 0x9f));
+ else if constexpr (_Traits::template __is<float, 8>)
+ return _knot_mask8(_mm256_fpclass_ps_mask(__x, 0x9f));
+ else if constexpr (_Traits::template __is<float, 16>)
+ return _knot_mask16(_mm512_fpclass_ps_mask(__x, 0x9f));
+ else if constexpr (_Traits::template __is<double, 2>)
+ return _knot_mask8(_mm_fpclass_pd_mask(__x, 0x9f));
+ else if constexpr (_Traits::template __is<double, 4>)
+ return _knot_mask8(_mm256_fpclass_pd_mask(__x, 0x9f));
+ else if constexpr (_Traits::template __is<double, 8>)
+ return _knot_mask8(_mm512_fpclass_pd_mask(__x, 0x9f));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ {
+ using _Up = typename _Traits::value_type;
+ constexpr size_t _Np = _Traits::_S_width;
+ const auto __a
+ = __x * std::numeric_limits<_Up>::infinity(); // NaN if __x == 0
+ const auto __b = __x * _Up(); // NaN if __x == inf
+ if constexpr (__have_avx512vl && __is_sse_ps<_Up, _Np>())
+ return _mm_cmp_ps_mask(__to_intrin(__a), __to_intrin(__b),
+ _CMP_ORD_Q);
+ else if constexpr (__have_avx512f && __is_sse_ps<_Up, _Np>())
+ return __mmask8(0xf
+ & _mm512_cmp_ps_mask(__auto_bitcast(__a),
+ __auto_bitcast(__b),
+ _CMP_ORD_Q));
+ else if constexpr (__have_avx512vl && __is_sse_pd<_Up, _Np>())
+ return _mm_cmp_pd_mask(__a, __b, _CMP_ORD_Q);
+ else if constexpr (__have_avx512f && __is_sse_pd<_Up, _Np>())
+ return __mmask8(0x3
+ & _mm512_cmp_pd_mask(__auto_bitcast(__a),
+ __auto_bitcast(__b),
+ _CMP_ORD_Q));
+ else if constexpr (__have_avx512vl && __is_avx_ps<_Up, _Np>())
+ return _mm256_cmp_ps_mask(__a, __b, _CMP_ORD_Q);
+ else if constexpr (__have_avx512f && __is_avx_ps<_Up, _Np>())
+ return __mmask8(_mm512_cmp_ps_mask(__auto_bitcast(__a),
+ __auto_bitcast(__b), _CMP_ORD_Q));
+ else if constexpr (__have_avx512vl && __is_avx_pd<_Up, _Np>())
+ return _mm256_cmp_pd_mask(__a, __b, _CMP_ORD_Q);
+ else if constexpr (__have_avx512f && __is_avx_pd<_Up, _Np>())
+ return __mmask8(0xf
+ & _mm512_cmp_pd_mask(__auto_bitcast(__a),
+ __auto_bitcast(__b),
+ _CMP_ORD_Q));
+ else if constexpr (__is_avx512_ps<_Up, _Np>())
+ return _mm512_cmp_ps_mask(__a, __b, _CMP_ORD_Q);
+ else if constexpr (__is_avx512_pd<_Up, _Np>())
+ return _mm512_cmp_pd_mask(__a, __b, _CMP_ORD_Q);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ }
+
+ // }}}
+ // __isfinite {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+ __isfinite(_SimdWrapper<_Tp, _Np> __x)
+ {
+ static_assert(is_floating_point_v<_Tp>);
+#if __FINITE_MATH_ONLY__
+ [](auto&&){}(__x);
+ return __equal_to(_SimdWrapper<_Tp, _Np>(), _SimdWrapper<_Tp, _Np>());
+#else
+ if constexpr (__is_avx512_abi<_Abi>() && __have_avx512dq)
+ {
+ const auto __xi = __to_intrin(__x);
+ constexpr auto __k1 = _Abi::template __implicit_mask<_Tp>();
+ if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+ return __k1 ^ _mm512_mask_fpclass_ps_mask(__k1, __xi, 0x99);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+ return __k1 ^ _mm512_mask_fpclass_pd_mask(__k1, __xi, 0x99);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return __k1 ^ _mm256_mask_fpclass_ps_mask(__k1, __xi, 0x99);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return __k1 ^ _mm256_mask_fpclass_pd_mask(__k1, __xi, 0x99);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return __k1 ^ _mm_mask_fpclass_ps_mask(__k1, __xi, 0x99);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return __k1 ^ _mm_mask_fpclass_pd_mask(__k1, __xi, 0x99);
+ }
+ else if constexpr (__is_avx512_abi<_Abi>())
+ {
+ // if all exponent bits are set, __x is either inf or NaN
+ using _I = __int_for_sizeof_t<_Tp>;
+ const auto __inf = __vector_bitcast<_I>(
+ __vector_broadcast<_Np>(std::numeric_limits<_Tp>::infinity()));
+ return __less<_I, _Np>(__vector_bitcast<_I>(__x) & __inf, __inf);
+ }
+ else
+ return _Base::__isfinite(__x);
+#endif
+ }
+
+ // }}}
+ // __isinf {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+ __isinf(_SimdWrapper<_Tp, _Np> __x)
+ {
+#if __FINITE_MATH_ONLY__
+ [](auto&&){}(__x);
+ return {}; // false
+#else
+ if constexpr (__is_avx512_abi<_Abi>() && __have_avx512dq)
+ {
+ const auto __xi = __to_intrin(__x);
+ if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+ return _mm512_fpclass_ps_mask(__xi, 0x18);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+ return _mm512_fpclass_pd_mask(__xi, 0x18);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_fpclass_ps_mask(__xi, 0x18);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_fpclass_pd_mask(__xi, 0x18);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return _mm_fpclass_ps_mask(__xi, 0x18);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_fpclass_pd_mask(__xi, 0x18);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_avx512dq_vl)
+ {
+ if constexpr (__is_sse_pd<_Tp, _Np>())
+ return __vector_bitcast<double>(
+ _mm_movm_epi64(_mm_fpclass_pd_mask(__x, 0x18)));
+ else if constexpr (__is_avx_pd<_Tp, _Np>())
+ return __vector_bitcast<double>(
+ _mm256_movm_epi64(_mm256_fpclass_pd_mask(__x, 0x18)));
+ else if constexpr (__is_sse_ps<_Tp, _Np>())
+ return __auto_bitcast(
+ _mm_movm_epi32(_mm_fpclass_ps_mask(__to_intrin(__x), 0x18)));
+ else if constexpr (__is_avx_ps<_Tp, _Np>())
+ return __vector_bitcast<float>(
+ _mm256_movm_epi32(_mm256_fpclass_ps_mask(__x, 0x18)));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return _Base::__isinf(__x);
+#endif
+ }
+
+ // }}}
+ // __isnormal {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+ __isnormal(_SimdWrapper<_Tp, _Np> __x)
+ {
+#if __FINITE_MATH_ONLY__
+ [[maybe_unused]] constexpr int __mode = 0x26; // fpclass: +0, -0, denormal
+#else
+ [[maybe_unused]] constexpr int __mode = 0xbf; // fpclass: QNaN, SNaN, +inf, -inf, +0, -0, denormal
+#endif
+ if constexpr (__is_avx512_abi<_Abi>() && __have_avx512dq)
+ {
+ const auto __xi = __to_intrin(__x);
+ const auto __k1 = _Abi::template __implicit_mask<_Tp>();
+ if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+ return __k1 ^ _mm512_mask_fpclass_ps_mask(__k1, __xi, __mode);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+ return __k1 ^ _mm512_mask_fpclass_pd_mask(__k1, __xi, __mode);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return __k1 ^ _mm256_mask_fpclass_ps_mask(__k1, __xi, __mode);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return __k1 ^ _mm256_mask_fpclass_pd_mask(__k1, __xi, __mode);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return __k1 ^ _mm_mask_fpclass_ps_mask(__k1, __xi, __mode);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return __k1 ^ _mm_mask_fpclass_pd_mask(__k1, __xi, __mode);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_avx512dq)
+ {
+ if constexpr (__have_avx512vl && __is_sse_ps<_Tp, _Np>())
+ return __auto_bitcast(_mm_movm_epi32(
+ _knot_mask8(_mm_fpclass_ps_mask(__to_intrin(__x), __mode))));
+ else if constexpr (__have_avx512vl && __is_avx_ps<_Tp, _Np>())
+ return __vector_bitcast<float>(_mm256_movm_epi32(
+ _knot_mask8(_mm256_fpclass_ps_mask(__x, __mode))));
+ else if constexpr (__is_avx512_ps<_Tp, _Np>())
+ return _knot_mask16(_mm512_fpclass_ps_mask(__x, __mode));
+ else if constexpr (__have_avx512vl && __is_sse_pd<_Tp, _Np>())
+ return __vector_bitcast<double>(
+ _mm_movm_epi64(_knot_mask8(_mm_fpclass_pd_mask(__x, __mode))));
+ else if constexpr (__have_avx512vl && __is_avx_pd<_Tp, _Np>())
+ return __vector_bitcast<double>(_mm256_movm_epi64(
+ _knot_mask8(_mm256_fpclass_pd_mask(__x, __mode))));
+ else if constexpr (__is_avx512_pd<_Tp, _Np>())
+ return _knot_mask8(_mm512_fpclass_pd_mask(__x, __mode));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__is_avx512_abi<_Abi>())
+ {
+ using _I = __int_for_sizeof_t<_Tp>;
+ const auto __absn = __vector_bitcast<_I>(__abs(__x));
+ const auto __minn = __vector_bitcast<_I>(
+ __vector_broadcast<_Np>(std::numeric_limits<_Tp>::min()));
+#if __FINITE_MATH_ONLY__
+ return __less_equal<_I, _Np>(__minn, __absn);
+#else
+ const auto __infn = __vector_bitcast<_I>(
+ __vector_broadcast<_Np>(std::numeric_limits<_Tp>::infinity()));
+ return __and(__less_equal<_I, _Np>(__minn, __absn),
+ __less<_I, _Np>(__absn, __infn));
+#endif
+ }
+ else
+ return _Base::__isnormal(__x);
+ }
+
+ // }}}
+ // __isnan {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+ __isnan(_SimdWrapper<_Tp, _Np> __x)
+ {
+ return __isunordered(__x, __x);
+ }
+
+ // }}}
+ // __isunordered {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+ __isunordered(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
+ {
+#if __FINITE_MATH_ONLY__
+ [](auto&&...){}(__x, __y);
+ return {}; // false
+#else
+ const auto __xi = __to_intrin(__x);
+ const auto __yi = __to_intrin(__y);
+ if constexpr (__is_avx512_abi<_Abi>())
+ {
+ constexpr auto __k1 = _Abi::template __implicit_mask<_Tp>();
+ if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+ return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_UNORD_Q);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+ return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_UNORD_Q);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_UNORD_Q);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_UNORD_Q);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_UNORD_Q);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_UNORD_Q);
+ }
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_cmp_ps(__xi, __yi, _CMP_UNORD_Q);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_cmp_pd(__xi, __yi, _CMP_UNORD_Q);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return __auto_bitcast(_mm_cmpunord_ps(__xi, __yi));
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return __auto_bitcast(_mm_cmpunord_pd(__xi, __yi));
+ else
+ __assert_unreachable<_Tp>();
+#endif
+ }
+
+ // }}}
+ // __isgreater {{{
+ template <typename _Tp, size_t _Np>
+ static constexpr _MaskMember<_Tp> __isgreater(_SimdWrapper<_Tp, _Np> __x,
+ _SimdWrapper<_Tp, _Np> __y)
+ {
+ const auto __xi = __to_intrin(__x);
+ const auto __yi = __to_intrin(__y);
+ if constexpr (__is_avx512_abi<_Abi>())
+ {
+ const auto __k1 = _Abi::template __implicit_mask<_Tp>();
+ if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+ return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_GT_OQ);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+ return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_GT_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_GT_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_GT_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_GT_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_GT_OQ);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_avx)
+ {
+ if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_cmp_ps(__xi, __yi, _CMP_GT_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_cmp_pd(__xi, __yi, _CMP_GT_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return __auto_bitcast(_mm_cmp_ps(__xi, __yi, _CMP_GT_OQ));
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_cmp_pd(__xi, __yi, _CMP_GT_OQ);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ {
+ const auto __xn = __vector_bitcast<int>(__xi);
+ const auto __yn = __vector_bitcast<int>(__yi);
+ const auto __xp = __xn < 0 ? -(__xn & 0x7fff'ffff) : __xn;
+ const auto __yp = __yn < 0 ? -(__yn & 0x7fff'ffff) : __yn;
+ return __auto_bitcast(__and(_mm_cmpord_ps(__xi, __yi),
+ reinterpret_cast<__m128>(__xp > __yp)));
+ }
+ else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return __auto_bitcast(__vector_type_t<__int_with_sizeof_t<8>, 2>{
+ -_mm_ucomigt_sd(__xi, __yi),
+ -_mm_ucomigt_sd(_mm_unpackhi_pd(__xi, __xi),
+ _mm_unpackhi_pd(__yi, __yi))});
+ else
+ return _Base::__isgreater(__x, __y);
+ }
+
+ // }}}
+ // __isgreaterequal {{{
+ template <typename _Tp, size_t _Np>
+ static constexpr _MaskMember<_Tp> __isgreaterequal(_SimdWrapper<_Tp, _Np> __x,
+ _SimdWrapper<_Tp, _Np> __y)
+ {
+ const auto __xi = __to_intrin(__x);
+ const auto __yi = __to_intrin(__y);
+ if constexpr (__is_avx512_abi<_Abi>())
+ {
+ const auto __k1 = _Abi::template __implicit_mask<_Tp>();
+ if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+ return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_GE_OQ);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+ return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_GE_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_GE_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_GE_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_GE_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_GE_OQ);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_avx)
+ {
+ if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_cmp_ps(__xi, __yi, _CMP_GE_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_cmp_pd(__xi, __yi, _CMP_GE_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return __auto_bitcast(_mm_cmp_ps(__xi, __yi, _CMP_GE_OQ));
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_cmp_pd(__xi, __yi, _CMP_GE_OQ);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ {
+ const auto __xn = __vector_bitcast<int>(__xi);
+ const auto __yn = __vector_bitcast<int>(__yi);
+ const auto __xp = __xn < 0 ? -(__xn & 0x7fff'ffff) : __xn;
+ const auto __yp = __yn < 0 ? -(__yn & 0x7fff'ffff) : __yn;
+ return __auto_bitcast(__and(_mm_cmpord_ps(__xi, __yi),
+ reinterpret_cast<__m128>(__xp >= __yp)));
+ }
+ else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return __auto_bitcast(__vector_type_t<__int_with_sizeof_t<8>, 2>{
+ -_mm_ucomige_sd(__xi, __yi),
+ -_mm_ucomige_sd(_mm_unpackhi_pd(__xi, __xi),
+ _mm_unpackhi_pd(__yi, __yi))});
+ else
+ return _Base::__isgreaterequal(__x, __y);
+ }
+
+ // }}}
+ // __isless {{{
+ template <typename _Tp, size_t _Np>
+ static constexpr _MaskMember<_Tp> __isless(_SimdWrapper<_Tp, _Np> __x,
+ _SimdWrapper<_Tp, _Np> __y)
+ {
+ const auto __xi = __to_intrin(__x);
+ const auto __yi = __to_intrin(__y);
+ if constexpr (__is_avx512_abi<_Abi>())
+ {
+ const auto __k1 = _Abi::template __implicit_mask<_Tp>();
+ if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+ return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LT_OQ);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+ return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LT_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LT_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LT_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LT_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LT_OQ);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_avx)
+ {
+ if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_cmp_ps(__xi, __yi, _CMP_LT_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_cmp_pd(__xi, __yi, _CMP_LT_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return __auto_bitcast(_mm_cmp_ps(__xi, __yi, _CMP_LT_OQ));
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_cmp_pd(__xi, __yi, _CMP_LT_OQ);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ {
+ const auto __xn = __vector_bitcast<int>(__xi);
+ const auto __yn = __vector_bitcast<int>(__yi);
+ const auto __xp = __xn < 0 ? -(__xn & 0x7fff'ffff) : __xn;
+ const auto __yp = __yn < 0 ? -(__yn & 0x7fff'ffff) : __yn;
+ return __auto_bitcast(__and(_mm_cmpord_ps(__xi, __yi),
+ reinterpret_cast<__m128>(__xp < __yp)));
+ }
+ else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return __auto_bitcast(__vector_type_t<__int_with_sizeof_t<8>, 2>{
+ -_mm_ucomigt_sd(__yi, __xi),
+ -_mm_ucomigt_sd(_mm_unpackhi_pd(__yi, __yi),
+ _mm_unpackhi_pd(__xi, __xi))});
+ else
+ return _Base::__isless(__x, __y);
+ }
+
+ // }}}
+ // __islessequal {{{
+ template <typename _Tp, size_t _Np>
+ static constexpr _MaskMember<_Tp> __islessequal(_SimdWrapper<_Tp, _Np> __x,
+ _SimdWrapper<_Tp, _Np> __y)
+ {
+ const auto __xi = __to_intrin(__x);
+ const auto __yi = __to_intrin(__y);
+ if constexpr (__is_avx512_abi<_Abi>())
+ {
+ const auto __k1 = _Abi::template __implicit_mask<_Tp>();
+ if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+ return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LE_OQ);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+ return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LE_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LE_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LE_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_LE_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_LE_OQ);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_avx)
+ {
+ if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_cmp_ps(__xi, __yi, _CMP_LE_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_cmp_pd(__xi, __yi, _CMP_LE_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return __auto_bitcast(_mm_cmp_ps(__xi, __yi, _CMP_LE_OQ));
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_cmp_pd(__xi, __yi, _CMP_LE_OQ);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ {
+ const auto __xn = __vector_bitcast<int>(__xi);
+ const auto __yn = __vector_bitcast<int>(__yi);
+ const auto __xp = __xn < 0 ? -(__xn & 0x7fff'ffff) : __xn;
+ const auto __yp = __yn < 0 ? -(__yn & 0x7fff'ffff) : __yn;
+ return __auto_bitcast(__and(_mm_cmpord_ps(__xi, __yi),
+ reinterpret_cast<__m128>(__xp <= __yp)));
+ }
+ else if constexpr (__have_sse2 && sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return __auto_bitcast(__vector_type_t<__int_with_sizeof_t<8>, 2>{
+ -_mm_ucomige_sd(__yi, __xi),
+ -_mm_ucomige_sd(_mm_unpackhi_pd(__yi, __yi),
+ _mm_unpackhi_pd(__xi, __xi))});
+ else
+ return _Base::__islessequal(__x, __y);
+ }
+
+ // }}}
+ // __islessgreater {{{
+ template <typename _Tp, size_t _Np>
+ static constexpr _MaskMember<_Tp> __islessgreater(_SimdWrapper<_Tp, _Np> __x,
+ _SimdWrapper<_Tp, _Np> __y)
+ {
+ const auto __xi = __to_intrin(__x);
+ const auto __yi = __to_intrin(__y);
+ if constexpr (__is_avx512_abi<_Abi>())
+ {
+ const auto __k1 = _Abi::template __implicit_mask<_Tp>();
+ if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 4)
+ return _mm512_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_NEQ_OQ);
+ else if constexpr (sizeof(__xi) == 64 && sizeof(_Tp) == 8)
+ return _mm512_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_NEQ_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_NEQ_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_NEQ_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return _mm_mask_cmp_ps_mask(__k1, __xi, __yi, _CMP_NEQ_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_mask_cmp_pd_mask(__k1, __xi, __yi, _CMP_NEQ_OQ);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__have_avx)
+ {
+ if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 4)
+ return _mm256_cmp_ps(__xi, __yi, _CMP_NEQ_OQ);
+ else if constexpr (sizeof(__xi) == 32 && sizeof(_Tp) == 8)
+ return _mm256_cmp_pd(__xi, __yi, _CMP_NEQ_OQ);
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return __auto_bitcast(_mm_cmp_ps(__xi, __yi, _CMP_NEQ_OQ));
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return _mm_cmp_pd(__xi, __yi, _CMP_NEQ_OQ);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 4)
+ return __auto_bitcast(
+ __and(_mm_cmpord_ps(__xi, __yi), _mm_cmpneq_ps(__xi, __yi)));
+ else if constexpr (sizeof(__xi) == 16 && sizeof(_Tp) == 8)
+ return __and(_mm_cmpord_pd(__xi, __yi), _mm_cmpneq_pd(__xi, __yi));
+ else
+ __assert_unreachable<_Tp>();
+ }
+
+ // }}}
+ // }}}
+};
+
+// }}}
+// _MaskImplX86Mixin {{{
+struct _MaskImplX86Mixin
+{
+ template <typename _Tp> using _TypeTag = _Tp*;
+ using _Base = _MaskImplBuiltinMixin;
+
+ // __to_maskvector(bool) {{{
+ template <typename _Up, size_t _ToN = 1, typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr enable_if_t<is_same_v<_Tp, bool>,
+ _SimdWrapper<_Up, _ToN>>
+ __to_maskvector(_Tp __x)
+ {
+ using _I = __int_for_sizeof_t<_Up>;
+ return __vector_bitcast<_Up>(__x ? __vector_type_t<_I, _ToN>{~_I()}
+ : __vector_type_t<_I, _ToN>{});
+ }
+
+ // }}}
+ // __to_maskvector(_SanitizedBitMask) {{{
+ template <typename _Up, size_t _UpN = 0, size_t _Np,
+ size_t _ToN = _UpN == 0 ? _Np : _UpN>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Up, _ToN>
+ __to_maskvector(_SanitizedBitMask<_Np> __x)
+ {
+ using _UV = __vector_type_t<_Up, _ToN>;
+ using _UI = __intrinsic_type_t<_Up, _ToN>;
+ [[maybe_unused]] const auto __k = __x._M_to_bits();
+ if constexpr (_Np == 1)
+ return __to_maskvector<_Up, _ToN>(__k);
+ else if (__x._M_is_constprop() || __builtin_is_constant_evaluated())
+ {
+ using _Ip = __int_for_sizeof_t<_Up>;
+ return __vector_bitcast<_Up>(
+ __generate_from_n_evaluations<std::min(_ToN, _Np),
+ __vector_type_t<_Ip, _ToN>>(
+ [&](auto __i) -> _Ip { return -__x[__i.value]; }));
+ }
+ else if constexpr (sizeof(_Up) == 1)
+ {
+ if constexpr (sizeof(_UI) == 16)
+ {
+ if constexpr (__have_avx512bw_vl)
+ return __intrin_bitcast<_UV>(_mm_movm_epi8(__k));
+ else if constexpr (__have_avx512bw)
+ return __intrin_bitcast<_UV>(__lo128(_mm512_movm_epi8(__k)));
+ else if constexpr (__have_avx512f)
+ {
+ auto __as32bits = _mm512_maskz_mov_epi32(__k, ~__m512i());
+ auto __as16bits = __xzyw(
+ _mm256_packs_epi32(__lo256(__as32bits), __hi256(__as32bits)));
+ return __intrin_bitcast<_UV>(
+ _mm_packs_epi16(__lo128(__as16bits), __hi128(__as16bits)));
+ }
+ else if constexpr (__have_ssse3)
+ {
+ const auto __bitmask = __to_intrin(
+ __make_vector<_UChar>(1, 2, 4, 8, 16, 32, 64, 128, 1, 2, 4, 8,
+ 16, 32, 64, 128));
+ return __intrin_bitcast<_UV>(
+ __vector_bitcast<_Up>(
+ _mm_shuffle_epi8(__to_intrin(
+ __vector_type_t<_ULLong, 2>{__k}),
+ _mm_setr_epi8(0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
+ 1, 1, 1, 1, 1, 1))
+ & __bitmask)
+ != 0);
+ }
+ // else fall through
+ }
+ else if constexpr (sizeof(_UI) == 32)
+ {
+ if constexpr (__have_avx512bw_vl)
+ return __vector_bitcast<_Up>(_mm256_movm_epi8(__k));
+ else if constexpr (__have_avx512bw)
+ return __vector_bitcast<_Up>(__lo256(_mm512_movm_epi8(__k)));
+ else if constexpr (__have_avx512f)
+ {
+ auto __as16bits = // 0 16 1 17 ... 15 31
+ _mm512_srli_epi32(_mm512_maskz_mov_epi32(__k, ~__m512i()), 16)
+ | _mm512_slli_epi32(_mm512_maskz_mov_epi32(__k >> 16,
+ ~__m512i()),
+ 16);
+ auto __0_16_1_17 = __xzyw(_mm256_packs_epi16(
+ __lo256(__as16bits),
+ __hi256(__as16bits)) // 0 16 1 17 2 18 3 19 8 24 9 25 ...
+ );
+ // deinterleave:
+ return __vector_bitcast<_Up>(__xzyw(_mm256_shuffle_epi8(
+ __0_16_1_17, // 0 16 1 17 2 ...
+ _mm256_setr_epi8(0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11,
+ 13, 15, 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5,
+ 7, 9, 11, 13,
+ 15)))); // 0-7 16-23 8-15 24-31 -> xzyw
+ // 0-3 8-11 16-19 24-27
+ // 4-7 12-15 20-23 28-31
+ }
+ else if constexpr (__have_avx2)
+ {
+ const auto __bitmask = _mm256_broadcastsi128_si256(__to_intrin(
+ __make_vector<_UChar>(1, 2, 4, 8, 16, 32, 64, 128, 1, 2, 4, 8,
+ 16, 32, 64, 128)));
+ return __vector_bitcast<_Up>(
+ __vector_bitcast<_Up>(
+ _mm256_shuffle_epi8(
+ _mm256_broadcastsi128_si256(
+ __to_intrin(__vector_type_t<_ULLong, 2>{__k})),
+ _mm256_setr_epi8(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1,
+ 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3,
+ 3, 3, 3, 3))
+ & __bitmask)
+ != 0);
+ }
+ // else fall through
+ }
+ else if constexpr (sizeof(_UI) == 64)
+ return reinterpret_cast<_UV>(
+ _mm512_movm_epi8(__k));
+ if constexpr (std::min(_ToN, _Np) <= 4)
+ {
+ if constexpr (_Np > 7) // avoid overflow
+ __x &= _SanitizedBitMask<_Np>(0x0f);
+ const _UInt __char_mask
+ = ((_UInt(__x.to_ulong()) * 0x00204081U) & 0x01010101ULL) * 0xff;
+ __vector_type_t<_Up, _ToN> __r = {};
+ __builtin_memcpy(&__r, &__char_mask,
+ std::min(sizeof(__r), sizeof(__char_mask)));
+ return __r;
+ }
+ else if constexpr (std::min(_ToN, _Np) <= 7)
+ {
+ if constexpr (_Np > 7) // avoid overflow
+ __x &= _SanitizedBitMask<_Np>(0x7f);
+ const _ULLong __char_mask
+ = ((__x.to_ulong() * 0x40810204081ULL) & 0x0101010101010101ULL)
+ * 0xff;
+ __vector_type_t<_Up, _ToN> __r = {};
+ __builtin_memcpy(&__r, &__char_mask,
+ std::min(sizeof(__r), sizeof(__char_mask)));
+ return __r;
+ }
+ }
+ else if constexpr (sizeof(_Up) == 2)
+ {
+ if constexpr (sizeof(_UI) == 16)
+ {
+ if constexpr (__have_avx512bw_vl)
+ return __intrin_bitcast<_UV>(_mm_movm_epi16(__k));
+ else if constexpr (__have_avx512bw)
+ return __intrin_bitcast<_UV>(__lo128(_mm512_movm_epi16(__k)));
+ else if constexpr (__have_avx512f)
+ {
+ __m256i __as32bits;
+ if constexpr (__have_avx512vl)
+ __as32bits = _mm256_maskz_mov_epi32(__k, ~__m256i());
+ else
+ __as32bits = __lo256(_mm512_maskz_mov_epi32(__k, ~__m512i()));
+ return __intrin_bitcast<_UV>(
+ _mm_packs_epi32(__lo128(__as32bits), __hi128(__as32bits)));
+ }
+ // else fall through
+ }
+ else if constexpr (sizeof(_UI) == 32)
+ {
+ if constexpr (__have_avx512bw_vl)
+ return __vector_bitcast<_Up>(_mm256_movm_epi16(__k));
+ else if constexpr (__have_avx512bw)
+ return __vector_bitcast<_Up>(__lo256(_mm512_movm_epi16(__k)));
+ else if constexpr (__have_avx512f)
+ {
+ auto __as32bits = _mm512_maskz_mov_epi32(__k, ~__m512i());
+ return __vector_bitcast<_Up>(
+ __xzyw(_mm256_packs_epi32(__lo256(__as32bits),
+ __hi256(__as32bits))));
+ }
+ // else fall through
+ }
+ else if constexpr (sizeof(_UI) == 64)
+ return __vector_bitcast<_Up>(_mm512_movm_epi16(__k));
+ }
+ else if constexpr (sizeof(_Up) == 4)
+ {
+ if constexpr (sizeof(_UI) == 16)
+ {
+ if constexpr (__have_avx512dq_vl)
+ return __intrin_bitcast<_UV>(_mm_movm_epi32(__k));
+ else if constexpr (__have_avx512dq)
+ return __intrin_bitcast<_UV>(__lo128(_mm512_movm_epi32(__k)));
+ else if constexpr (__have_avx512vl)
+ return __intrin_bitcast<_UV>(
+ _mm_maskz_mov_epi32(__k, ~__m128i()));
+ else if constexpr (__have_avx512f)
+ return __intrin_bitcast<_UV>(
+ __lo128(_mm512_maskz_mov_epi32(__k, ~__m512i())));
+ // else fall through
+ }
+ else if constexpr (sizeof(_UI) == 32)
+ {
+ if constexpr (__have_avx512dq_vl)
+ return __vector_bitcast<_Up>(_mm256_movm_epi32(__k));
+ else if constexpr (__have_avx512dq)
+ return __vector_bitcast<_Up>(__lo256(_mm512_movm_epi32(__k)));
+ else if constexpr (__have_avx512vl)
+ return __vector_bitcast<_Up>(
+ _mm256_maskz_mov_epi32(__k, ~__m256i()));
+ else if constexpr (__have_avx512f)
+ return __vector_bitcast<_Up>(
+ __lo256(_mm512_maskz_mov_epi32(__k, ~__m512i())));
+ // else fall through
+ }
+ else if constexpr (sizeof(_UI) == 64)
+ return __vector_bitcast<_Up>(
+ __have_avx512dq ? _mm512_movm_epi32(__k)
+ : _mm512_maskz_mov_epi32(__k, ~__m512i()));
+ }
+ else if constexpr (sizeof(_Up) == 8)
+ {
+ if constexpr (sizeof(_UI) == 16)
+ {
+ if constexpr (__have_avx512dq_vl)
+ return __vector_bitcast<_Up>(_mm_movm_epi64(__k));
+ else if constexpr (__have_avx512dq)
+ return __vector_bitcast<_Up>(__lo128(_mm512_movm_epi64(__k)));
+ else if constexpr (__have_avx512vl)
+ return __vector_bitcast<_Up>(
+ _mm_maskz_mov_epi64(__k, ~__m128i()));
+ else if constexpr (__have_avx512f)
+ return __vector_bitcast<_Up>(
+ __lo128(_mm512_maskz_mov_epi64(__k, ~__m512i())));
+ // else fall through
+ }
+ else if constexpr (sizeof(_UI) == 32)
+ {
+ if constexpr (__have_avx512dq_vl)
+ return __vector_bitcast<_Up>(_mm256_movm_epi64(__k));
+ else if constexpr (__have_avx512dq)
+ return __vector_bitcast<_Up>(__lo256(_mm512_movm_epi64(__k)));
+ else if constexpr (__have_avx512vl)
+ return __vector_bitcast<_Up>(
+ _mm256_maskz_mov_epi64(__k, ~__m256i()));
+ else if constexpr (__have_avx512f)
+ return __vector_bitcast<_Up>(
+ __lo256(_mm512_maskz_mov_epi64(__k, ~__m512i())));
+ // else fall through
+ }
+ else if constexpr (sizeof(_UI) == 64)
+ return __vector_bitcast<_Up>(
+ __have_avx512dq ? _mm512_movm_epi64(__k)
+ : _mm512_maskz_mov_epi64(__k, ~__m512i()));
+ }
+
+ using _UpUInt = std::make_unsigned_t<__int_for_sizeof_t<_Up>>;
+ using _V = __vector_type_t<_UpUInt, _ToN>;
+ constexpr size_t __bits_per_element = sizeof(_Up) * CHAR_BIT;
+ if constexpr (_ToN == 2)
+ {
+ return __vector_bitcast<_Up>(_V{_UpUInt(-__x[0]), _UpUInt(-__x[1])});
+ }
+ else if constexpr (!__have_avx2 && __have_avx && sizeof(_V) == 32)
+ {
+ if constexpr (sizeof(_Up) == 4)
+ return __vector_bitcast<_Up>(_mm256_cmp_ps(
+ _mm256_and_ps(_mm256_castsi256_ps(_mm256_set1_epi32(__k)),
+ _mm256_castsi256_ps(_mm256_setr_epi32(
+ 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80))),
+ _mm256_setzero_ps(), _CMP_NEQ_UQ));
+ else if constexpr (sizeof(_Up) == 8)
+ return __vector_bitcast<_Up>(_mm256_cmp_pd(
+ _mm256_and_pd(_mm256_castsi256_pd(_mm256_set1_epi64x(__k)),
+ _mm256_castsi256_pd(
+ _mm256_setr_epi64x(0x01, 0x02, 0x04, 0x08))),
+ _mm256_setzero_pd(), _CMP_NEQ_UQ));
+ else
+ __assert_unreachable<_Up>();
+ }
+ else if constexpr (__bits_per_element >= _ToN)
+ {
+ constexpr auto __bitmask
+ = __generate_vector<__vector_type_t<_UpUInt, _ToN>>(
+ [](auto __i) constexpr->_UpUInt {
+ return __i < _ToN ? 1ull << __i : 0;
+ });
+ const auto __bits = __vector_broadcast<_ToN, _UpUInt>(__k) & __bitmask;
+ if constexpr (__bits_per_element > _ToN)
+ return __vector_bitcast<_Up>(
+ __vector_bitcast<__int_for_sizeof_t<_Up>>(__bits) > 0);
+ else
+ return __vector_bitcast<_Up>(__bits != 0);
+ }
+ else
+ {
+ const _V __tmp
+ = __generate_vector<_V>([&](auto __i) constexpr {
+ return static_cast<_UpUInt>(
+ __k >> (__bits_per_element * (__i / __bits_per_element)));
+ })
+ & __generate_vector<_V>([](auto __i) constexpr {
+ return static_cast<_UpUInt>(1ull << (__i % __bits_per_element));
+ }); // mask bit index
+ return __vector_bitcast<_Up>(__tmp != _V());
+ }
+ }
+
+ // }}}
+ // __to_maskvector(_SimdWrapper) {{{
+ template <typename _Up, size_t _UpN = 0, typename _Tp, size_t _Np,
+ size_t _ToN = _UpN == 0 ? _Np : _UpN>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Up, _ToN>
+ __to_maskvector(_SimdWrapper<_Tp, _Np> __x)
+ {
+ using _TW = _SimdWrapper<_Tp, _Np>;
+ using _UW = _SimdWrapper<_Up, _ToN>;
+ using _UI = __intrinsic_type_t<_Up, _ToN>;
+ if constexpr (sizeof(_Up) == sizeof(_Tp) && sizeof(_TW) == sizeof(_UW))
+ if constexpr (_ToN <= _Np)
+ return __wrapper_bitcast<_Up, _ToN>(__x);
+ else
+ return simd_abi::deduce_t<_Up, _ToN>::__masked(
+ __wrapper_bitcast<_Up, _ToN>(__x));
+ else if constexpr (is_same_v<_Tp, bool>) // bits -> vector
+ return __to_maskvector<_Up, _ToN>(
+ _BitMask<_Np>(__x._M_data)._M_sanitized());
+ else
+ { // vector -> vector {{{
+ if (__x._M_is_constprop() || __builtin_is_constant_evaluated())
+ {
+ const auto __y = __vector_bitcast<__int_for_sizeof_t<_Tp>>(__x);
+ using _Ip = __int_for_sizeof_t<_Up>;
+ return __vector_bitcast<_Up>(
+ __generate_from_n_evaluations<std::min(_ToN, _Np),
+ __vector_type_t<_Ip, _ToN>>(
+ [&](auto __i) -> _Ip { return __y[__i.value]; }));
+ }
+ using _To = __vector_type_t<_Up, _ToN>;
+ [[maybe_unused]] constexpr size_t _FromN = _Np;
+ constexpr int _FromBytes = sizeof(_Tp);
+ constexpr int _ToBytes = sizeof(_Up);
+ const auto __k = __x._M_data;
+
+ if constexpr (_FromBytes == _ToBytes)
+ return __intrin_bitcast<_To>(__k);
+ else if constexpr (sizeof(_UI) == 16 && sizeof(__k) == 16)
+ { // SSE -> SSE {{{
+ if constexpr (_FromBytes == 4 && _ToBytes == 8)
+ return __intrin_bitcast<_To>(__interleave128_lo(__k, __k));
+ else if constexpr (_FromBytes == 2 && _ToBytes == 8)
+ {
+ const auto __y
+ = __vector_bitcast<int>(__interleave128_lo(__k, __k));
+ return __intrin_bitcast<_To>(__interleave128_lo(__y, __y));
+ }
+ else if constexpr (_FromBytes == 1 && _ToBytes == 8)
+ {
+ auto __y
+ = __vector_bitcast<short>(__interleave128_lo(__k, __k));
+ auto __z = __vector_bitcast<int>(__interleave128_lo(__y, __y));
+ return __intrin_bitcast<_To>(__interleave128_lo(__z, __z));
+ }
+ else if constexpr (_FromBytes == 8 && _ToBytes == 4 && __have_sse2)
+ return __intrin_bitcast<_To>(
+ _mm_packs_epi32(__vector_bitcast<_LLong>(__k), __m128i()));
+ else if constexpr (_FromBytes == 8 && _ToBytes == 4)
+ return __vector_shuffle<1, 3, 6, 7>(__vector_bitcast<_Up>(__k),
+ _UI());
+ else if constexpr (_FromBytes == 2 && _ToBytes == 4)
+ return __intrin_bitcast<_To>(__interleave128_lo(__k, __k));
+ else if constexpr (_FromBytes == 1 && _ToBytes == 4)
+ {
+ const auto __y
+ = __vector_bitcast<short>(__interleave128_lo(__k, __k));
+ return __intrin_bitcast<_To>(__interleave128_lo(__y, __y));
+ }
+ else if constexpr (_FromBytes == 8 && _ToBytes == 2)
+ {
+ if constexpr (__have_sse2 && !__have_ssse3)
+ return __intrin_bitcast<_To>(_mm_packs_epi32(
+ _mm_packs_epi32(__vector_bitcast<_LLong>(__k), __m128i()),
+ __m128i()));
+ else
+ return __intrin_bitcast<_To>(
+ __vector_permute<3, 7, -1, -1, -1, -1, -1, -1>(
+ __vector_bitcast<_Up>(__k)));
+ }
+ else if constexpr (_FromBytes == 4 && _ToBytes == 2)
+ return __intrin_bitcast<_To>(
+ _mm_packs_epi32(__vector_bitcast<_LLong>(__k), __m128i()));
+ else if constexpr (_FromBytes == 1 && _ToBytes == 2)
+ return __intrin_bitcast<_To>(__interleave128_lo(__k, __k));
+ else if constexpr (_FromBytes == 8 && _ToBytes == 1 && __have_ssse3)
+ return __intrin_bitcast<_To>(
+ _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+ _mm_setr_epi8(7, 15, -1, -1, -1, -1, -1, -1,
+ -1, -1, -1, -1, -1, -1, -1,
+ -1)));
+ else if constexpr (_FromBytes == 8 && _ToBytes == 1)
+ {
+ auto __y
+ = _mm_packs_epi32(__vector_bitcast<_LLong>(__k), __m128i());
+ __y = _mm_packs_epi32(__y, __m128i());
+ return __intrin_bitcast<_To>(_mm_packs_epi16(__y, __m128i()));
+ }
+ else if constexpr (_FromBytes == 4 && _ToBytes == 1 && __have_ssse3)
+ return __intrin_bitcast<_To>(
+ _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+ _mm_setr_epi8(3, 7, 11, 15, -1, -1, -1, -1, -1,
+ -1, -1, -1, -1, -1, -1, -1)));
+ else if constexpr (_FromBytes == 4 && _ToBytes == 1)
+ {
+ const auto __y
+ = _mm_packs_epi32(__vector_bitcast<_LLong>(__k), __m128i());
+ return __intrin_bitcast<_To>(_mm_packs_epi16(__y, __m128i()));
+ }
+ else if constexpr (_FromBytes == 2 && _ToBytes == 1)
+ return __intrin_bitcast<_To>(
+ _mm_packs_epi16(__vector_bitcast<_LLong>(__k), __m128i()));
+ else
+ __assert_unreachable<_Tp>();
+ } // }}}
+ else if constexpr (sizeof(_UI) == 32 && sizeof(__k) == 32)
+ { // AVX -> AVX {{{
+ if constexpr (_FromBytes == _ToBytes)
+ __assert_unreachable<_Tp>();
+ else if constexpr (_FromBytes == _ToBytes * 2)
+ {
+ const auto __y = __vector_bitcast<_LLong>(__k);
+ return __intrin_bitcast<_To>(_mm256_castsi128_si256(
+ _mm_packs_epi16(__lo128(__y), __hi128(__y))));
+ }
+ else if constexpr (_FromBytes == _ToBytes * 4)
+ {
+ const auto __y = __vector_bitcast<_LLong>(__k);
+ return __intrin_bitcast<_To>(_mm256_castsi128_si256(
+ _mm_packs_epi16(_mm_packs_epi16(__lo128(__y), __hi128(__y)),
+ __m128i())));
+ }
+ else if constexpr (_FromBytes == _ToBytes * 8)
+ {
+ const auto __y = __vector_bitcast<_LLong>(__k);
+ return __intrin_bitcast<_To>(_mm256_castsi128_si256(
+ _mm_shuffle_epi8(_mm_packs_epi16(__lo128(__y), __hi128(__y)),
+ _mm_setr_epi8(3, 7, 11, 15, -1, -1, -1, -1,
+ -1, -1, -1, -1, -1, -1, -1,
+ -1))));
+ }
+ else if constexpr (_FromBytes * 2 == _ToBytes)
+ {
+ auto __y = __xzyw(__to_intrin(__k));
+ if constexpr (std::is_floating_point_v<_Tp>)
+ return __intrin_bitcast<_To>(_mm256_unpacklo_ps(__y, __y));
+ else
+ return __intrin_bitcast<_To>(_mm256_unpacklo_epi8(__y, __y));
+ }
+ else if constexpr (_FromBytes * 4 == _ToBytes)
+ {
+ auto __y
+ = _mm_unpacklo_epi8(__lo128(__vector_bitcast<_LLong>(__k)),
+ __lo128(__vector_bitcast<_LLong>(
+ __k))); // drops 3/4 of input
+ return __intrin_bitcast<_To>(
+ __concat(_mm_unpacklo_epi16(__y, __y),
+ _mm_unpackhi_epi16(__y, __y)));
+ }
+ else if constexpr (_FromBytes == 1 && _ToBytes == 8)
+ {
+ auto __y
+ = _mm_unpacklo_epi8(__lo128(__vector_bitcast<_LLong>(__k)),
+ __lo128(__vector_bitcast<_LLong>(
+ __k))); // drops 3/4 of input
+ __y = _mm_unpacklo_epi16(__y,
+ __y); // drops another 1/2 => 7/8 total
+ return __intrin_bitcast<_To>(
+ __concat(_mm_unpacklo_epi32(__y, __y),
+ _mm_unpackhi_epi32(__y, __y)));
+ }
+ else
+ __assert_unreachable<_Tp>();
+ } // }}}
+ else if constexpr (sizeof(_UI) == 32 && sizeof(__k) == 16)
+ { // SSE -> AVX {{{
+ if constexpr (_FromBytes == _ToBytes)
+ return __intrin_bitcast<_To>(
+ __intrinsic_type_t<_Tp, 32 / sizeof(_Tp)>(
+ __zero_extend(__to_intrin(__k))));
+ else if constexpr (_FromBytes * 2 == _ToBytes)
+ { // keep all
+ return __intrin_bitcast<_To>(
+ __concat(_mm_unpacklo_epi8(__vector_bitcast<_LLong>(__k),
+ __vector_bitcast<_LLong>(__k)),
+ _mm_unpackhi_epi8(__vector_bitcast<_LLong>(__k),
+ __vector_bitcast<_LLong>(__k))));
+ }
+ else if constexpr (_FromBytes * 4 == _ToBytes)
+ {
+ if constexpr (__have_avx2)
+ {
+ return __intrin_bitcast<_To>(_mm256_shuffle_epi8(
+ __concat(__vector_bitcast<_LLong>(__k),
+ __vector_bitcast<_LLong>(__k)),
+ _mm256_setr_epi8(0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3,
+ 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6,
+ 7, 7, 7, 7)));
+ }
+ else
+ {
+ return __intrin_bitcast<_To>(__concat(
+ _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+ _mm_setr_epi8(0, 0, 0, 0, 1, 1, 1, 1, 2,
+ 2, 2, 2, 3, 3, 3, 3)),
+ _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+ _mm_setr_epi8(4, 4, 4, 4, 5, 5, 5, 5, 6,
+ 6, 6, 6, 7, 7, 7, 7))));
+ }
+ }
+ else if constexpr (_FromBytes * 8 == _ToBytes)
+ {
+ if constexpr (__have_avx2)
+ {
+ return __intrin_bitcast<_To>(_mm256_shuffle_epi8(
+ __concat(__vector_bitcast<_LLong>(__k),
+ __vector_bitcast<_LLong>(__k)),
+ _mm256_setr_epi8(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1,
+ 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3,
+ 3, 3, 3, 3)));
+ }
+ else
+ {
+ return __intrin_bitcast<_To>(__concat(
+ _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+ _mm_setr_epi8(0, 0, 0, 0, 0, 0, 0, 0, 1,
+ 1, 1, 1, 1, 1, 1, 1)),
+ _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+ _mm_setr_epi8(2, 2, 2, 2, 2, 2, 2, 2, 3,
+ 3, 3, 3, 3, 3, 3, 3))));
+ }
+ }
+ else if constexpr (_FromBytes == _ToBytes * 2)
+ return __intrin_bitcast<_To>(__m256i(__zero_extend(
+ _mm_packs_epi16(__vector_bitcast<_LLong>(__k), __m128i()))));
+ else if constexpr (_FromBytes == 8 && _ToBytes == 2)
+ {
+ return __intrin_bitcast<_To>(__m256i(__zero_extend(
+ _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+ _mm_setr_epi8(6, 7, 14, 15, -1, -1, -1, -1,
+ -1, -1, -1, -1, -1, -1, -1,
+ -1)))));
+ }
+ else if constexpr (_FromBytes == 4 && _ToBytes == 1)
+ {
+ return __intrin_bitcast<_To>(__m256i(__zero_extend(
+ _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+ _mm_setr_epi8(3, 7, 11, 15, -1, -1, -1, -1,
+ -1, -1, -1, -1, -1, -1, -1,
+ -1)))));
+ }
+ else if constexpr (_FromBytes == 8 && _ToBytes == 1)
+ {
+ return __intrin_bitcast<_To>(__m256i(__zero_extend(
+ _mm_shuffle_epi8(__vector_bitcast<_LLong>(__k),
+ _mm_setr_epi8(7, 15, -1, -1, -1, -1, -1, -1,
+ -1, -1, -1, -1, -1, -1, -1,
+ -1)))));
+ }
+ else
+ __assert_unreachable<_Tp>();
+ } // }}}
+ else if constexpr (sizeof(_UI) == 16 && sizeof(__k) == 32)
+ { // AVX -> SSE {{{
+ if constexpr (_FromBytes == _ToBytes)
+ { // keep low 1/2
+ return __intrin_bitcast<_To>(__lo128(__k));
+ }
+ else if constexpr (_FromBytes == _ToBytes * 2)
+ { // keep all
+ auto __y = __vector_bitcast<_LLong>(__k);
+ return __intrin_bitcast<_To>(
+ _mm_packs_epi16(__lo128(__y), __hi128(__y)));
+ }
+ else if constexpr (_FromBytes == _ToBytes * 4)
+ { // add 1/2 undef
+ auto __y = __vector_bitcast<_LLong>(__k);
+ return __intrin_bitcast<_To>(
+ _mm_packs_epi16(_mm_packs_epi16(__lo128(__y), __hi128(__y)),
+ __m128i()));
+ }
+ else if constexpr (_FromBytes == 8 && _ToBytes == 1)
+ { // add 3/4 undef
+ auto __y = __vector_bitcast<_LLong>(__k);
+ return __intrin_bitcast<_To>(
+ _mm_shuffle_epi8(_mm_packs_epi16(__lo128(__y), __hi128(__y)),
+ _mm_setr_epi8(3, 7, 11, 15, -1, -1, -1, -1,
+ -1, -1, -1, -1, -1, -1, -1,
+ -1)));
+ }
+ else if constexpr (_FromBytes * 2 == _ToBytes)
+ { // keep low 1/4
+ auto __y = __lo128(__vector_bitcast<_LLong>(__k));
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi8(__y, __y));
+ }
+ else if constexpr (_FromBytes * 4 == _ToBytes)
+ { // keep low 1/8
+ auto __y = __lo128(__vector_bitcast<_LLong>(__k));
+ __y = _mm_unpacklo_epi8(__y, __y);
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi8(__y, __y));
+ }
+ else if constexpr (_FromBytes * 8 == _ToBytes)
+ { // keep low 1/16
+ auto __y = __lo128(__vector_bitcast<_LLong>(__k));
+ __y = _mm_unpacklo_epi8(__y, __y);
+ __y = _mm_unpacklo_epi8(__y, __y);
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi8(__y, __y));
+ }
+ else
+ __assert_unreachable<_Tp>();
+ } // }}}
+ else
+ return _Base::template __to_maskvector<_Up, _ToN>(__x);
+ /*
+ if constexpr (_FromBytes > _ToBytes) {
+ const _To __y = __vector_bitcast<_Up>(__k);
+ return [&] <std::size_t... _Is> (std::index_sequence<_Is...>) {
+ constexpr int _Stride = _FromBytes / _ToBytes;
+ return _To{__y[(_Is + 1) * _Stride - 1]...};
+ }(std::make_index_sequence<std::min(_ToN, _FromN)>());
+ } else {
+ // {0, 0, 1, 1} (__dup = 2, _Is<4>)
+ // {0, 0, 0, 0, 1, 1, 1, 1} (__dup = 4, _Is<8>)
+ // {0, 0, 1, 1, 2, 2, 3, 3} (__dup = 2, _Is<8>)
+ // ...
+ return [&] <std::size_t... _Is> (std::index_sequence<_Is...>) {
+ constexpr int __dup = _ToBytes / _FromBytes;
+ return __intrin_bitcast<_To>(_From{__k[_Is / __dup]...});
+ }(std::make_index_sequence<_FromN>());
+ }
+ */
+ } // }}}
+ }
+
+ // }}}
+ // __to_bits {{{
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SanitizedBitMask<_Np>
+ __to_bits(_SimdWrapper<_Tp, _Np> __x)
+ {
+ if constexpr (is_same_v<_Tp, bool>)
+ return _BitMask<_Np>(__x._M_data)._M_sanitized();
+ else
+ {
+ if (__builtin_is_constant_evaluated() || __builtin_constant_p(__x._M_data))
+ {
+ using _I = __int_for_sizeof_t<_Tp>;
+ const auto __bools = -__vector_bitcast<_I>(__x);
+ _ULLong __k = 0;
+ __execute_n_times<_Np>([&](auto __i) {
+ __k |= (_ULLong(__bools[int(__i)]) << __i);
+ });
+ if (__builtin_is_constant_evaluated() || __builtin_constant_p(__k))
+ return __k;
+ }
+ const auto __xi = __to_intrin(__x);
+ if constexpr (is_floating_point_v<_Tp>)
+ if constexpr (sizeof(_Tp) == 4) // float
+ if constexpr (sizeof(__xi) == 16)
+ return _BitMask<_Np>(_mm_movemask_ps(__xi));
+ else if constexpr (sizeof(__xi) == 32)
+ return _BitMask<_Np>(_mm256_movemask_ps(__xi));
+ else if constexpr (__have_avx512dq)
+ return _BitMask<_Np>(
+ _mm512_movepi32_mask(reinterpret_cast<__m512i>(__xi)));
+ else // true elements are all-ones, i.e. NaN: unordered self-compare sets their bit
+ return _BitMask<_Np>(
+ _mm512_cmp_ps_mask(__xi, __xi, _CMP_UNORD_Q));
+ else // implies double
+ if constexpr (sizeof(__xi) == 16)
+ return _BitMask<_Np>(_mm_movemask_pd(__xi));
+ else if constexpr (sizeof(__xi) == 32)
+ return _BitMask<_Np>(_mm256_movemask_pd(__xi));
+ else if constexpr (__have_avx512dq)
+ return _BitMask<_Np>(
+ _mm512_movepi64_mask(reinterpret_cast<__m512i>(__xi)));
+ else // true elements are all-ones, i.e. NaN: unordered self-compare sets their bit
+ return _BitMask<_Np>(_mm512_cmp_pd_mask(__xi, __xi, _CMP_UNORD_Q));
+
+ else if constexpr (sizeof(_Tp) == 1)
+ if constexpr (sizeof(__xi) == 16)
+ if constexpr (__have_avx512bw_vl)
+ return _BitMask<_Np>(_mm_movepi8_mask(__xi));
+ else // implies SSE2
+ return _BitMask<_Np>(_mm_movemask_epi8(__xi));
+ else if constexpr (sizeof(__xi) == 32)
+ if constexpr (__have_avx512bw_vl)
+ return _BitMask<_Np>(_mm256_movepi8_mask(__xi));
+ else // implies AVX2
+ return _BitMask<_Np>(_mm256_movemask_epi8(__xi));
+ else // implies AVX512BW
+ return _BitMask<_Np>(_mm512_movepi8_mask(__xi));
+
+ else if constexpr (sizeof(_Tp) == 2)
+ if constexpr (sizeof(__xi) == 16)
+ if constexpr (__have_avx512bw_vl)
+ return _BitMask<_Np>(_mm_movepi16_mask(__xi));
+ else if constexpr (__have_avx512bw)
+ return _BitMask<_Np>(_mm512_movepi16_mask(__zero_extend(__xi)));
+ else // implies SSE2
+ return _BitMask<_Np>(
+ _mm_movemask_epi8(_mm_packs_epi16(__xi, __m128i())));
+ else if constexpr (sizeof(__xi) == 32)
+ if constexpr (__have_avx512bw_vl)
+ return _BitMask<_Np>(_mm256_movepi16_mask(__xi));
+ else if constexpr (__have_avx512bw)
+ return _BitMask<_Np>(_mm512_movepi16_mask(__zero_extend(__xi)));
+ else // implies AVX
+ return _BitMask<_Np>(_mm_movemask_epi8(
+ _mm_packs_epi16(__lo128(__xi), __hi128(__xi))));
+ else // implies AVX512BW
+ return _BitMask<_Np>(_mm512_movepi16_mask(__xi));
+
+ else if constexpr (sizeof(_Tp) == 4)
+ if constexpr (sizeof(__xi) == 16)
+ if constexpr (__have_avx512dq_vl)
+ return _BitMask<_Np>(_mm_movepi32_mask(__xi));
+ else if constexpr (__have_avx512vl)
+ return _BitMask<_Np>(_mm_cmplt_epi32_mask(__xi, __m128i()));
+ else if constexpr (__have_avx512dq)
+ return _BitMask<_Np>(_mm512_movepi32_mask(__zero_extend(__xi)));
+ else if constexpr (__have_avx512f)
+ return _BitMask<_Np>(
+ _mm512_cmplt_epi32_mask(__zero_extend(__xi), __m512i()));
+ else // implies SSE
+ return _BitMask<_Np>(
+ _mm_movemask_ps(reinterpret_cast<__m128>(__xi)));
+ else if constexpr (sizeof(__xi) == 32)
+ if constexpr (__have_avx512dq_vl)
+ return _BitMask<_Np>(_mm256_movepi32_mask(__xi));
+ else if constexpr (__have_avx512dq)
+ return _BitMask<_Np>(_mm512_movepi32_mask(__zero_extend(__xi)));
+ else if constexpr (__have_avx512vl)
+ return _BitMask<_Np>(_mm256_cmplt_epi32_mask(__xi, __m256i()));
+ else if constexpr (__have_avx512f)
+ return _BitMask<_Np>(
+ _mm512_cmplt_epi32_mask(__zero_extend(__xi), __m512i()));
+ else // implies AVX
+ return _BitMask<_Np>(
+ _mm256_movemask_ps(reinterpret_cast<__m256>(__xi)));
+ else // implies AVX512??
+ if constexpr (__have_avx512dq)
+ return _BitMask<_Np>(_mm512_movepi32_mask(__xi));
+ else // implies AVX512F
+ return _BitMask<_Np>(_mm512_cmplt_epi32_mask(__xi, __m512i()));
+
+ else if constexpr (sizeof(_Tp) == 8)
+ if constexpr (sizeof(__xi) == 16)
+ if constexpr (__have_avx512dq_vl)
+ return _BitMask<_Np>(_mm_movepi64_mask(__xi));
+ else if constexpr (__have_avx512dq)
+ return _BitMask<_Np>(_mm512_movepi64_mask(__zero_extend(__xi)));
+ else if constexpr (__have_avx512vl)
+ return _BitMask<_Np>(_mm_cmplt_epi64_mask(__xi, __m128i()));
+ else if constexpr (__have_avx512f)
+ return _BitMask<_Np>(
+ _mm512_cmplt_epi64_mask(__zero_extend(__xi), __m512i()));
+ else // implies SSE2
+ return _BitMask<_Np>(
+ _mm_movemask_pd(reinterpret_cast<__m128d>(__xi)));
+ else if constexpr (sizeof(__xi) == 32)
+ if constexpr (__have_avx512dq_vl)
+ return _BitMask<_Np>(_mm256_movepi64_mask(__xi));
+ else if constexpr (__have_avx512dq)
+ return _BitMask<_Np>(_mm512_movepi64_mask(__zero_extend(__xi)));
+ else if constexpr (__have_avx512vl)
+ return _BitMask<_Np>(_mm256_cmplt_epi64_mask(__xi, __m256i()));
+ else if constexpr (__have_avx512f)
+ return _BitMask<_Np>(
+ _mm512_cmplt_epi64_mask(__zero_extend(__xi), __m512i()));
+ else // implies AVX
+ return _BitMask<_Np>(
+ _mm256_movemask_pd(reinterpret_cast<__m256d>(__xi)));
+ else // implies AVX512??
+ if constexpr (__have_avx512dq)
+ return _BitMask<_Np>(_mm512_movepi64_mask(__xi));
+ else // implies AVX512F
+ return _BitMask<_Np>(_mm512_cmplt_epi64_mask(__xi, __m512i()));
+
+ else
+ __assert_unreachable<_Tp>();
+ }
+ }
+ // }}}
+};
+
+// }}}
+// _MaskImplX86 {{{
+template <typename _Abi>
+struct _MaskImplX86 : _MaskImplX86Mixin, _MaskImplBuiltin<_Abi>
+{
+ using _MaskImplX86Mixin::__to_bits;
+ using _MaskImplX86Mixin::__to_maskvector;
+ using _MaskImplBuiltin<_Abi>::__convert;
+
+ // member types {{{
+ template <typename _Tp>
+ using _SimdMember = typename _Abi::template __traits<_Tp>::_SimdMember;
+ template <typename _Tp>
+ using _MaskMember = typename _Abi::template __traits<_Tp>::_MaskMember;
+ template <typename _Tp> static constexpr size_t size = simd_size_v<_Tp, _Abi>;
+ using _Base = _MaskImplBuiltin<_Abi>;
+
+ // }}}
+ // __broadcast {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+ __broadcast(bool __x)
+ {
+ if constexpr (__is_avx512_abi<_Abi>())
+ return __x ? _Abi::__masked(_MaskMember<_Tp>(-1)) : _MaskMember<_Tp>();
+ else
+ return _Base::template __broadcast<_Tp>(__x);
+ }
+
+ // }}}
+ // __load {{{
+ template <typename _Tp, typename _Flags>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember<_Tp>
+ __load(const bool* __bool_mem)
+ {
+ const void* __mem = __bool_mem;
+ if constexpr (is_same_v<_Flags, vector_aligned_tag>)
+ __mem
+ = __builtin_assume_aligned(__mem,
+ memory_alignment_v<simd_mask<_Tp, _Abi>>);
+ else if constexpr (!is_same_v<_Flags, element_aligned_tag>)
+ __mem = __builtin_assume_aligned(__mem, _Flags::_S_alignment);
+
+ if constexpr (__have_avx512bw)
+ {
+ const auto __to_vec_or_bits = [](auto __bits) -> decltype(auto) {
+ if constexpr (__is_avx512_abi<_Abi>())
+ return __bits;
+ else
+ return __to_maskvector<_Tp>(
+ _BitMask<size<_Tp>>(__bits)._M_sanitized());
+ };
+
+ if constexpr (size<_Tp> <= 16 && __have_avx512vl)
+ {
+ __m128i __a = {};
+ __builtin_memcpy(&__a, __mem, size<_Tp>);
+ return __to_vec_or_bits(_mm_test_epi8_mask(__a, __a));
+ }
+ else if constexpr (size<_Tp> <= 32 && __have_avx512vl)
+ {
+ __m256i __a = {};
+ __builtin_memcpy(&__a, __mem, size<_Tp>);
+ return __to_vec_or_bits(_mm256_test_epi8_mask(__a, __a));
+ }
+ else if constexpr (size<_Tp> <= 64)
+ {
+ __m512i __a = {};
+ __builtin_memcpy(&__a, __mem, size<_Tp>);
+ return __to_vec_or_bits(_mm512_test_epi8_mask(__a, __a));
+ }
+ }
+ else if constexpr (__is_avx512_abi<_Abi>())
+ {
+ if constexpr (size<_Tp> <= 8)
+ {
+ __m128i __a = {};
+ __builtin_memcpy(&__a, __mem, size<_Tp>);
+ const auto __b = _mm512_cvtepi8_epi64(__a);
+ return _mm512_test_epi64_mask(__b, __b);
+ }
+ else if constexpr (size<_Tp> <= 16)
+ {
+ __m128i __a = {};
+ __builtin_memcpy(&__a, __mem, size<_Tp>);
+ const auto __b = _mm512_cvtepi8_epi32(__a);
+ return _mm512_test_epi32_mask(__b, __b);
+ }
+ else if constexpr (size<_Tp> <= 32)
+ {
+ __m128i __a = {};
+ __builtin_memcpy(&__a, __mem, 16);
+ const auto __b = _mm512_cvtepi8_epi32(__a);
+ __builtin_memcpy(&__a, __mem + 16, size<_Tp> - 16);
+ const auto __c = _mm512_cvtepi8_epi32(__a);
+ return _mm512_test_epi32_mask(__b, __b)
+ | (_mm512_test_epi32_mask(__c, __c) << 16);
+ }
+ else if constexpr (size<_Tp> <= 64)
+ {
+ __m128i __a = {};
+ __builtin_memcpy(&__a, __mem, 16);
+ const auto __b = _mm512_cvtepi8_epi32(__a);
+ __builtin_memcpy(&__a, __mem + 16, 16);
+ const auto __c = _mm512_cvtepi8_epi32(__a);
+ if constexpr (size<_Tp> <= 48)
+ {
+ __builtin_memcpy(&__a, __mem + 32, size<_Tp> - 32);
+ const auto __d = _mm512_cvtepi8_epi32(__a);
+ return _mm512_test_epi32_mask(__b, __b)
+ | (_mm512_test_epi32_mask(__c, __c) << 16)
+ | (_ULLong(_mm512_test_epi32_mask(__d, __d)) << 32);
+ }
+ else
+ {
+ __builtin_memcpy(&__a, __mem + 32, 16);
+ const auto __d = _mm512_cvtepi8_epi32(__a);
+ __builtin_memcpy(&__a, __mem + 48, size<_Tp> - 48);
+ const auto __e = _mm512_cvtepi8_epi32(__a);
+ return _mm512_test_epi32_mask(__b, __b)
+ | (_mm512_test_epi32_mask(__c, __c) << 16)
+ | (_ULLong(_mm512_test_epi32_mask(__d, __d)) << 32)
+ | (_ULLong(_mm512_test_epi32_mask(__e, __e)) << 48);
+ }
+ }
+ else
+ __assert_unreachable<_Flags>();
+ }
+ else if constexpr (sizeof(_Tp) == 8 && size<_Tp> == 2)
+ return __vector_bitcast<_Tp>(
+ __vector_type16_t<int>{-int(__bool_mem[0]), -int(__bool_mem[0]),
+ -int(__bool_mem[1]), -int(__bool_mem[1])});
+ else if constexpr (sizeof(_Tp) == 8 && size<_Tp> <= 4 && __have_avx)
+ {
+ int __bool4 = 0;
+ __builtin_memcpy(&__bool4, __mem, size<_Tp>);
+ const auto __k
+ = __to_intrin((__vector_broadcast<4>(__bool4)
+ & __make_vector<int>(0x1, 0x100, 0x10000,
+ size<_Tp> == 4 ? 0x1000000 : 0))
+ != 0);
+ return __vector_bitcast<_Tp>(
+ __concat(_mm_unpacklo_epi32(__k, __k), _mm_unpackhi_epi32(__k, __k)));
+ }
+ else if constexpr (sizeof(_Tp) == 4 && size<_Tp> <= 4)
+ {
+ int __bools = 0;
+ __builtin_memcpy(&__bools, __mem, size<_Tp>);
+ if constexpr (__have_sse2)
+ {
+ __m128i __k = _mm_cvtsi32_si128(__bools);
+ __k = _mm_cmpgt_epi16(_mm_unpacklo_epi8(__k, __k), __m128i());
+ return __vector_bitcast<_Tp, size<_Tp>>(
+ _mm_unpacklo_epi16(__k, __k));
+ }
+ else
+ {
+ __m128 __k = _mm_cvtpi8_ps(_mm_cvtsi32_si64(__bools));
+ _mm_empty();
+ return __vector_bitcast<_Tp, size<_Tp>>(
+ _mm_cmpgt_ps(__k, __m128()));
+ }
+ }
+ else if constexpr (sizeof(_Tp) == 4 && size<_Tp> <= 8)
+ {
+ __m128i __k = {};
+ __builtin_memcpy(&__k, __mem, size<_Tp>);
+ __k = _mm_cmpgt_epi16(_mm_unpacklo_epi8(__k, __k), __m128i());
+ return __vector_bitcast<_Tp>(
+ __concat(_mm_unpacklo_epi16(__k, __k), _mm_unpackhi_epi16(__k, __k)));
+ }
+ else if constexpr (sizeof(_Tp) == 2 && size<_Tp> <= 16)
+ {
+ __m128i __k = {};
+ __builtin_memcpy(&__k, __mem, size<_Tp>);
+ __k = _mm_cmpgt_epi8(__k, __m128i());
+ if constexpr (size<_Tp> <= 8)
+ return __vector_bitcast<_Tp, size<_Tp>>(_mm_unpacklo_epi8(__k, __k));
+ else
+ return __concat(_mm_unpacklo_epi8(__k, __k),
+ _mm_unpackhi_epi8(__k, __k));
+ }
+ else
+ return _Base::template __load<_Tp, _Flags>(__bool_mem);
+ }
+
+ // }}}
+ // __from_bitmask{{{
+ template <size_t _Np, typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static _MaskMember<_Tp>
+ __from_bitmask(_SanitizedBitMask<_Np> __bits, _TypeTag<_Tp>)
+ {
+ if constexpr (__is_avx512_abi<_Abi>())
+ return __bits._M_to_bits();
+ else
+ return __to_maskvector<_Tp, size<_Tp>>(__bits);
+ }
+
+ // }}}
+ // __masked_load {{{2
+ template <typename _Tp, size_t _Np, typename _Fp>
+ static inline _SimdWrapper<_Tp, _Np>
+ __masked_load(_SimdWrapper<_Tp, _Np> __merge, _SimdWrapper<_Tp, _Np> __mask,
+ const bool* __mem, _Fp) noexcept
+ {
+ if constexpr (__is_avx512_abi<_Abi>())
+ {
+ if constexpr (__have_avx512bw_vl)
+ {
+ if constexpr (_Np <= 16)
+ {
+ const auto __a = _mm_mask_loadu_epi8(__m128i(), __mask, __mem);
+ return (__merge & ~__mask) | _mm_test_epi8_mask(__a, __a);
+ }
+ else if constexpr (_Np <= 32)
+ {
+ const auto __a
+ = _mm256_mask_loadu_epi8(__m256i(), __mask, __mem);
+ return (__merge & ~__mask) | _mm256_test_epi8_mask(__a, __a);
+ }
+ else if constexpr (_Np <= 64)
+ {
+ const auto __a
+ = _mm512_mask_loadu_epi8(__m512i(), __mask, __mem);
+ return (__merge & ~__mask) | _mm512_test_epi8_mask(__a, __a);
+ }
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ {
+ _BitOps::__bit_iteration(__mask, [&](auto __i) {
+ __merge.__set(__i, __mem[__i]);
+ });
+ return __merge;
+ }
+ }
+ else if constexpr (__have_avx512bw_vl && _Np == 32 && sizeof(_Tp) == 1)
+ {
+ const auto __k = __to_bits(__mask)._M_to_bits();
+ __merge
+ = _mm256_mask_sub_epi8(__to_intrin(__merge), __k, __m256i(),
+ _mm256_mask_loadu_epi8(__m256i(), __k, __mem));
+ }
+ else if constexpr (__have_avx512bw_vl && _Np == 16 && sizeof(_Tp) == 1)
+ {
+ const auto __k = __to_bits(__mask)._M_to_bits();
+ __merge
+ = _mm_mask_sub_epi8(__vector_bitcast<_LLong>(__merge), __k, __m128i(),
+ _mm_mask_loadu_epi8(__m128i(), __k, __mem));
+ }
+ else if constexpr (__have_avx512bw_vl && _Np == 16 && sizeof(_Tp) == 2)
+ {
+ const auto __k = __to_bits(__mask)._M_to_bits();
+ __merge = _mm256_mask_sub_epi16(
+ __vector_bitcast<_LLong>(__merge), __k, __m256i(),
+ _mm256_cvtepi8_epi16(_mm_mask_loadu_epi8(__m128i(), __k, __mem)));
+ }
+ else if constexpr (__have_avx512bw_vl && _Np == 8 && sizeof(_Tp) == 2)
+ {
+ const auto __k = __to_bits(__mask)._M_to_bits();
+ __merge = _mm_mask_sub_epi16(
+ __vector_bitcast<_LLong>(__merge), __k, __m128i(),
+ _mm_cvtepi8_epi16(_mm_mask_loadu_epi8(__m128i(), __k, __mem)));
+ }
+ else if constexpr (__have_avx512bw_vl && _Np == 8 && sizeof(_Tp) == 4)
+ {
+ const auto __k = __to_bits(__mask)._M_to_bits();
+ __merge = __vector_bitcast<_Tp>(_mm256_mask_sub_epi32(
+ __vector_bitcast<_LLong>(__merge), __k, __m256i(),
+ _mm256_cvtepi8_epi32(_mm_mask_loadu_epi8(__m128i(), __k, __mem))));
+ }
+ else if constexpr (__have_avx512bw_vl && _Np == 4 && sizeof(_Tp) == 4)
+ {
+ const auto __k = __to_bits(__mask)._M_to_bits();
+ __merge = __vector_bitcast<_Tp>(_mm_mask_sub_epi32(
+ __vector_bitcast<_LLong>(__merge), __k, __m128i(),
+ _mm_cvtepi8_epi32(_mm_mask_loadu_epi8(__m128i(), __k, __mem))));
+ }
+ else if constexpr (__have_avx512bw_vl && _Np == 4 && sizeof(_Tp) == 8)
+ {
+ const auto __k = __to_bits(__mask)._M_to_bits();
+ __merge = __vector_bitcast<_Tp>(_mm256_mask_sub_epi64(
+ __vector_bitcast<_LLong>(__merge), __k, __m256i(),
+ _mm256_cvtepi8_epi64(_mm_mask_loadu_epi8(__m128i(), __k, __mem))));
+ }
+ else if constexpr (__have_avx512bw_vl && _Np == 2 && sizeof(_Tp) == 8)
+ {
+ const auto __k = __to_bits(__mask)._M_to_bits();
+ __merge = __vector_bitcast<_Tp>(_mm_mask_sub_epi64(
+ __vector_bitcast<_LLong>(__merge), __k, __m128i(),
+ _mm_cvtepi8_epi64(_mm_mask_loadu_epi8(__m128i(), __k, __mem))));
+ }
+ else
+ {
+ return _Base::__masked_load(__merge, __mask, __mem, _Fp{});
+ }
+ return __merge;
+ }
+
+ // __store {{{2
+ template <typename _Tp, size_t _Np, typename _Fp>
+ _GLIBCXX_SIMD_INTRINSIC static void __store(_SimdWrapper<_Tp, _Np> __v,
+ bool* __mem, _Fp) noexcept
+ {
+ if constexpr (__is_avx512_abi<_Abi>())
+ {
+ if constexpr (__have_avx512bw_vl)
+ _CommonImplX86::__store<_Np>(
+ __vector_bitcast<char>([](auto __data) {
+ if constexpr (_Np <= 16)
+ return _mm_maskz_set1_epi8(__data, 1);
+ else if constexpr (_Np <= 32)
+ return _mm256_maskz_set1_epi8(__data, 1);
+ else
+ return _mm512_maskz_set1_epi8(__data, 1);
+ }(__v._M_data)),
+ __mem, _Fp());
+ else if constexpr (_Np <= 8)
+ _CommonImplX86::__store<_Np>(
+ __vector_bitcast<char>(
+#if defined __x86_64__
+ __make_wrapper<_ULLong>(
+ _pdep_u64(__v._M_data, 0x0101010101010101ULL), 0ull)
+#else
+ __make_wrapper<_UInt>(_pdep_u32(__v._M_data, 0x01010101U),
+ _pdep_u32(__v._M_data >> 4, 0x01010101U))
+#endif
+ ),
+ __mem, _Fp());
+ else if constexpr (_Np <= 16)
+ _mm512_mask_cvtepi32_storeu_epi8(__mem, 0xffffu >> (16 - _Np),
+ _mm512_maskz_set1_epi32(__v._M_data,
+ 1));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__is_sse_abi<_Abi>()) //{{{
+ {
+ if constexpr (_Np == 2 && sizeof(_Tp) == 8)
+ {
+ const auto __k = __vector_bitcast<int>(__v);
+ __mem[0] = -__k[1];
+ __mem[1] = -__k[3];
+ }
+ else if constexpr (_Np <= 4 && sizeof(_Tp) == 4)
+ {
+ if constexpr (__have_sse2)
+ {
+ const unsigned __bool4
+ = __vector_bitcast<_UInt>(
+ _mm_packs_epi16(_mm_packs_epi32(__intrin_bitcast<__m128i>(
+ __to_intrin(__v)),
+ __m128i()),
+ __m128i()))[0]
+ & 0x01010101u;
+ __builtin_memcpy(__mem, &__bool4, _Np);
+ }
+ else if constexpr (__have_mmx)
+ {
+ const __m64 __k
+ = _mm_cvtps_pi8(__and(__to_intrin(__v), _mm_set1_ps(1.f)));
+ __builtin_memcpy(__mem, &__k, _Np);
+ _mm_empty();
+ }
+ else
+ return _Base::__store(__v, __mem, _Fp());
+ }
+ else if constexpr (_Np <= 8 && sizeof(_Tp) == 2)
+ {
+ _CommonImplX86::__store<_Np>(
+ __vector_bitcast<char>(_mm_packs_epi16(
+ __to_intrin(__vector_bitcast<_UShort>(__v) >> 15), __m128i())),
+ __mem, _Fp());
+ }
+ else if constexpr (_Np <= 16 && sizeof(_Tp) == 1)
+ _CommonImplX86::__store<_Np>(__v._M_data & 1, __mem, _Fp());
+ else
+ __assert_unreachable<_Tp>();
+ } // }}}
+ else if constexpr (__is_avx_abi<_Abi>()) // {{{
+ {
+ if constexpr (_Np <= 4 && sizeof(_Tp) == 8)
+ {
+ auto __k = __intrin_bitcast<__m256i>(__to_intrin(__v));
+ int __bool4;
+ if constexpr (__have_avx2)
+ __bool4 = _mm256_movemask_epi8(__k);
+ else
+ __bool4 = (_mm_movemask_epi8(__lo128(__k))
+ | (_mm_movemask_epi8(__hi128(__k)) << 16));
+ __bool4 &= 0x01010101;
+ __builtin_memcpy(__mem, &__bool4, _Np);
+ }
+ else if constexpr (_Np <= 8 && sizeof(_Tp) == 4)
+ {
+ const auto __k = __intrin_bitcast<__m256i>(__to_intrin(__v));
+ const auto __k2
+ = _mm_srli_epi16(_mm_packs_epi16(__lo128(__k), __hi128(__k)), 15);
+ const auto __k3
+ = __vector_bitcast<char>(_mm_packs_epi16(__k2, __m128i()));
+ _CommonImplX86::__store<_Np>(__k3, __mem, _Fp());
+ }
+ else if constexpr (_Np <= 16 && sizeof(_Tp) == 2)
+ {
+ if constexpr (__have_avx2)
+ {
+ const auto __x = _mm256_srli_epi16(__to_intrin(__v), 15);
+ const auto __bools = __vector_bitcast<char>(
+ _mm_packs_epi16(__lo128(__x), __hi128(__x)));
+ _CommonImplX86::__store<_Np>(__bools, __mem, _Fp());
+ }
+ else
+ {
+ const auto __bools
+ = 1
+ & __vector_bitcast<_UChar>(
+ _mm_packs_epi16(__lo128(__to_intrin(__v)),
+ __hi128(__to_intrin(__v))));
+ _CommonImplX86::__store<_Np>(__bools, __mem, _Fp());
+ }
+ }
+ else if constexpr (_Np <= 32 && sizeof(_Tp) == 1)
+ _CommonImplX86::__store<_Np>(1 & __v._M_data, __mem, _Fp());
+ else
+ __assert_unreachable<_Tp>();
+ } // }}}
+ else
+ __assert_unreachable<_Tp>();
+ }
+
+ // __masked_store {{{2
+ template <typename _Tp, size_t _Np, typename _Fp>
+ static inline void __masked_store(const _SimdWrapper<_Tp, _Np> __v,
+ bool* __mem, _Fp,
+ const _SimdWrapper<_Tp, _Np> __k) noexcept
+ {
+ if constexpr (__is_avx512_abi<_Abi>())
+ {
+ static_assert(is_same_v<_Tp, bool>);
+ if constexpr (_Np <= 16 && __have_avx512bw_vl)
+ _mm_mask_storeu_epi8(__mem, __k, _mm_maskz_set1_epi8(__v, 1));
+ else if constexpr (_Np <= 16)
+ _mm512_mask_cvtepi32_storeu_epi8(__mem, __k,
+ _mm512_maskz_set1_epi32(__v, 1));
+ else if constexpr (_Np <= 32 && __have_avx512bw_vl)
+ _mm256_mask_storeu_epi8(__mem, __k, _mm256_maskz_set1_epi8(__v, 1));
+ else if constexpr (_Np <= 32 && __have_avx512bw)
+ _mm256_mask_storeu_epi8(__mem, __k,
+ __lo256(_mm512_maskz_set1_epi8(__v, 1)));
+ else if constexpr (_Np <= 64 && __have_avx512bw)
+ _mm512_mask_storeu_epi8(__mem, __k, _mm512_maskz_set1_epi8(__v, 1));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ _Base::__masked_store(__v, __mem, _Fp(), __k);
+ }
+
+ // logical and bitwise operators {{{2
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __logical_and(const _SimdWrapper<_Tp, _Np>& __x,
+ const _SimdWrapper<_Tp, _Np>& __y)
+ {
+ if constexpr (std::is_same_v<_Tp, bool>)
+ {
+ if constexpr (__have_avx512dq && _Np <= 8)
+ return _kand_mask8(__x._M_data, __y._M_data);
+ else if constexpr (_Np <= 16)
+ return _kand_mask16(__x._M_data, __y._M_data);
+ else if constexpr (__have_avx512bw && _Np <= 32)
+ return _kand_mask32(__x._M_data, __y._M_data);
+ else if constexpr (__have_avx512bw && _Np <= 64)
+ return _kand_mask64(__x._M_data, __y._M_data);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return _Base::__logical_and(__x, __y);
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __logical_or(const _SimdWrapper<_Tp, _Np>& __x,
+ const _SimdWrapper<_Tp, _Np>& __y)
+ {
+ if constexpr (std::is_same_v<_Tp, bool>)
+ {
+ if constexpr (__have_avx512dq && _Np <= 8)
+ return _kor_mask8(__x._M_data, __y._M_data);
+ else if constexpr (_Np <= 16)
+ return _kor_mask16(__x._M_data, __y._M_data);
+ else if constexpr (__have_avx512bw && _Np <= 32)
+ return _kor_mask32(__x._M_data, __y._M_data);
+ else if constexpr (__have_avx512bw && _Np <= 64)
+ return _kor_mask64(__x._M_data, __y._M_data);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return _Base::__logical_or(__x, __y);
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __bit_not(const _SimdWrapper<_Tp, _Np>& __x)
+ {
+ if constexpr (std::is_same_v<_Tp, bool>)
+ {
+ if constexpr (__have_avx512dq && _Np <= 8)
+ return _kandn_mask8(__x._M_data,
+ _Abi::template __implicit_mask_n<_Np>());
+ else if constexpr (_Np <= 16)
+ return _kandn_mask16(__x._M_data,
+ _Abi::template __implicit_mask_n<_Np>());
+ else if constexpr (__have_avx512bw && _Np <= 32)
+ return _kandn_mask32(__x._M_data,
+ _Abi::template __implicit_mask_n<_Np>());
+ else if constexpr (__have_avx512bw && _Np <= 64)
+ return _kandn_mask64(__x._M_data,
+ _Abi::template __implicit_mask_n<_Np>());
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return _Base::__bit_not(__x);
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __bit_and(const _SimdWrapper<_Tp, _Np>& __x,
+ const _SimdWrapper<_Tp, _Np>& __y)
+ {
+ if constexpr (std::is_same_v<_Tp, bool>)
+ {
+ if constexpr (__have_avx512dq && _Np <= 8)
+ return _kand_mask8(__x._M_data, __y._M_data);
+ else if constexpr (_Np <= 16)
+ return _kand_mask16(__x._M_data, __y._M_data);
+ else if constexpr (__have_avx512bw && _Np <= 32)
+ return _kand_mask32(__x._M_data, __y._M_data);
+ else if constexpr (__have_avx512bw && _Np <= 64)
+ return _kand_mask64(__x._M_data, __y._M_data);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return _Base::__bit_and(__x, __y);
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __bit_or(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
+ {
+ if constexpr (std::is_same_v<_Tp, bool>)
+ {
+ if constexpr (__have_avx512dq && _Np <= 8)
+ return _kor_mask8(__x._M_data, __y._M_data);
+ else if constexpr (_Np <= 16)
+ return _kor_mask16(__x._M_data, __y._M_data);
+ else if constexpr (__have_avx512bw && _Np <= 32)
+ return _kor_mask32(__x._M_data, __y._M_data);
+ else if constexpr (__have_avx512bw && _Np <= 64)
+ return _kor_mask64(__x._M_data, __y._M_data);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return _Base::__bit_or(__x, __y);
+ }
+
+ template <typename _Tp, size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
+ __bit_xor(const _SimdWrapper<_Tp, _Np>& __x,
+ const _SimdWrapper<_Tp, _Np>& __y)
+ {
+ if constexpr (std::is_same_v<_Tp, bool>)
+ {
+ if constexpr (__have_avx512dq && _Np <= 8)
+ return _kxor_mask8(__x._M_data, __y._M_data);
+ else if constexpr (_Np <= 16)
+ return _kxor_mask16(__x._M_data, __y._M_data);
+ else if constexpr (__have_avx512bw && _Np <= 32)
+ return _kxor_mask32(__x._M_data, __y._M_data);
+ else if constexpr (__have_avx512bw && _Np <= 64)
+ return _kxor_mask64(__x._M_data, __y._M_data);
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return _Base::__bit_xor(__x, __y);
+ }
+
+ //}}}2
+ // __masked_assign{{{
+ template <size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_assign(_SimdWrapper<bool, _Np> __k, _SimdWrapper<bool, _Np>& __lhs,
+ _SimdWrapper<bool, _Np> __rhs)
+ {
+ __lhs._M_data
+ = (~__k._M_data & __lhs._M_data) | (__k._M_data & __rhs._M_data);
+ }
+
+ template <size_t _Np>
+ _GLIBCXX_SIMD_INTRINSIC static void
+ __masked_assign(_SimdWrapper<bool, _Np> __k, _SimdWrapper<bool, _Np>& __lhs,
+ bool __rhs)
+ {
+ if (__rhs)
+ __lhs._M_data = __k._M_data | __lhs._M_data;
+ else
+ __lhs._M_data = ~__k._M_data & __lhs._M_data;
+ }
+
+ using _MaskImplBuiltin<_Abi>::__masked_assign;
+
+ //}}}
+ // __all_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __all_of(simd_mask<_Tp, _Abi> __k)
+ {
+ if constexpr (__is_sse_abi<_Abi>() || __is_avx_abi<_Abi>())
+ {
+ constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+ if constexpr (__have_sse4_1)
+ return 0
+ != __testc(__as_vector(__k),
+ _Abi::template __implicit_mask<_Tp>());
+ else if constexpr (std::is_same_v<_Tp, float>)
+ return (_mm_movemask_ps(__to_intrin(__k._M_data)) & ((1 << _Np) - 1))
+ == (1 << _Np) - 1;
+ else if constexpr (std::is_same_v<_Tp, double>)
+ return (_mm_movemask_pd(__to_intrin(__k._M_data)) & ((1 << _Np) - 1))
+ == (1 << _Np) - 1;
+ else
+ return (_mm_movemask_epi8(__to_intrin(__k._M_data))
+ & ((1 << (_Np * sizeof(_Tp))) - 1))
+ == (1 << (_Np * sizeof(_Tp))) - 1;
+ }
+ else if constexpr (__is_avx512_abi<_Abi>())
+ {
+ constexpr auto _Mask = _Abi::template __implicit_mask<_Tp>();
+ const auto __kk = __k._M_data._M_data;
+ if constexpr (sizeof(__kk) == 1)
+ {
+ if constexpr (__have_avx512dq)
+ return _kortestc_mask8_u8(__kk, _Mask == 0xff ? __kk
+ : __mmask8(~_Mask));
+ else
+ return _kortestc_mask16_u8(__kk, __mmask16(~_Mask));
+ }
+ else if constexpr (sizeof(__kk) == 2)
+ return _kortestc_mask16_u8(__kk, _Mask == 0xffff ? __kk
+ : __mmask16(~_Mask));
+ else if constexpr (sizeof(__kk) == 4 && __have_avx512bw)
+ return _kortestc_mask32_u8(__kk, _Mask == 0xffffffffU
+ ? __kk
+ : __mmask32(~_Mask));
+ else if constexpr (sizeof(__kk) == 8 && __have_avx512bw)
+ return _kortestc_mask64_u8(__kk, _Mask == 0xffffffffffffffffULL
+ ? __kk
+ : __mmask64(~_Mask));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ }
+
+ // }}}
+ // __any_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __any_of(simd_mask<_Tp, _Abi> __k)
+ {
+ if constexpr (__is_sse_abi<_Abi>() || __is_avx_abi<_Abi>())
+ {
+ constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+ if constexpr (__have_sse4_1)
+ {
+ if constexpr (_Abi::_S_is_partial || sizeof(__k) < 16)
+ return 0
+ == __testz(__as_vector(__k),
+ _Abi::template __implicit_mask<_Tp>());
+ else
+ return 0 == __testz(__as_vector(__k), __as_vector(__k));
+ }
+ else if constexpr (std::is_same_v<_Tp, float>)
+ return (_mm_movemask_ps(__to_intrin(__k._M_data)) & ((1 << _Np) - 1))
+ != 0;
+ else if constexpr (std::is_same_v<_Tp, double>)
+ return (_mm_movemask_pd(__to_intrin(__k._M_data)) & ((1 << _Np) - 1))
+ != 0;
+ else
+ return (_mm_movemask_epi8(__to_intrin(__k._M_data))
+ & ((1 << (_Np * sizeof(_Tp))) - 1))
+ != 0;
+ }
+ else if constexpr (__is_avx512_abi<_Abi>())
+ return (__k._M_data._M_data & _Abi::template __implicit_mask<_Tp>()) != 0;
+ }
+
+ // }}}
+ // __none_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __none_of(simd_mask<_Tp, _Abi> __k)
+ {
+ if constexpr (__is_sse_abi<_Abi>() || __is_avx_abi<_Abi>())
+ {
+ constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+ if constexpr (__have_sse4_1)
+ {
+ if constexpr (_Abi::_S_is_partial || sizeof(__k) < 16)
+ return 0
+ != __testz(__as_vector(__k),
+ _Abi::template __implicit_mask<_Tp>());
+ else
+ return 0 != __testz(__as_vector(__k), __as_vector(__k));
+ }
+ else if constexpr (std::is_same_v<_Tp, float>)
+ return (__movemask(__to_intrin(__k._M_data)) & ((1 << _Np) - 1)) == 0;
+ else if constexpr (std::is_same_v<_Tp, double>)
+ return (__movemask(__to_intrin(__k._M_data)) & ((1 << _Np) - 1)) == 0;
+ else
+ return (__movemask(__to_intrin(__k._M_data))
+ & int((1ull << (_Np * sizeof(_Tp))) - 1))
+ == 0;
+ }
+ else if constexpr (__is_avx512_abi<_Abi>())
+ return (__k._M_data._M_data & _Abi::template __implicit_mask<_Tp>()) == 0;
+ }
+
+ // }}}
+ // __some_of {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static bool __some_of(simd_mask<_Tp, _Abi> __k)
+ {
+ if constexpr (__is_sse_abi<_Abi>() || __is_avx_abi<_Abi>())
+ {
+ constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+ if constexpr (__have_sse4_1)
+ return 0
+ != __testnzc(__as_vector(__k),
+ _Abi::template __implicit_mask<_Tp>());
+ else if constexpr (std::is_same_v<_Tp, float>)
+ {
+ constexpr int __allbits = (1 << _Np) - 1;
+ const auto __tmp
+ = _mm_movemask_ps(__to_intrin(__k._M_data)) & __allbits;
+ return __tmp > 0 && __tmp < __allbits;
+ }
+ else if constexpr (std::is_same_v<_Tp, double>)
+ {
+ constexpr int __allbits = (1 << _Np) - 1;
+ const auto __tmp
+ = _mm_movemask_pd(__to_intrin(__k._M_data)) & __allbits;
+ return __tmp > 0 && __tmp < __allbits;
+ }
+ else
+ {
+ constexpr int __allbits = (1 << (_Np * sizeof(_Tp))) - 1;
+ const auto __tmp
+ = _mm_movemask_epi8(__to_intrin(__k._M_data)) & __allbits;
+ return __tmp > 0 && __tmp < __allbits;
+ }
+ }
+ else if constexpr (__is_avx512_abi<_Abi>())
+ return __any_of(__k) && !__all_of(__k);
+ else
+ __assert_unreachable<_Tp>();
+ }
+
+ // }}}
+ // __popcount {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static int __popcount(simd_mask<_Tp, _Abi> __k)
+ {
+ constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
+ const auto __kk = _Abi::__masked(__k._M_data)._M_data;
+ if constexpr (__is_avx512_abi<_Abi>())
+ {
+ if constexpr (_Np > 32)
+ return __builtin_popcountll(__kk);
+ else
+ return __builtin_popcount(__kk);
+ }
+ else
+ {
+ if constexpr (__have_popcnt)
+ {
+ int __bits = __movemask(__to_intrin(__vector_bitcast<_Tp>(__kk)));
+ const int __count = __builtin_popcount(__bits);
+ return std::is_integral_v<_Tp> ? __count / sizeof(_Tp) : __count;
+ }
+ else if constexpr (_Np == 2 && sizeof(_Tp) == 8)
+ {
+ const int __mask = _mm_movemask_pd(__auto_bitcast(__kk));
+ return __mask - (__mask >> 1);
+ }
+ else if constexpr (_Np <= 4 && sizeof(_Tp) == 8)
+ {
+ auto __x = -(__lo128(__kk) + __hi128(__kk));
+ return __x[0] + __x[1];
+ }
+ else if constexpr (_Np <= 4 && sizeof(_Tp) == 4)
+ {
+ if constexpr (__have_sse2)
+ {
+ __m128i __x = __intrin_bitcast<__m128i>(__to_intrin(__kk));
+ __x = _mm_add_epi32(__x,
+ _mm_shuffle_epi32(__x,
+ _MM_SHUFFLE(0, 1, 2, 3)));
+ __x = _mm_add_epi32(
+ __x, _mm_shufflelo_epi16(__x, _MM_SHUFFLE(1, 0, 3, 2)));
+ return -_mm_cvtsi128_si32(__x);
+ }
+ else
+ return __builtin_popcount(_mm_movemask_ps(__auto_bitcast(__kk)));
+ }
+ else if constexpr (_Np <= 8 && sizeof(_Tp) == 2)
+ {
+ auto __x = __to_intrin(__kk);
+ __x
+ = _mm_add_epi16(__x,
+ _mm_shuffle_epi32(__x, _MM_SHUFFLE(0, 1, 2, 3)));
+ __x = _mm_add_epi16(__x,
+ _mm_shufflelo_epi16(__x,
+ _MM_SHUFFLE(0, 1, 2, 3)));
+ __x = _mm_add_epi16(__x,
+ _mm_shufflelo_epi16(__x,
+ _MM_SHUFFLE(2, 3, 0, 1)));
+ return -short(_mm_extract_epi16(__x, 0));
+ }
+ else if constexpr (_Np <= 16 && sizeof(_Tp) == 1)
+ {
+ auto __x = __to_intrin(__kk);
+ __x = _mm_add_epi8(__x,
+ _mm_shuffle_epi32(__x, _MM_SHUFFLE(0, 1, 2, 3)));
+ __x
+ = _mm_add_epi8(__x,
+ _mm_shufflelo_epi16(__x, _MM_SHUFFLE(0, 1, 2, 3)));
+ __x
+ = _mm_add_epi8(__x,
+ _mm_shufflelo_epi16(__x, _MM_SHUFFLE(2, 3, 0, 1)));
+ auto __y = -__vector_bitcast<_UChar>(__x);
+ if constexpr (__have_sse4_1)
+ return __y[0] + __y[1];
+ else
+ {
+ unsigned __z = _mm_extract_epi16(__to_intrin(__y), 0);
+ return (__z & 0xff) + (__z >> 8);
+ }
+ }
+ else if constexpr (sizeof(__kk) == 32)
+ {
+ // The following works only as long as the implementations above use
+ // a summation
+ using _I = __int_for_sizeof_t<_Tp>;
+ const auto __as_int = __vector_bitcast<_I>(__kk);
+ return _MaskImplX86<simd_abi::__sse>::__popcount(
+ simd_mask<_I, simd_abi::__sse>(__private_init,
+ __lo128(__as_int)
+ + __hi128(__as_int)));
+ }
+ else
+ __assert_unreachable<_Tp>();
+ }
+ }
+
+ // }}}
+ // __find_first_set {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static int __find_first_set(simd_mask<_Tp, _Abi> __k)
+ {
+ if constexpr (__is_avx512_abi<_Abi>())
+ if constexpr (size<_Tp> <= 32)
+ return _tzcnt_u32(__k._M_data._M_data);
+ else
+ return _BitOps::__firstbit(__k._M_data._M_data);
+ else
+ return _Base::__find_first_set(__k);
+ }
+
+ // }}}
+ // __find_last_set {{{
+ template <typename _Tp>
+ _GLIBCXX_SIMD_INTRINSIC static int __find_last_set(simd_mask<_Tp, _Abi> __k)
+ {
+ if constexpr (__is_avx512_abi<_Abi>())
+ if constexpr (size<_Tp> <= 32)
+ return 31 - _lzcnt_u32(__k._M_data._M_data);
+ else
+ return _BitOps::__lastbit(__k._M_data._M_data);
+ else
+ return _Base::__find_last_set(__k);
+ }
+
+ // }}}
+};
+
+// }}}
+
+_GLIBCXX_SIMD_END_NAMESPACE
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_X86_H_
+
+// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
diff --git a/libstdc++-v3/include/experimental/bits/simd_x86_conversions.h b/libstdc++-v3/include/experimental/bits/simd_x86_conversions.h
new file mode 100644
index 00000000000..f72d7809680
--- /dev/null
+++ b/libstdc++-v3/include/experimental/bits/simd_x86_conversions.h
@@ -0,0 +1,1993 @@
+// x86 specific conversion optimizations -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+// <http://www.gnu.org/licenses/>.
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD_X86_CONVERSIONS_H
+#define _GLIBCXX_EXPERIMENTAL_SIMD_X86_CONVERSIONS_H
+
+#if __cplusplus >= 201703L
+
+// work around PR85827
+// 1-arg __convert_x86 {{{1
+template <typename _To, typename _V, typename _Traits>
+_GLIBCXX_SIMD_INTRINSIC _To
+__convert_x86(_V __v)
+{
+ static_assert(__is_vector_type_v<_V>);
+ using _Tp = typename _Traits::value_type;
+ constexpr size_t _Np = _Traits::_S_width;
+ [[maybe_unused]] const auto __intrin = __to_intrin(__v);
+ using _Up = typename _VectorTraits<_To>::value_type;
+ constexpr size_t _M = _VectorTraits<_To>::_S_width;
+
+ // [xyz]_to_[xyz] {{{2
+ [[maybe_unused]] constexpr bool __x_to_x
+ = sizeof(__v) <= 16 && sizeof(_To) <= 16;
+ [[maybe_unused]] constexpr bool __x_to_y
+ = sizeof(__v) <= 16 && sizeof(_To) == 32;
+ [[maybe_unused]] constexpr bool __x_to_z
+ = sizeof(__v) <= 16 && sizeof(_To) == 64;
+ [[maybe_unused]] constexpr bool __y_to_x
+ = sizeof(__v) == 32 && sizeof(_To) <= 16;
+ [[maybe_unused]] constexpr bool __y_to_y
+ = sizeof(__v) == 32 && sizeof(_To) == 32;
+ [[maybe_unused]] constexpr bool __y_to_z
+ = sizeof(__v) == 32 && sizeof(_To) == 64;
+ [[maybe_unused]] constexpr bool __z_to_x
+ = sizeof(__v) == 64 && sizeof(_To) <= 16;
+ [[maybe_unused]] constexpr bool __z_to_y
+ = sizeof(__v) == 64 && sizeof(_To) == 32;
+ [[maybe_unused]] constexpr bool __z_to_z
+ = sizeof(__v) == 64 && sizeof(_To) == 64;
+
+ // iX_to_iX {{{2
+ [[maybe_unused]] constexpr bool __i_to_i
+ = is_integral_v<_Up> && is_integral_v<_Tp>;
+ [[maybe_unused]] constexpr bool __i8_to_i16
+ = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 2;
+ [[maybe_unused]] constexpr bool __i8_to_i32
+ = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __i8_to_i64
+ = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __i16_to_i8
+ = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 1;
+ [[maybe_unused]] constexpr bool __i16_to_i32
+ = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __i16_to_i64
+ = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __i32_to_i8
+ = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 1;
+ [[maybe_unused]] constexpr bool __i32_to_i16
+ = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 2;
+ [[maybe_unused]] constexpr bool __i32_to_i64
+ = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __i64_to_i8
+ = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 1;
+ [[maybe_unused]] constexpr bool __i64_to_i16
+ = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 2;
+ [[maybe_unused]] constexpr bool __i64_to_i32
+ = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 4;
+
+ // [fsu]X_to_[fsu]X {{{2
+ // ibw = integral && byte or word, i.e. char and short with any signedness
+ [[maybe_unused]] constexpr bool __s64_to_f32
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 8
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __s32_to_f32
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __s16_to_f32
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 2
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __s8_to_f32
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 1
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __u64_to_f32
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 8
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __u32_to_f32
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __u16_to_f32
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 2
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __u8_to_f32
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 1
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __s64_to_f64
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 8
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __s32_to_f64
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __u64_to_f64
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 8
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __u32_to_f64
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __f32_to_s64
+ = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 8
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f32_to_s32
+ = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 4
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f32_to_u64
+ = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 8
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f32_to_u32
+ = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 4
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f64_to_s64
+ = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 8
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __f64_to_s32
+ = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 4
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __f64_to_u64
+ = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 8
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __f64_to_u32
+ = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 4
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __ibw_to_f32
+ = is_integral_v<_Tp> && sizeof(_Tp) <= 2
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __ibw_to_f64
+ = is_integral_v<_Tp> && sizeof(_Tp) <= 2
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __f32_to_ibw
+ = is_integral_v<_Up> && sizeof(_Up) <= 2
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f64_to_ibw
+ = is_integral_v<_Up> && sizeof(_Up) <= 2
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __f32_to_f64
+ = is_floating_point_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __f64_to_f32
+ = is_floating_point_v<_Tp> && sizeof(_Tp) == 8
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+
+ if constexpr (__i_to_i && __y_to_x && !__have_avx2) //{{{2
+ return __convert_x86<_To>(__lo128(__v), __hi128(__v));
+ else if constexpr (__i_to_i && __x_to_y && !__have_avx2) //{{{2
+ return __concat(__convert_x86<__vector_type_t<_Up, _M / 2>>(__v),
+ __convert_x86<__vector_type_t<_Up, _M / 2>>(
+ __extract_part<1, _Np / _M * 2>(__v)));
+ else if constexpr (__i_to_i) //{{{2
+ {
+ static_assert(__x_to_x || __have_avx2,
+ "integral conversions with ymm registers require AVX2");
+ static_assert(__have_avx512bw
+ || ((sizeof(_Tp) >= 4 || sizeof(__v) < 64)
+ && (sizeof(_Up) >= 4 || sizeof(_To) < 64)),
+ "8/16-bit integers in zmm registers require AVX512BW");
+ static_assert((sizeof(__v) < 64 && sizeof(_To) < 64) || __have_avx512f,
+ "integral conversions with zmm registers require AVX512F");
+ }
+ if constexpr (is_floating_point_v<_Tp> == is_floating_point_v<_Up> && //{{{2
+ sizeof(_Tp) == sizeof(_Up))
+ {
+ // conversion uses simple bit reinterpretation (or no conversion at all)
+ if constexpr (_Np >= _M)
+ return __intrin_bitcast<_To>(__v);
+ else
+ return __zero_extend(__vector_bitcast<_Up>(__v));
+ }
+ else if constexpr (_Np < _M && sizeof(_To) > 16) // zero extend (e.g. xmm -> ymm){{{2
+ return __zero_extend(
+ __convert_x86<__vector_type_t<
+ _Up, (16 / sizeof(_Up) > _Np) ? 16 / sizeof(_Up) : _Np>>(__v));
+ else if constexpr (_Np > _M && sizeof(__v) > 16) // partial input (e.g. ymm -> xmm){{{2
+ return __convert_x86<_To>(__extract_part<0, _Np / _M>(__v));
+ else if constexpr (__i64_to_i32) //{{{2
+ {
+ if constexpr (__x_to_x && __have_avx512vl)
+ return __intrin_bitcast<_To>(_mm_cvtepi64_epi32(__intrin));
+ else if constexpr (__x_to_x)
+ return __auto_bitcast(
+ _mm_shuffle_ps(__vector_bitcast<float>(__v), __m128(), 8));
+ else if constexpr (__y_to_x && __have_avx512vl)
+ return __intrin_bitcast<_To>(_mm256_cvtepi64_epi32(__intrin));
+ else if constexpr (__y_to_x && __have_avx512f)
+ return __intrin_bitcast<_To>(
+ __lo128(_mm512_cvtepi64_epi32(__auto_bitcast(__v))));
+ else if constexpr (__y_to_x)
+ return __intrin_bitcast<_To>(
+ __lo128(_mm256_permute4x64_epi64(_mm256_shuffle_epi32(__intrin, 8),
+ 0 + 4 * 2)));
+ else if constexpr (__z_to_y)
+ return __intrin_bitcast<_To>(_mm512_cvtepi64_epi32(__intrin));
+ }
+ else if constexpr (__i64_to_i16) //{{{2
+ {
+ if constexpr (__x_to_x && __have_avx512vl)
+ return __intrin_bitcast<_To>(_mm_cvtepi64_epi16(__intrin));
+ else if constexpr (__x_to_x && __have_avx512f)
+ return __intrin_bitcast<_To>(
+ __lo128(_mm512_cvtepi64_epi16(__auto_bitcast(__v))));
+ else if constexpr (__x_to_x && __have_ssse3)
+ {
+ return __intrin_bitcast<_To>(
+ _mm_shuffle_epi8(__intrin,
+ _mm_setr_epi8(0, 1, 8, 9, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80)));
+ // else (x_to_x without SSSE3): use fallback
+ }
+ else if constexpr (__y_to_x && __have_avx512vl)
+ return __intrin_bitcast<_To>(_mm256_cvtepi64_epi16(__intrin));
+ else if constexpr (__y_to_x && __have_avx512f)
+ return __intrin_bitcast<_To>(
+ __lo128(_mm512_cvtepi64_epi16(__auto_bitcast(__v))));
+ else if constexpr (__y_to_x)
+ {
+ const auto __a = _mm256_shuffle_epi8(
+ __intrin,
+ _mm256_setr_epi8(0, 1, 8, 9, -0x80, -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, 0, 1, 8, 9, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80, -0x80, -0x80));
+ return __intrin_bitcast<_To>(__lo128(__a) | __hi128(__a));
+ }
+ else if constexpr (__z_to_x)
+ return __intrin_bitcast<_To>(_mm512_cvtepi64_epi16(__intrin));
+ }
+ else if constexpr (__i64_to_i8) //{{{2
+ {
+ if constexpr (__x_to_x && __have_avx512vl)
+ return __intrin_bitcast<_To>(_mm_cvtepi64_epi8(__intrin));
+ else if constexpr (__x_to_x && __have_avx512f)
+ return __intrin_bitcast<_To>(
+ __lo128(_mm512_cvtepi64_epi8(__zero_extend(__intrin))));
+ else if constexpr (__y_to_x && __have_avx512vl)
+ return __intrin_bitcast<_To>(_mm256_cvtepi64_epi8(__intrin));
+ else if constexpr (__y_to_x && __have_avx512f)
+ return __intrin_bitcast<_To>(
+ _mm512_cvtepi64_epi8(__zero_extend(__intrin)));
+ else if constexpr (__z_to_x)
+ return __intrin_bitcast<_To>(_mm512_cvtepi64_epi8(__intrin));
+ }
+ else if constexpr (__i32_to_i64) //{{{2
+ {
+ if constexpr (__have_sse4_1 && __x_to_x)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm_cvtepi32_epi64(__intrin)
+ : _mm_cvtepu32_epi64(__intrin));
+ else if constexpr (__x_to_x)
+ {
+ return __intrin_bitcast<_To>(
+ _mm_unpacklo_epi32(__intrin, is_signed_v<_Tp>
+ ? _mm_srai_epi32(__intrin, 31)
+ : __m128i()));
+ }
+ else if constexpr (__x_to_y)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm256_cvtepi32_epi64(__intrin)
+ : _mm256_cvtepu32_epi64(__intrin));
+ else if constexpr (__y_to_z)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm512_cvtepi32_epi64(__intrin)
+ : _mm512_cvtepu32_epi64(__intrin));
+ }
+ else if constexpr (__i32_to_i16) //{{{2
+ {
+ if constexpr (__x_to_x && __have_avx512vl)
+ return __intrin_bitcast<_To>(_mm_cvtepi32_epi16(__intrin));
+ else if constexpr (__x_to_x && __have_avx512f)
+ return __intrin_bitcast<_To>(
+ __lo128(_mm512_cvtepi32_epi16(__auto_bitcast(__v))));
+ else if constexpr (__x_to_x && __have_ssse3)
+ return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+ __intrin, _mm_setr_epi8(0, 1, 4, 5, 8, 9, 12, 13, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80)));
+ else if constexpr (__x_to_x)
+ {
+ auto __a = _mm_unpacklo_epi16(__intrin, __m128i()); // 0o.o 1o.o
+ auto __b = _mm_unpackhi_epi16(__intrin, __m128i()); // 2o.o 3o.o
+ auto __c = _mm_unpacklo_epi16(__a, __b); // 02oo ..oo
+ auto __d = _mm_unpackhi_epi16(__a, __b); // 13oo ..oo
+ return __intrin_bitcast<_To>(
+ _mm_unpacklo_epi16(__c, __d)); // 0123 oooo
+ }
+ else if constexpr (__y_to_x && __have_avx512vl)
+ return __intrin_bitcast<_To>(_mm256_cvtepi32_epi16(__intrin));
+ else if constexpr (__y_to_x && __have_avx512f)
+ return __intrin_bitcast<_To>(
+ __lo128(_mm512_cvtepi32_epi16(__auto_bitcast(__v))));
+ else if constexpr (__y_to_x)
+ {
+ auto __a = _mm256_shuffle_epi8(
+ __intrin,
+ _mm256_setr_epi8(0, 1, 4, 5, 8, 9, 12, 13, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80, 0, 1, 4, 5, 8,
+ 9, 12, 13, -0x80, -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80));
+ return __intrin_bitcast<_To>(__lo128(
+ _mm256_permute4x64_epi64(__a,
+ 0xf8))); // __a[0] __a[2] | __a[3] __a[3]
+ }
+ else if constexpr (__z_to_y)
+ return __intrin_bitcast<_To>(_mm512_cvtepi32_epi16(__intrin));
+ }
+ else if constexpr (__i32_to_i8) //{{{2
+ {
+ if constexpr (__x_to_x && __have_avx512vl)
+ return __intrin_bitcast<_To>(_mm_cvtepi32_epi8(__intrin));
+ else if constexpr (__x_to_x && __have_avx512f)
+ return __intrin_bitcast<_To>(
+ __lo128(_mm512_cvtepi32_epi8(__zero_extend(__intrin))));
+ else if constexpr (__x_to_x && __have_ssse3)
+ {
+ return __intrin_bitcast<_To>(
+ _mm_shuffle_epi8(__intrin,
+ _mm_setr_epi8(0, 4, 8, 12, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80)));
+ }
+ else if constexpr (__x_to_x)
+ {
+ const auto __a
+ = _mm_unpacklo_epi8(__intrin, __intrin); // 0... .... 1... ....
+ const auto __b
+ = _mm_unpackhi_epi8(__intrin, __intrin); // 2... .... 3... ....
+ const auto __c = _mm_unpacklo_epi8(__a, __b); // 02.. .... .... ....
+ const auto __d = _mm_unpackhi_epi8(__a, __b); // 13.. .... .... ....
+ const auto __e = _mm_unpacklo_epi8(__c, __d); // 0123 .... .... ....
+ return __intrin_bitcast<_To>(__e & _mm_cvtsi32_si128(-1));
+ }
+ else if constexpr (__y_to_x && __have_avx512vl)
+ return __intrin_bitcast<_To>(_mm256_cvtepi32_epi8(__intrin));
+ else if constexpr (__y_to_x && __have_avx512f)
+ return __intrin_bitcast<_To>(
+ _mm512_cvtepi32_epi8(__zero_extend(__intrin)));
+ else if constexpr (__z_to_x)
+ return __intrin_bitcast<_To>(_mm512_cvtepi32_epi8(__intrin));
+ }
+ else if constexpr (__i16_to_i64) //{{{2
+ {
+ if constexpr (__x_to_x && __have_sse4_1)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm_cvtepi16_epi64(__intrin)
+ : _mm_cvtepu16_epi64(__intrin));
+ else if constexpr (__x_to_x && is_signed_v<_Tp>)
+ {
+ auto __x = _mm_srai_epi16(__intrin, 15);
+ auto __y = _mm_unpacklo_epi16(__intrin, __x);
+ __x = _mm_unpacklo_epi16(__x, __x);
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi32(__y, __x));
+ }
+ else if constexpr (__x_to_x)
+ return __intrin_bitcast<_To>(
+ _mm_unpacklo_epi32(_mm_unpacklo_epi16(__intrin, __m128i()),
+ __m128i()));
+ else if constexpr (__x_to_y)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm256_cvtepi16_epi64(__intrin)
+ : _mm256_cvtepu16_epi64(__intrin));
+ else if constexpr (__x_to_z)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm512_cvtepi16_epi64(__intrin)
+ : _mm512_cvtepu16_epi64(__intrin));
+ }
+ else if constexpr (__i16_to_i32) //{{{2
+ {
+ if constexpr (__x_to_x && __have_sse4_1)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm_cvtepi16_epi32(__intrin)
+ : _mm_cvtepu16_epi32(__intrin));
+ else if constexpr (__x_to_x && is_signed_v<_Tp>)
+ return __intrin_bitcast<_To>(
+ _mm_srai_epi32(_mm_unpacklo_epi16(__intrin, __intrin), 16));
+ else if constexpr (__x_to_x && is_unsigned_v<_Tp>)
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi16(__intrin, __m128i()));
+ else if constexpr (__x_to_y)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm256_cvtepi16_epi32(__intrin)
+ : _mm256_cvtepu16_epi32(__intrin));
+ else if constexpr (__y_to_z)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm512_cvtepi16_epi32(__intrin)
+ : _mm512_cvtepu16_epi32(__intrin));
+ }
+ else if constexpr (__i16_to_i8) //{{{2
+ {
+ if constexpr (__x_to_x && __have_avx512bw_vl)
+ return __intrin_bitcast<_To>(_mm_cvtepi16_epi8(__intrin));
+ else if constexpr (__x_to_x && __have_avx512bw)
+ return __intrin_bitcast<_To>(
+ __lo128(_mm512_cvtepi16_epi8(__zero_extend(__intrin))));
+ else if constexpr (__x_to_x && __have_ssse3)
+ return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+ __intrin, _mm_setr_epi8(0, 2, 4, 6, 8, 10, 12, 14, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80, -0x80)));
+ else if constexpr (__x_to_x)
+ {
+ auto __a
+ = _mm_unpacklo_epi8(__intrin, __intrin); // 00.. 11.. 22.. 33..
+ auto __b
+ = _mm_unpackhi_epi8(__intrin, __intrin); // 44.. 55.. 66.. 77..
+ auto __c = _mm_unpacklo_epi8(__a, __b); // 0404 .... 1515 ....
+ auto __d = _mm_unpackhi_epi8(__a, __b); // 2626 .... 3737 ....
+ auto __e = _mm_unpacklo_epi8(__c, __d); // 0246 0246 .... ....
+ auto __f = _mm_unpackhi_epi8(__c, __d); // 1357 1357 .... ....
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi8(__e, __f));
+ }
+ else if constexpr (__y_to_x && __have_avx512bw_vl)
+ return __intrin_bitcast<_To>(_mm256_cvtepi16_epi8(__intrin));
+ else if constexpr (__y_to_x && __have_avx512bw)
+ return __intrin_bitcast<_To>(
+ __lo256(_mm512_cvtepi16_epi8(__zero_extend(__intrin))));
+ else if constexpr (__y_to_x)
+ {
+ auto __a = _mm256_shuffle_epi8(
+ __intrin,
+ _mm256_setr_epi8(0, 2, 4, 6, 8, 10, 12, 14, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80, -0x80, 0, 2, 4,
+ 6, 8, 10, 12, 14));
+ return __intrin_bitcast<_To>(__lo128(__a) | __hi128(__a));
+ }
+ else if constexpr (__z_to_y && __have_avx512bw)
+ return __intrin_bitcast<_To>(_mm512_cvtepi16_epi8(__intrin));
+ else if constexpr (__z_to_y)
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__i8_to_i64) //{{{2
+ {
+ if constexpr (__x_to_x && __have_sse4_1)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm_cvtepi8_epi64(__intrin)
+ : _mm_cvtepu8_epi64(__intrin));
+ else if constexpr (__x_to_x && is_signed_v<_Tp>)
+ {
+ if constexpr (__have_ssse3)
+ {
+ auto __dup = _mm_unpacklo_epi8(__intrin, __intrin);
+ auto __epi16 = _mm_srai_epi16(__dup, 8);
+ return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+ __epi16, _mm_setr_epi8(0, 1, 1, 1, 1, 1, 1, 1, 2, 3, 3, 3, 3, 3, 3, 3)));
+ }
+ else
+ {
+ auto __x = _mm_unpacklo_epi8(__intrin, __intrin);
+ __x = _mm_unpacklo_epi16(__x, __x);
+ return __intrin_bitcast<_To>(
+ _mm_unpacklo_epi32(_mm_srai_epi32(__x, 24),
+ _mm_srai_epi32(__x, 31)));
+ }
+ }
+ else if constexpr (__x_to_x)
+ {
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi32(
+ _mm_unpacklo_epi16(_mm_unpacklo_epi8(__intrin, __m128i()),
+ __m128i()),
+ __m128i()));
+ }
+ else if constexpr (__x_to_y)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm256_cvtepi8_epi64(__intrin)
+ : _mm256_cvtepu8_epi64(__intrin));
+ else if constexpr (__x_to_z)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm512_cvtepi8_epi64(__intrin)
+ : _mm512_cvtepu8_epi64(__intrin));
+ }
+ else if constexpr (__i8_to_i32) //{{{2
+ {
+ if constexpr (__x_to_x && __have_sse4_1)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm_cvtepi8_epi32(__intrin)
+ : _mm_cvtepu8_epi32(__intrin));
+ else if constexpr (__x_to_x && is_signed_v<_Tp>)
+ {
+ const auto __x = _mm_unpacklo_epi8(__intrin, __intrin);
+ return __intrin_bitcast<_To>(
+ _mm_srai_epi32(_mm_unpacklo_epi16(__x, __x), 24));
+ }
+ else if constexpr (__x_to_x && is_unsigned_v<_Tp>)
+ return __intrin_bitcast<_To>(
+ _mm_unpacklo_epi16(_mm_unpacklo_epi8(__intrin, __m128i()),
+ __m128i()));
+ else if constexpr (__x_to_y)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm256_cvtepi8_epi32(__intrin)
+ : _mm256_cvtepu8_epi32(__intrin));
+ else if constexpr (__x_to_z)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm512_cvtepi8_epi32(__intrin)
+ : _mm512_cvtepu8_epi32(__intrin));
+ }
+ else if constexpr (__i8_to_i16) //{{{2
+ {
+ if constexpr (__x_to_x && __have_sse4_1)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm_cvtepi8_epi16(__intrin)
+ : _mm_cvtepu8_epi16(__intrin));
+ else if constexpr (__x_to_x && is_signed_v<_Tp>)
+ return __intrin_bitcast<_To>(
+ _mm_srai_epi16(_mm_unpacklo_epi8(__intrin, __intrin), 8));
+ else if constexpr (__x_to_x && is_unsigned_v<_Tp>)
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi8(__intrin, __m128i()));
+ else if constexpr (__x_to_y)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm256_cvtepi8_epi16(__intrin)
+ : _mm256_cvtepu8_epi16(__intrin));
+ else if constexpr (__y_to_z && __have_avx512bw)
+ return __intrin_bitcast<_To>(is_signed_v<_Tp>
+ ? _mm512_cvtepi8_epi16(__intrin)
+ : _mm512_cvtepu8_epi16(__intrin));
+ else if constexpr (__y_to_z)
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__f32_to_s64) //{{{2
+ {
+ if constexpr (__have_avx512dq_vl && __x_to_x)
+ return __intrin_bitcast<_To>(_mm_cvttps_epi64(__intrin));
+ else if constexpr (__have_avx512dq_vl && __x_to_y)
+ return __intrin_bitcast<_To>(_mm256_cvttps_epi64(__intrin));
+ else if constexpr (__have_avx512dq && __y_to_z)
+ return __intrin_bitcast<_To>(_mm512_cvttps_epi64(__intrin));
+ // else use scalar fallback
+ }
+ else if constexpr (__f32_to_u64) //{{{2
+ {
+ if constexpr (__have_avx512dq_vl && __x_to_x)
+ return __intrin_bitcast<_To>(_mm_cvttps_epu64(__intrin));
+ else if constexpr (__have_avx512dq_vl && __x_to_y)
+ return __intrin_bitcast<_To>(_mm256_cvttps_epu64(__intrin));
+ else if constexpr (__have_avx512dq && __y_to_z)
+ return __intrin_bitcast<_To>(_mm512_cvttps_epu64(__intrin));
+ // else use scalar fallback
+ }
+ else if constexpr (__f32_to_s32) //{{{2
+ {
+ if constexpr (__x_to_x || __y_to_y || __z_to_z)
+ {
+ // go to fallback, it does the right thing
+ }
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__f32_to_u32) //{{{2
+ {
+ if constexpr (__have_avx512vl && __x_to_x)
+ return __auto_bitcast(_mm_cvttps_epu32(__intrin));
+ else if constexpr (__have_avx512f && __x_to_x)
+ return __auto_bitcast(
+ __lo128(_mm512_cvttps_epu32(__auto_bitcast(__v))));
+ else if constexpr (__have_avx512vl && __y_to_y)
+ return __vector_bitcast<_Up>(_mm256_cvttps_epu32(__intrin));
+ else if constexpr (__have_avx512f && __y_to_y)
+ return __vector_bitcast<_Up>(
+ __lo256(_mm512_cvttps_epu32(__auto_bitcast(__v))));
+ else if constexpr (__x_to_x || __y_to_y || __z_to_z)
+ {
+ // go to fallback, it does the right thing. We can't use the
+ // _mm_floor_ps - 0x8000'0000 trick for f32->u32 because it would
+ // discard small input values (only 24 mantissa bits)
+ }
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else if constexpr (__f32_to_ibw) //{{{2
+ return __convert_x86<_To>(__convert_x86<__vector_type_t<int, _Np>>(__v));
+ else if constexpr (__f64_to_s64) //{{{2
+ {
+ if constexpr (__have_avx512dq_vl && __x_to_x)
+ return __intrin_bitcast<_To>(_mm_cvttpd_epi64(__intrin));
+ else if constexpr (__have_avx512dq_vl && __y_to_y)
+ return __intrin_bitcast<_To>(_mm256_cvttpd_epi64(__intrin));
+ else if constexpr (__have_avx512dq && __z_to_z)
+ return __intrin_bitcast<_To>(_mm512_cvttpd_epi64(__intrin));
+ // else use scalar fallback
+ }
+ else if constexpr (__f64_to_u64) //{{{2
+ {
+ if constexpr (__have_avx512dq_vl && __x_to_x)
+ return __intrin_bitcast<_To>(_mm_cvttpd_epu64(__intrin));
+ else if constexpr (__have_avx512dq_vl && __y_to_y)
+ return __intrin_bitcast<_To>(_mm256_cvttpd_epu64(__intrin));
+ else if constexpr (__have_avx512dq && __z_to_z)
+ return __intrin_bitcast<_To>(_mm512_cvttpd_epu64(__intrin));
+ // else use scalar fallback
+ }
+ else if constexpr (__f64_to_s32) //{{{2
+ {
+ if constexpr (__x_to_x)
+ return __intrin_bitcast<_To>(_mm_cvttpd_epi32(__intrin));
+ else if constexpr (__y_to_x)
+ return __intrin_bitcast<_To>(_mm256_cvttpd_epi32(__intrin));
+ else if constexpr (__z_to_y)
+ return __intrin_bitcast<_To>(_mm512_cvttpd_epi32(__intrin));
+ }
+ else if constexpr (__f64_to_u32) //{{{2
+ {
+ if constexpr (__have_avx512vl && __x_to_x)
+ return __intrin_bitcast<_To>(_mm_cvttpd_epu32(__intrin));
+ else if constexpr (__have_sse4_1 && __x_to_x)
+ return __vector_bitcast<_Up, _M>(
+ _mm_cvttpd_epi32(_mm_floor_pd(__intrin) - 0x8000'0000u))
+ ^ 0x8000'0000u;
+ else if constexpr (__x_to_x)
+ {
+ // use scalar fallback: it's only 2 values to convert, can't get much
+ // better than scalar decomposition
+ }
+ else if constexpr (__have_avx512vl && __y_to_x)
+ return __intrin_bitcast<_To>(_mm256_cvttpd_epu32(__intrin));
+ else if constexpr (__y_to_x)
+ {
+ return __intrin_bitcast<_To>(
+ __vector_bitcast<_Up>(
+ _mm256_cvttpd_epi32(_mm256_floor_pd(__intrin) - 0x8000'0000u))
+ ^ 0x8000'0000u);
+ }
+ else if constexpr (__z_to_y)
+ return __intrin_bitcast<_To>(_mm512_cvttpd_epu32(__intrin));
+ }
+ else if constexpr (__f64_to_ibw) //{{{2
+ {
+ return __convert_x86<_To>(
+ __convert_x86<__vector_type_t<int, (_Np < 4 ? 4 : _Np)>>(__v));
+ }
+ else if constexpr (__s64_to_f32) //{{{2
+ {
+ if constexpr (__x_to_x && __have_avx512dq_vl)
+ return __intrin_bitcast<_To>(_mm_cvtepi64_ps(__intrin));
+ else if constexpr (__y_to_x && __have_avx512dq_vl)
+ return __intrin_bitcast<_To>(_mm256_cvtepi64_ps(__intrin));
+ else if constexpr (__z_to_y && __have_avx512dq)
+ return __intrin_bitcast<_To>(_mm512_cvtepi64_ps(__intrin));
+ else if constexpr (__z_to_y)
+ return __intrin_bitcast<_To>(
+ _mm512_cvtpd_ps(__convert_x86<__vector_type_t<double, 8>>(__v)));
+ }
+ else if constexpr (__u64_to_f32) //{{{2
+ {
+ if constexpr (__x_to_x && __have_avx512dq_vl)
+ return __intrin_bitcast<_To>(_mm_cvtepu64_ps(__intrin));
+ else if constexpr (__y_to_x && __have_avx512dq_vl)
+ return __intrin_bitcast<_To>(_mm256_cvtepu64_ps(__intrin));
+ else if constexpr (__z_to_y && __have_avx512dq)
+ return __intrin_bitcast<_To>(_mm512_cvtepu64_ps(__intrin));
+ else if constexpr (__z_to_y)
+ {
+ return __intrin_bitcast<_To>(
+ __lo256(_mm512_cvtepu32_ps(__auto_bitcast(
+ _mm512_cvtepi64_epi32(_mm512_srai_epi64(__intrin, 32)))))
+ * 0x100000000LL
+ + __lo256(_mm512_cvtepu32_ps(
+ __auto_bitcast(_mm512_cvtepi64_epi32(__intrin)))));
+ }
+ }
+ else if constexpr (__s32_to_f32) //{{{2
+ {
+ // use fallback (builtin conversion)
+ }
+ else if constexpr (__u32_to_f32) //{{{2
+ {
+ if constexpr (__x_to_x && __have_avx512vl)
+ {
+ // use fallback
+ }
+ else if constexpr (__x_to_x && __have_avx512f)
+ return __intrin_bitcast<_To>(
+ __lo128(_mm512_cvtepu32_ps(__auto_bitcast(__v))));
+ else if constexpr (__x_to_x && (__have_fma || __have_fma4))
+ // work around PR85819
+ return __auto_bitcast(0x10000 * _mm_cvtepi32_ps(__to_intrin(__v >> 16))
+ + _mm_cvtepi32_ps(__to_intrin(__v & 0xffff)));
+ else if constexpr (__y_to_y && __have_avx512vl)
+ {
+ // use fallback
+ }
+ else if constexpr (__y_to_y && __have_avx512f)
+ return __intrin_bitcast<_To>(
+ __lo256(_mm512_cvtepu32_ps(__auto_bitcast(__v))));
+ else if constexpr (__y_to_y)
+ // work around PR85819
+ return 0x10000 * _mm256_cvtepi32_ps(__to_intrin(__v >> 16))
+ + _mm256_cvtepi32_ps(__to_intrin(__v & 0xffff));
+ // else use fallback (builtin conversion)
+ }
+ else if constexpr (__ibw_to_f32) //{{{2
+ {
+ if constexpr (_M <= 4 || __have_avx2)
+ return __convert_x86<_To>(__convert_x86<__vector_type_t<int, _M>>(__v));
+ else
+ {
+ static_assert(__x_to_y);
+ __m128i __a, __b;
+ if constexpr (__have_sse4_1)
+ {
+ __a = sizeof(_Tp) == 2
+ ? (is_signed_v<_Tp> ? _mm_cvtepi16_epi32(__intrin)
+ : _mm_cvtepu16_epi32(__intrin))
+ : (is_signed_v<_Tp> ? _mm_cvtepi8_epi32(__intrin)
+ : _mm_cvtepu8_epi32(__intrin));
+ const auto __w
+ = _mm_shuffle_epi32(__intrin, sizeof(_Tp) == 2 ? 0xee : 0xe9);
+ __b = sizeof(_Tp) == 2
+ ? (is_signed_v<_Tp> ? _mm_cvtepi16_epi32(__w)
+ : _mm_cvtepu16_epi32(__w))
+ : (is_signed_v<_Tp> ? _mm_cvtepi8_epi32(__w)
+ : _mm_cvtepu8_epi32(__w));
+ }
+ else
+ {
+ __m128i __tmp;
+ if constexpr (sizeof(_Tp) == 1)
+ {
+ __tmp
+ = is_signed_v<_Tp>
+ ? _mm_srai_epi16(_mm_unpacklo_epi8(__intrin, __intrin),
+ 8)
+ : _mm_unpacklo_epi8(__intrin, __m128i());
+ }
+ else
+ {
+ static_assert(sizeof(_Tp) == 2);
+ __tmp = __intrin;
+ }
+ __a = is_signed_v<_Tp>
+ ? _mm_srai_epi32(_mm_unpacklo_epi16(__tmp, __tmp), 16)
+ : _mm_unpacklo_epi16(__tmp, __m128i());
+ __b = is_signed_v<_Tp>
+ ? _mm_srai_epi32(_mm_unpackhi_epi16(__tmp, __tmp), 16)
+ : _mm_unpackhi_epi16(__tmp, __m128i());
+ }
+ return __convert_x86<_To>(__vector_bitcast<int>(__a),
+ __vector_bitcast<int>(__b));
+ }
+ }
+ else if constexpr (__s64_to_f64) //{{{2
+ {
+ if constexpr (__x_to_x && __have_avx512dq_vl)
+ return __intrin_bitcast<_To>(_mm_cvtepi64_pd(__intrin));
+ else if constexpr (__y_to_y && __have_avx512dq_vl)
+ return __intrin_bitcast<_To>(_mm256_cvtepi64_pd(__intrin));
+ else if constexpr (__z_to_z && __have_avx512dq)
+ return __intrin_bitcast<_To>(_mm512_cvtepi64_pd(__intrin));
+ else if constexpr (__z_to_z)
+ {
+ return __intrin_bitcast<_To>(
+ _mm512_cvtepi32_pd(_mm512_cvtepi64_epi32(__to_intrin(__v >> 32)))
+ * 0x100000000LL
+ + _mm512_cvtepu32_pd(_mm512_cvtepi64_epi32(__intrin)));
+ }
+ }
+ else if constexpr (__u64_to_f64) //{{{2
+ {
+ if constexpr (__x_to_x && __have_avx512dq_vl)
+ return __intrin_bitcast<_To>(_mm_cvtepu64_pd(__intrin));
+ else if constexpr (__y_to_y && __have_avx512dq_vl)
+ return __intrin_bitcast<_To>(_mm256_cvtepu64_pd(__intrin));
+ else if constexpr (__z_to_z && __have_avx512dq)
+ return __intrin_bitcast<_To>(_mm512_cvtepu64_pd(__intrin));
+ else if constexpr (__z_to_z)
+ {
+ return __intrin_bitcast<_To>(
+ _mm512_cvtepu32_pd(_mm512_cvtepi64_epi32(__to_intrin(__v >> 32)))
+ * 0x100000000LL
+ + _mm512_cvtepu32_pd(_mm512_cvtepi64_epi32(__intrin)));
+ }
+ }
+ else if constexpr (__s32_to_f64) //{{{2
+ {
+ if constexpr (__x_to_x)
+ return __intrin_bitcast<_To>(_mm_cvtepi32_pd(__intrin));
+ else if constexpr (__x_to_y)
+ return __intrin_bitcast<_To>(_mm256_cvtepi32_pd(__intrin));
+ else if constexpr (__y_to_z)
+ return __intrin_bitcast<_To>(_mm512_cvtepi32_pd(__intrin));
+ }
+ else if constexpr (__u32_to_f64) //{{{2
+ {
+ if constexpr (__x_to_x && __have_avx512vl)
+ return __intrin_bitcast<_To>(_mm_cvtepu32_pd(__intrin));
+ else if constexpr (__x_to_x && __have_avx512f)
+ return __intrin_bitcast<_To>(
+ __lo128(_mm512_cvtepu32_pd(__auto_bitcast(__v))));
+ else if constexpr (__x_to_x)
+ return __intrin_bitcast<_To>(
+ _mm_cvtepi32_pd(__to_intrin(__v ^ 0x8000'0000u)) + 0x8000'0000u);
+ else if constexpr (__x_to_y && __have_avx512vl)
+ return __intrin_bitcast<_To>(_mm256_cvtepu32_pd(__intrin));
+ else if constexpr (__x_to_y && __have_avx512f)
+ return __intrin_bitcast<_To>(
+ __lo256(_mm512_cvtepu32_pd(__auto_bitcast(__v))));
+ else if constexpr (__x_to_y)
+ return __intrin_bitcast<_To>(
+ _mm256_cvtepi32_pd(__to_intrin(__v ^ 0x8000'0000u)) + 0x8000'0000u);
+ else if constexpr (__y_to_z)
+ return __intrin_bitcast<_To>(_mm512_cvtepu32_pd(__intrin));
+ }
+ else if constexpr (__ibw_to_f64) //{{{2
+ {
+ return __convert_x86<_To>(
+ __convert_x86<__vector_type_t<int, std::max(size_t(4), _M)>>(__v));
+ }
+ else if constexpr (__f32_to_f64) //{{{2
+ {
+ if constexpr (__x_to_x)
+ return __intrin_bitcast<_To>(_mm_cvtps_pd(__intrin));
+ else if constexpr (__x_to_y)
+ return __intrin_bitcast<_To>(_mm256_cvtps_pd(__intrin));
+ else if constexpr (__y_to_z)
+ return __intrin_bitcast<_To>(_mm512_cvtps_pd(__intrin));
+ }
+ else if constexpr (__f64_to_f32) //{{{2
+ {
+ if constexpr (__x_to_x)
+ return __intrin_bitcast<_To>(_mm_cvtpd_ps(__intrin));
+ else if constexpr (__y_to_x)
+ return __intrin_bitcast<_To>(_mm256_cvtpd_ps(__intrin));
+ else if constexpr (__z_to_y)
+ return __intrin_bitcast<_To>(_mm512_cvtpd_ps(__intrin));
+ }
+ else //{{{2
+ __assert_unreachable<_Tp>();
+
+ // fallback:{{{2
+ return __vector_convert<_To>(__v, make_index_sequence<std::min(_M, _Np)>());
+ //}}}
+} // }}}
+// 2-arg __convert_x86 {{{1
+template <typename _To, typename _V, typename _Traits>
+_GLIBCXX_SIMD_INTRINSIC _To
+__convert_x86(_V __v0, _V __v1)
+{
+ static_assert(__is_vector_type_v<_V>);
+ using _Tp = typename _Traits::value_type;
+ constexpr size_t _Np = _Traits::_S_width;
+ [[maybe_unused]] const auto __i0 = __to_intrin(__v0);
+ [[maybe_unused]] const auto __i1 = __to_intrin(__v1);
+ using _Up = typename _VectorTraits<_To>::value_type;
+ constexpr size_t _M = _VectorTraits<_To>::_S_width;
+
+ static_assert(2 * _Np <= _M, "__v1 would be discarded; use the one-argument "
+ "__convert_x86 overload instead");
+
+ // [xyz]_to_[xyz] {{{2
+ [[maybe_unused]] constexpr bool __x_to_x
+ = sizeof(__v0) <= 16 && sizeof(_To) <= 16;
+ [[maybe_unused]] constexpr bool __x_to_y
+ = sizeof(__v0) <= 16 && sizeof(_To) == 32;
+ [[maybe_unused]] constexpr bool __x_to_z
+ = sizeof(__v0) <= 16 && sizeof(_To) == 64;
+ [[maybe_unused]] constexpr bool __y_to_x
+ = sizeof(__v0) == 32 && sizeof(_To) <= 16;
+ [[maybe_unused]] constexpr bool __y_to_y
+ = sizeof(__v0) == 32 && sizeof(_To) == 32;
+ [[maybe_unused]] constexpr bool __y_to_z
+ = sizeof(__v0) == 32 && sizeof(_To) == 64;
+ [[maybe_unused]] constexpr bool __z_to_x
+ = sizeof(__v0) == 64 && sizeof(_To) <= 16;
+ [[maybe_unused]] constexpr bool __z_to_y
+ = sizeof(__v0) == 64 && sizeof(_To) == 32;
+ [[maybe_unused]] constexpr bool __z_to_z
+ = sizeof(__v0) == 64 && sizeof(_To) == 64;
+
+ // iX_to_iX {{{2
+ [[maybe_unused]] constexpr bool __i_to_i
+ = std::is_integral_v<_Up> && std::is_integral_v<_Tp>;
+ [[maybe_unused]] constexpr bool __i8_to_i16
+ = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 2;
+ [[maybe_unused]] constexpr bool __i8_to_i32
+ = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __i8_to_i64
+ = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __i16_to_i8
+ = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 1;
+ [[maybe_unused]] constexpr bool __i16_to_i32
+ = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __i16_to_i64
+ = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __i32_to_i8
+ = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 1;
+ [[maybe_unused]] constexpr bool __i32_to_i16
+ = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 2;
+ [[maybe_unused]] constexpr bool __i32_to_i64
+ = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __i64_to_i8
+ = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 1;
+ [[maybe_unused]] constexpr bool __i64_to_i16
+ = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 2;
+ [[maybe_unused]] constexpr bool __i64_to_i32
+ = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 4;
+
+ // [fsu]X_to_[fsu]X {{{2
+ // ibw = integral && byte or word, i.e. char and short with any signedness
+ [[maybe_unused]] constexpr bool __i64_to_f32
+ = is_integral_v<_Tp> && sizeof(_Tp) == 8
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __s32_to_f32
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __s16_to_f32
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 2
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __s8_to_f32
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 1
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __u32_to_f32
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __u16_to_f32
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 2
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __u8_to_f32
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 1
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __s64_to_f64
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 8
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __s32_to_f64
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __s16_to_f64
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 2
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __s8_to_f64
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 1
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __u64_to_f64
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 8
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __u32_to_f64
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __u16_to_f64
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 2
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __u8_to_f64
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 1
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __f32_to_s64
+ = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 8
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f32_to_s32
+ = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 4
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f32_to_u64
+ = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 8
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f32_to_u32
+ = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 4
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f64_to_s64
+ = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 8
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __f64_to_s32
+ = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 4
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __f64_to_u64
+ = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 8
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __f64_to_u32
+ = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 4
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __f32_to_ibw
+ = is_integral_v<_Up> && sizeof(_Up) <= 2
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f64_to_ibw
+ = is_integral_v<_Up> && sizeof(_Up) <= 2
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __f32_to_f64
+ = is_floating_point_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __f64_to_f32
+ = is_floating_point_v<_Tp> && sizeof(_Tp) == 8
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+
+ if constexpr (__i_to_i && __y_to_x && !__have_avx2)
+ { //{{{2
+ // <long long, 4>, <long long, 4> => <short, 8>
+ return __convert_x86<_To>(__lo128(__v0), __hi128(__v0), __lo128(__v1),
+ __hi128(__v1));
+ }
+ else if constexpr (__i_to_i)
+ { // assert ISA {{{2
+ static_assert(__x_to_x || __have_avx2,
+ "integral conversions with ymm registers require AVX2");
+ static_assert(__have_avx512bw
+ || ((sizeof(_Tp) >= 4 || sizeof(__v0) < 64)
+ && (sizeof(_Up) >= 4 || sizeof(_To) < 64)),
+ "8/16-bit integers in zmm registers require AVX512BW");
+ static_assert((sizeof(__v0) < 64 && sizeof(_To) < 64) || __have_avx512f,
+ "integral conversions with zmm registers require AVX512F");
+ }
+ // concat => use 1-arg __convert_x86 {{{2
+ if constexpr ((sizeof(__v0) == 16 && __have_avx2)
+ || (sizeof(__v0) == 16 && __have_avx
+ && std::is_floating_point_v<_Tp>)
+ || (sizeof(__v0) == 32 && __have_avx512f
+ && (sizeof(_Tp) >= 4 || __have_avx512bw)))
+ {
+ // The ISA can handle wider input registers, so concat and use one-arg
+ // implementation. This reduces code duplication considerably.
+ return __convert_x86<_To>(__concat(__v0, __v1));
+ }
+ else
+ { //{{{2
+ // conversion using bit reinterpretation (or no conversion at all) should
+ // all go through the concat branch above:
+ static_assert(
+ !(std::is_floating_point_v<_Tp> == std::is_floating_point_v<_Up>
+ && sizeof(_Tp) == sizeof(_Up)));
+ if constexpr (2 * _Np < _M && sizeof(_To) > 16)
+ { // handle all zero extension{{{2
+ constexpr size_t __minsize = 16 / sizeof(_Up);
+ return __zero_extend(
+ __convert_x86<
+ __vector_type_t<_Up, (__minsize > 2 * _Np) ? __minsize : 2 * _Np>>(__v0,
+ __v1));
+ }
+ else if constexpr (__i64_to_i32)
+ { //{{{2
+ if constexpr (__x_to_x)
+ return __auto_bitcast(
+ _mm_shuffle_ps(__auto_bitcast(__v0), __auto_bitcast(__v1), 0x88));
+ else if constexpr (__y_to_y)
+ {
+ // AVX512F is not available (would concat otherwise)
+ return __auto_bitcast(
+ __xzyw(_mm256_shuffle_ps(__auto_bitcast(__v0),
+ __auto_bitcast(__v1), 0x88)));
+ // alternative:
+ // const auto v0_abxxcdxx = _mm256_shuffle_epi32(__v0, 8);
+ // const auto v1_efxxghxx = _mm256_shuffle_epi32(__v1, 8);
+ // const auto v_abefcdgh = _mm256_unpacklo_epi64(v0_abxxcdxx, v1_efxxghxx);
+ // return _mm256_permute4x64_epi64(v_abefcdgh,
+ //   0x01 * 0 + 0x04 * 2 + 0x10 * 1 + 0x40 * 3); // abcdefgh
+ }
+ else if constexpr (__z_to_z)
+ return __intrin_bitcast<_To>(__concat(_mm512_cvtepi64_epi32(__i0),
+ _mm512_cvtepi64_epi32(__i1)));
+ }
+ else if constexpr (__i64_to_i16)
+ { //{{{2
+ if constexpr (__x_to_x)
+ {
+ // AVX2 is not available (would concat otherwise)
+ if constexpr (__have_sse4_1)
+ {
+ return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+ _mm_blend_epi16(__i0, _mm_slli_si128(__i1, 4), 0x44),
+ _mm_setr_epi8(0, 1, 8, 9, 4, 5, 12, 13, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80)));
+ }
+ else
+ {
+ return __vector_type_t<_Up, _M>{_Up(__v0[0]), _Up(__v0[1]),
+ _Up(__v1[0]), _Up(__v1[1])};
+ }
+ }
+ else if constexpr (__y_to_x)
+ {
+ auto __a
+ = _mm256_unpacklo_epi16(__i0, __i1); // 04.. .... 26.. ....
+ auto __b
+ = _mm256_unpackhi_epi16(__i0, __i1); // 15.. .... 37.. ....
+ auto __c = _mm256_unpacklo_epi16(__a, __b); // 0145 .... 2367 ....
+ return __intrin_bitcast<_To>(
+ _mm_unpacklo_epi32(__lo128(__c), __hi128(__c))); // 0123 4567
+ }
+ else if constexpr (__z_to_y)
+ return __intrin_bitcast<_To>(__concat(_mm512_cvtepi64_epi16(__i0),
+ _mm512_cvtepi64_epi16(__i1)));
+ }
+ else if constexpr (__i64_to_i8)
+ { //{{{2
+ if constexpr (__x_to_x && __have_sse4_1)
+ {
+ return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+ _mm_blend_epi16(__i0, _mm_slli_si128(__i1, 4), 0x44),
+ _mm_setr_epi8(0, 8, 4, 12, -0x80, -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80, -0x80,
+ -0x80)));
+ }
+ else if constexpr (__x_to_x && __have_ssse3)
+ {
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi16(
+ _mm_shuffle_epi8(__i0, _mm_setr_epi8(0, 8, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80)),
+ _mm_shuffle_epi8(__i1, _mm_setr_epi8(0, 8, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80))));
+ }
+ else if constexpr (__x_to_x)
+ {
+ return __vector_type_t<_Up, _M>{_Up(__v0[0]), _Up(__v0[1]),
+ _Up(__v1[0]), _Up(__v1[1])};
+ }
+ else if constexpr (__y_to_x)
+ {
+ const auto __a = _mm256_shuffle_epi8(
+ _mm256_blend_epi32(__i0, _mm256_slli_epi64(__i1, 32), 0xAA),
+ _mm256_setr_epi8(0, 8, -0x80, -0x80, 4, 12, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, 0, 8, -0x80, -0x80, 4, 12,
+ -0x80, -0x80, -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80));
+ return __intrin_bitcast<_To>(__lo128(__a) | __hi128(__a));
+ } // __z_to_x uses concat fallback
+ }
+ else if constexpr (__i32_to_i16)
+ { //{{{2
+ if constexpr (__x_to_x)
+ {
+ // AVX2 is not available (would concat otherwise)
+ if constexpr (__have_sse4_1)
+ {
+ return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+ _mm_blend_epi16(__i0, _mm_slli_si128(__i1, 2), 0xaa),
+ _mm_setr_epi8(0, 1, 4, 5, 8, 9, 12, 13, 2, 3, 6, 7, 10, 11,
+ 14, 15)));
+ }
+ else if constexpr (__have_ssse3)
+ {
+ return __intrin_bitcast<_To>(
+ _mm_hadd_epi16(__to_intrin(__v0 << 16),
+ __to_intrin(__v1 << 16)));
+ /*
+ return _mm_unpacklo_epi64(
+ _mm_shuffle_epi8(__i0, _mm_setr_epi8(0, 1, 4, 5, 8, 9, 12, 13,
+ 8, 9, 12, 13, 12, 13, 14, 15)),
+ _mm_shuffle_epi8(__i1, _mm_setr_epi8(0, 1, 4, 5, 8, 9, 12, 13,
+ 8, 9, 12, 13, 12, 13, 14, 15)));
+ */
+ }
+ else
+ {
+ auto __a = _mm_unpacklo_epi16(__i0, __i1); // 04.. 15..
+ auto __b = _mm_unpackhi_epi16(__i0, __i1); // 26.. 37..
+ auto __c = _mm_unpacklo_epi16(__a, __b); // 0246 ....
+ auto __d = _mm_unpackhi_epi16(__a, __b); // 1357 ....
+ return __intrin_bitcast<_To>(
+ _mm_unpacklo_epi16(__c, __d)); // 0123 4567
+ }
+ }
+ else if constexpr (__y_to_y)
+ {
+ const auto __shuf
+ = _mm256_setr_epi8(0, 1, 4, 5, 8, 9, 12, 13, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80, -0x80, 0,
+ 1, 4, 5, 8, 9, 12, 13, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80);
+ auto __a = _mm256_shuffle_epi8(__i0, __shuf);
+ auto __b = _mm256_shuffle_epi8(__i1, __shuf);
+ return __intrin_bitcast<_To>(
+ __xzyw(_mm256_unpacklo_epi64(__a, __b)));
+ } // __z_to_z uses concat fallback
+ }
+ else if constexpr (__i32_to_i8)
+ { //{{{2
+ if constexpr (__x_to_x && __have_ssse3)
+ {
+ const auto __shufmask
+ = _mm_setr_epi8(0, 4, 8, 12, -0x80, -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, -0x80, -0x80, -0x80,
+ -0x80);
+ return __intrin_bitcast<_To>(
+ _mm_unpacklo_epi32(_mm_shuffle_epi8(__i0, __shufmask),
+ _mm_shuffle_epi8(__i1, __shufmask)));
+ }
+ else if constexpr (__x_to_x)
+ {
+ auto __a = _mm_unpacklo_epi8(__i0, __i1); // 04.. .... 15.. ....
+ auto __b = _mm_unpackhi_epi8(__i0, __i1); // 26.. .... 37.. ....
+ auto __c = _mm_unpacklo_epi8(__a, __b); // 0246 .... .... ....
+ auto __d = _mm_unpackhi_epi8(__a, __b); // 1357 .... .... ....
+ auto __e = _mm_unpacklo_epi8(__c, __d); // 0123 4567 .... ....
+ return __intrin_bitcast<_To>(__e & __m128i{-1, 0});
+ }
+ else if constexpr (__y_to_x)
+ {
+ const auto __a = _mm256_shuffle_epi8(
+ _mm256_blend_epi16(__i0, _mm256_slli_epi32(__i1, 16), 0xAA),
+ _mm256_setr_epi8(0, 4, 8, 12, -0x80, -0x80, -0x80, -0x80, 2, 6,
+ 10, 14, -0x80, -0x80, -0x80, -0x80, -0x80,
+ -0x80, -0x80, -0x80, 0, 4, 8, 12, -0x80, -0x80,
+ -0x80, -0x80, 2, 6, 10, 14));
+ return __intrin_bitcast<_To>(__lo128(__a) | __hi128(__a));
+ } // __z_to_y uses concat fallback
+ }
+ else if constexpr (__i16_to_i8)
+ { //{{{2
+ if constexpr (__x_to_x && __have_ssse3)
+ {
+ const auto __shuf = reinterpret_cast<__m128i>(
+ __vector_type_t<_UChar, 16>{0, 2, 4, 6, 8, 10, 12, 14, 0x80,
+ 0x80, 0x80, 0x80, 0x80, 0x80, 0x80,
+ 0x80});
+ return __intrin_bitcast<_To>(
+ _mm_unpacklo_epi64(_mm_shuffle_epi8(__i0, __shuf),
+ _mm_shuffle_epi8(__i1, __shuf)));
+ }
+ else if constexpr (__x_to_x)
+ {
+ auto __a = _mm_unpacklo_epi8(__i0, __i1); // 08.. 19.. 2A.. 3B..
+ auto __b = _mm_unpackhi_epi8(__i0, __i1); // 4C.. 5D.. 6E.. 7F..
+ auto __c = _mm_unpacklo_epi8(__a, __b); // 048C .... 159D ....
+ auto __d = _mm_unpackhi_epi8(__a, __b); // 26AE .... 37BF ....
+ auto __e = _mm_unpacklo_epi8(__c, __d); // 0246 8ACE .... ....
+ auto __f = _mm_unpackhi_epi8(__c, __d); // 1357 9BDF .... ....
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi8(__e, __f));
+ }
+ else if constexpr (__y_to_y)
+ {
+ return __intrin_bitcast<_To>(__xzyw(_mm256_shuffle_epi8(
+ (__to_intrin(__v0) & _mm256_set1_epi32(0x00ff00ff))
+ | _mm256_slli_epi16(__i1, 8),
+ _mm256_setr_epi8(0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11,
+ 13, 15, 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7,
+ 9, 11, 13, 15))));
+ } // __z_to_z uses concat fallback
+ }
+ else if constexpr (__i64_to_f32)
+ { //{{{2
+ if constexpr (__x_to_x)
+ return __make_wrapper<float>(__v0[0], __v0[1], __v1[0], __v1[1]);
+ else if constexpr (__y_to_y)
+ {
+ static_assert(__y_to_y && __have_avx2);
+ const auto __a = _mm256_unpacklo_epi32(__i0, __i1); // aeAE cgCG
+ const auto __b = _mm256_unpackhi_epi32(__i0, __i1); // bfBF dhDH
+ const auto __lo32 = _mm256_unpacklo_epi32(__a, __b); // abef cdgh
+ const auto __hi32
+ = __vector_bitcast<conditional_t<is_signed_v<_Tp>, int, _UInt>>(
+ _mm256_unpackhi_epi32(__a, __b)); // ABEF CDGH
+ const auto __hi
+ = 0x100000000LL
+ * __convert_x86<__vector_type_t<float, 8>>(__hi32);
+ const auto __mid
+ = 0x10000 * _mm256_cvtepi32_ps(_mm256_srli_epi32(__lo32, 16));
+ const auto __lo
+ = _mm256_cvtepi32_ps(_mm256_set1_epi32(0x0000ffffu) & __lo32);
+ return __xzyw((__hi + __mid) + __lo);
+ }
+ else if constexpr (__z_to_z && __have_avx512dq)
+ {
+ return std::is_signed_v<_Tp> ? __concat(_mm512_cvtepi64_ps(__i0),
+ _mm512_cvtepi64_ps(__i1))
+ : __concat(_mm512_cvtepu64_ps(__i0),
+ _mm512_cvtepu64_ps(__i1));
+ }
+ else if constexpr (__z_to_z && std::is_signed_v<_Tp>)
+ {
+ const __m512 __hi32 = _mm512_cvtepi32_ps(
+ __concat(_mm512_cvtepi64_epi32(__to_intrin(__v0 >> 32)),
+ _mm512_cvtepi64_epi32(__to_intrin(__v1 >> 32))));
+ const __m512i __lo32 = __concat(_mm512_cvtepi64_epi32(__i0),
+ _mm512_cvtepi64_epi32(__i1));
+ // split low 32-bits, because if __hi32 is a small negative
+ // number, the 24-bit mantissa may lose important information if
+ // any of the high 8 bits of __lo32 is set, leading to
+ // catastrophic cancelation in the FMA
+ const __m512 __hi16
+ = _mm512_cvtepu32_ps(_mm512_set1_epi32(0xffff0000u) & __lo32);
+ const __m512 __lo16
+ = _mm512_cvtepi32_ps(_mm512_set1_epi32(0x0000ffffu) & __lo32);
+ return (__hi32 * 0x100000000LL + __hi16) + __lo16;
+ }
+ else if constexpr (__z_to_z && std::is_unsigned_v<_Tp>)
+ {
+ return __intrin_bitcast<_To>(
+ _mm512_cvtepu32_ps(
+ __concat(_mm512_cvtepi64_epi32(_mm512_srai_epi64(__i0, 32)),
+ _mm512_cvtepi64_epi32(_mm512_srai_epi64(__i1, 32))))
+ * 0x100000000LL
+ + _mm512_cvtepu32_ps(__concat(_mm512_cvtepi64_epi32(__i0),
+ _mm512_cvtepi64_epi32(__i1))));
+ }
+ }
+ else if constexpr (__f64_to_s32)
+ { //{{{2
+ // use concat fallback
+ }
+ else if constexpr (__f64_to_u32)
+ { //{{{2
+ if constexpr (__x_to_x && __have_sse4_1)
+ {
+ return __vector_bitcast<_Up, _M>(_mm_unpacklo_epi64(
+ _mm_cvttpd_epi32(_mm_floor_pd(__i0) - 0x8000'0000u),
+ _mm_cvttpd_epi32(_mm_floor_pd(__i1) - 0x8000'0000u)))
+ ^ 0x8000'0000u;
+ // without SSE4.1 just use the scalar fallback, it's only four
+ // values
+ }
+ else if constexpr (__y_to_y)
+ {
+ return __vector_bitcast<_Up>(
+ __concat(_mm256_cvttpd_epi32(_mm256_floor_pd(__i0)
+ - 0x8000'0000u),
+ _mm256_cvttpd_epi32(_mm256_floor_pd(__i1)
+ - 0x8000'0000u)))
+ ^ 0x8000'0000u;
+ } // __z_to_z uses fallback
+ }
+ else if constexpr (__f64_to_ibw)
+ { //{{{2
+ // one-arg __f64_to_ibw goes via _SimdWrapper<int, ?>. The fallback
+ // would go via two independent conversions to _SimdWrapper<_To> and
+ // subsequent interleaving. This is better, because f64->i32 allows
+ // combining __v0 and __v1 into one register:
+ // if constexpr (__z_to_x || __y_to_x) {
+ return __convert_x86<_To>(
+ __convert_x86<__vector_type_t<int, _Np * 2>>(__v0, __v1));
+ //}
+ }
+ else if constexpr (__f32_to_ibw)
+ { //{{{2
+ return __convert_x86<_To>(
+ __convert_x86<__vector_type_t<int, _Np>>(__v0),
+ __convert_x86<__vector_type_t<int, _Np>>(__v1));
+ //}}}
+ }
+
+ // fallback: {{{2
+ if constexpr (sizeof(_To) >= 32)
+ // if _To is ymm or zmm, then _SimdWrapper<_Up, _M / 2> is xmm or ymm
+ return __concat(__convert_x86<__vector_type_t<_Up, _M / 2>>(__v0),
+ __convert_x86<__vector_type_t<_Up, _M / 2>>(__v1));
+ else if constexpr (sizeof(_To) == 16)
+ {
+ const auto __lo = __to_intrin(__convert_x86<_To>(__v0));
+ const auto __hi = __to_intrin(__convert_x86<_To>(__v1));
+ if constexpr (sizeof(_Up) * _Np == 8)
+ {
+ if constexpr (is_floating_point_v<_Up>)
+ return __auto_bitcast(
+ _mm_unpacklo_pd(__vector_bitcast<double>(__lo),
+ __vector_bitcast<double>(__hi)));
+ else
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi64(__lo, __hi));
+ }
+ else if constexpr (sizeof(_Up) * _Np == 4)
+ {
+ if constexpr (is_floating_point_v<_Up>)
+ return __auto_bitcast(
+ _mm_unpacklo_ps(__vector_bitcast<float>(__lo),
+ __vector_bitcast<float>(__hi)));
+ else
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi32(__lo, __hi));
+ }
+ else if constexpr (sizeof(_Up) * _Np == 2)
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi16(__lo, __hi));
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return __vector_convert<_To>(__v0, __v1, make_index_sequence<_Np>());
+ //}}}
+ }
+} //}}}1
+// 4-arg __convert_x86 {{{1
+template <typename _To, typename _V, typename _Traits>
+_GLIBCXX_SIMD_INTRINSIC _To
+__convert_x86(_V __v0, _V __v1, _V __v2, _V __v3)
+{
+ static_assert(__is_vector_type_v<_V>);
+ using _Tp = typename _Traits::value_type;
+ constexpr size_t _Np = _Traits::_S_width;
+ [[maybe_unused]] const auto __i0 = __to_intrin(__v0);
+ [[maybe_unused]] const auto __i1 = __to_intrin(__v1);
+ [[maybe_unused]] const auto __i2 = __to_intrin(__v2);
+ [[maybe_unused]] const auto __i3 = __to_intrin(__v3);
+ using _Up = typename _VectorTraits<_To>::value_type;
+ constexpr size_t _M = _VectorTraits<_To>::_S_width;
+
+ static_assert(4 * _Np <= _M,
+ "__v2/__v3 would be discarded; use the two/one-argument "
+ "__convert_x86 overload instead");
+
+ // [xyz]_to_[xyz] {{{2
+ [[maybe_unused]] constexpr bool __x_to_x
+ = sizeof(__v0) <= 16 && sizeof(_To) <= 16;
+ [[maybe_unused]] constexpr bool __x_to_y
+ = sizeof(__v0) <= 16 && sizeof(_To) == 32;
+ [[maybe_unused]] constexpr bool __x_to_z
+ = sizeof(__v0) <= 16 && sizeof(_To) == 64;
+ [[maybe_unused]] constexpr bool __y_to_x
+ = sizeof(__v0) == 32 && sizeof(_To) <= 16;
+ [[maybe_unused]] constexpr bool __y_to_y
+ = sizeof(__v0) == 32 && sizeof(_To) == 32;
+ [[maybe_unused]] constexpr bool __y_to_z
+ = sizeof(__v0) == 32 && sizeof(_To) == 64;
+ [[maybe_unused]] constexpr bool __z_to_x
+ = sizeof(__v0) == 64 && sizeof(_To) <= 16;
+ [[maybe_unused]] constexpr bool __z_to_y
+ = sizeof(__v0) == 64 && sizeof(_To) == 32;
+ [[maybe_unused]] constexpr bool __z_to_z
+ = sizeof(__v0) == 64 && sizeof(_To) == 64;
+
+ // iX_to_iX {{{2
+ [[maybe_unused]] constexpr bool __i_to_i
+ = std::is_integral_v<_Up> && std::is_integral_v<_Tp>;
+ [[maybe_unused]] constexpr bool __i8_to_i16
+ = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 2;
+ [[maybe_unused]] constexpr bool __i8_to_i32
+ = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __i8_to_i64
+ = __i_to_i && sizeof(_Tp) == 1 && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __i16_to_i8
+ = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 1;
+ [[maybe_unused]] constexpr bool __i16_to_i32
+ = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __i16_to_i64
+ = __i_to_i && sizeof(_Tp) == 2 && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __i32_to_i8
+ = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 1;
+ [[maybe_unused]] constexpr bool __i32_to_i16
+ = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 2;
+ [[maybe_unused]] constexpr bool __i32_to_i64
+ = __i_to_i && sizeof(_Tp) == 4 && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __i64_to_i8
+ = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 1;
+ [[maybe_unused]] constexpr bool __i64_to_i16
+ = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 2;
+ [[maybe_unused]] constexpr bool __i64_to_i32
+ = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 4;
+
+ // [fsu]X_to_[fsu]X {{{2
+ // ibw = integral && byte or word, i.e. char and short with any signedness
+ [[maybe_unused]] constexpr bool __i64_to_f32
+ = is_integral_v<_Tp> && sizeof(_Tp) == 8
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __s32_to_f32
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __s16_to_f32
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 2
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __s8_to_f32
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 1
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __u32_to_f32
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __u16_to_f32
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 2
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __u8_to_f32
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 1
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+ [[maybe_unused]] constexpr bool __s64_to_f64
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 8
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __s32_to_f64
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __s16_to_f64
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 2
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __s8_to_f64
+ = is_integral_v<_Tp> && is_signed_v<_Tp> && sizeof(_Tp) == 1
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __u64_to_f64
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 8
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __u32_to_f64
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __u16_to_f64
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 2
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __u8_to_f64
+ = is_integral_v<_Tp> && is_unsigned_v<_Tp> && sizeof(_Tp) == 1
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __f32_to_s64
+ = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 8
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f32_to_s32
+ = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 4
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f32_to_u64
+ = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 8
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f32_to_u32
+ = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 4
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f64_to_s64
+ = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 8
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __f64_to_s32
+ = is_integral_v<_Up> && is_signed_v<_Up> && sizeof(_Up) == 4
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __f64_to_u64
+ = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 8
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __f64_to_u32
+ = is_integral_v<_Up> && is_unsigned_v<_Up> && sizeof(_Up) == 4
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __f32_to_ibw
+ = is_integral_v<_Up> && sizeof(_Up) <= 2
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 4;
+ [[maybe_unused]] constexpr bool __f64_to_ibw
+ = is_integral_v<_Up> && sizeof(_Up) <= 2
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+ [[maybe_unused]] constexpr bool __f32_to_f64
+ = is_floating_point_v<_Tp> && sizeof(_Tp) == 4
+ && is_floating_point_v<_Up> && sizeof(_Up) == 8;
+ [[maybe_unused]] constexpr bool __f64_to_f32
+ = is_floating_point_v<_Tp> && sizeof(_Tp) == 8
+ && is_floating_point_v<_Up> && sizeof(_Up) == 4;
+
+ if constexpr (__i_to_i && __y_to_x && !__have_avx2)
+ { //{{{2
+ // <long long, 4>, <long long, 4>, <long long, 4>, <long long, 4> => <char, 16>
+ return __convert_x86<_To>(__lo128(__v0), __hi128(__v0), __lo128(__v1),
+ __hi128(__v1), __lo128(__v2), __hi128(__v2),
+ __lo128(__v3), __hi128(__v3));
+ }
+ else if constexpr (__i_to_i)
+ { // assert ISA {{{2
+ static_assert(__x_to_x || __have_avx2,
+ "integral conversions with ymm registers require AVX2");
+ static_assert(__have_avx512bw
+ || ((sizeof(_Tp) >= 4 || sizeof(__v0) < 64)
+ && (sizeof(_Up) >= 4 || sizeof(_To) < 64)),
+ "8/16-bit integers in zmm registers require AVX512BW");
+ static_assert((sizeof(__v0) < 64 && sizeof(_To) < 64) || __have_avx512f,
+ "integral conversions with zmm registers require AVX512F");
+ }
+ // concat => use 2-arg __convert_x86 {{{2
+ if constexpr ((sizeof(__v0) == 16 && __have_avx2)
+ || (sizeof(__v0) == 16 && __have_avx
+ && std::is_floating_point_v<_Tp>)
+ || (sizeof(__v0) == 32 && __have_avx512f))
+ {
+ // The ISA can handle wider input registers, so concat and use two-arg
+ // implementation. This reduces code duplication considerably.
+ return __convert_x86<_To>(__concat(__v0, __v1), __concat(__v2, __v3));
+ }
+ else
+ { //{{{2
+ // conversion using bit reinterpretation (or no conversion at all) should
+ // all go through the concat branch above:
+ static_assert(
+ !(std::is_floating_point_v<_Tp> == std::is_floating_point_v<_Up>
+ && sizeof(_Tp) == sizeof(_Up)));
+ if constexpr (4 * _Np < _M && sizeof(_To) > 16)
+ { // handle all zero extension {{{2
+ constexpr size_t Min = 16 / sizeof(_Up);
+ return __zero_extend(
+ __convert_x86<
+ __vector_type_t<_Up, (Min > 4 * _Np) ? Min : 4 * _Np>>(__v0, __v1,
+ __v2,
+ __v3));
+ }
+ else if constexpr (__i64_to_i16)
+ { //{{{2
+ if constexpr (__x_to_x && __have_sse4_1)
+ {
+ return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+ _mm_blend_epi16(_mm_blend_epi16(__i0, _mm_slli_si128(__i1, 2),
+ 0x22),
+ _mm_blend_epi16(_mm_slli_si128(__i2, 4),
+ _mm_slli_si128(__i3, 6), 0x88),
+ 0xcc),
+ _mm_setr_epi8(0, 1, 8, 9, 2, 3, 10, 11, 4, 5, 12, 13, 6, 7, 14,
+ 15)));
+ }
+ else if constexpr (__y_to_y && __have_avx2)
+ {
+ return __intrin_bitcast<_To>(_mm256_shuffle_epi8(
+ __xzyw(_mm256_blend_epi16(
+ __auto_bitcast(
+ _mm256_shuffle_ps(__vector_bitcast<float>(__v0),
+ __vector_bitcast<float>(__v2),
+ 0x88)), // 0.1. 8.9. 2.3. A.B.
+ __to_intrin(__vector_bitcast<int>(_mm256_shuffle_ps(
+ __vector_bitcast<float>(__v1),
+ __vector_bitcast<float>(__v3), 0x88))
+ << 16), // .4.5 .C.D .6.7 .E.F
+ 0xaa) // 0415 8C9D 2637 AEBF
+ ), // 0415 2637 8C9D AEBF
+ _mm256_setr_epi8(0, 1, 4, 5, 8, 9, 12, 13, 2, 3, 6, 7, 10, 11,
+ 14, 15, 0, 1, 4, 5, 8, 9, 12, 13, 2, 3, 6, 7,
+ 10, 11, 14, 15)));
+ /*
+ auto __a = _mm256_unpacklo_epi16(__v0, __v1); // 04.. .... 26.. ....
+ auto __b = _mm256_unpackhi_epi16(__v0, __v1); // 15.. .... 37.. ....
+ auto __c = _mm256_unpacklo_epi16(__v2, __v3); // 8C.. .... AE.. ....
+ auto __d = _mm256_unpackhi_epi16(__v2, __v3); // 9D.. .... BF.. ....
+ auto __e = _mm256_unpacklo_epi16(__a, __b); // 0145 .... 2367 ....
+ auto __f = _mm256_unpacklo_epi16(__c, __d); // 89CD .... ABEF ....
+ auto __g = _mm256_unpacklo_epi64(__e, __f); // 0145 89CD 2367 ABEF
+ return __concat(
+ _mm_unpacklo_epi32(__lo128(__g), __hi128(__g)),
+ _mm_unpackhi_epi32(__lo128(__g), __hi128(__g)));
+ // 0123 4567 89AB CDEF
+ */
+ } // else use fallback
+ }
+ else if constexpr (__i64_to_i8)
+ { //{{{2
+ if constexpr (__x_to_x)
+ {
+ // TODO: use fallback for now
+ }
+ else if constexpr (__y_to_x)
+ {
+ auto __a = _mm256_srli_epi32(_mm256_slli_epi32(__i0, 24), 24)
+ | _mm256_srli_epi32(_mm256_slli_epi32(__i1, 24), 16)
+ | _mm256_srli_epi32(_mm256_slli_epi32(__i2, 24), 8)
+ | _mm256_slli_epi32(
+ __i3, 24); // 048C .... 159D .... 26AE .... 37BF ....
+ /*return _mm_shuffle_epi8(
+ _mm_blend_epi32(__lo128(__a) << 32, __hi128(__a), 0x5),
+ _mm_setr_epi8(4, 12, 0, 8, 5, 13, 1, 9, 6, 14, 2, 10, 7, 15,
+ 3, 11));*/
+ auto __b = _mm256_unpackhi_epi64(
+ __a, __a); // 159D .... 159D .... 37BF .... 37BF ....
+ auto __c = _mm256_unpacklo_epi8(
+ __a, __b); // 0145 89CD .... .... 2367 ABEF .... ....
+ return __intrin_bitcast<_To>(
+ _mm_unpacklo_epi16(__lo128(__c),
+ __hi128(__c))); // 0123 4567 89AB CDEF
+ }
+ }
+ else if constexpr (__i32_to_i8)
+ { //{{{2
+ if constexpr (__x_to_x)
+ {
+ if constexpr (__have_ssse3)
+ {
+ const auto __x0 = __vector_bitcast<_UInt>(__v0) & 0xff;
+ const auto __x1 = (__vector_bitcast<_UInt>(__v1) & 0xff) << 8;
+ const auto __x2 = (__vector_bitcast<_UInt>(__v2) & 0xff)
+ << 16;
+ const auto __x3 = __vector_bitcast<_UInt>(__v3) << 24;
+ return __intrin_bitcast<_To>(
+ _mm_shuffle_epi8(__to_intrin(__x0 | __x1 | __x2 | __x3),
+ _mm_setr_epi8(0, 4, 8, 12, 1, 5, 9, 13, 2,
+ 6, 10, 14, 3, 7, 11, 15)));
+ }
+ else
+ {
+ auto __a
+ = _mm_unpacklo_epi8(__i0, __i2); // 08.. .... 19.. ....
+ auto __b
+ = _mm_unpackhi_epi8(__i0, __i2); // 2A.. .... 3B.. ....
+ auto __c
+ = _mm_unpacklo_epi8(__i1, __i3); // 4C.. .... 5D.. ....
+ auto __d
+ = _mm_unpackhi_epi8(__i1, __i3); // 6E.. .... 7F.. ....
+ auto __e = _mm_unpacklo_epi8(__a, __c); // 048C .... .... ....
+ auto __f = _mm_unpackhi_epi8(__a, __c); // 159D .... .... ....
+ auto __g = _mm_unpacklo_epi8(__b, __d); // 26AE .... .... ....
+ auto __h = _mm_unpackhi_epi8(__b, __d); // 37BF .... .... ....
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi8(
+ _mm_unpacklo_epi8(__e, __g), // 0246 8ACE .... ....
+ _mm_unpacklo_epi8(__f, __h) // 1357 9BDF .... ....
+ )); // 0123 4567 89AB CDEF
+ }
+ }
+ else if constexpr (__y_to_y)
+ {
+ const auto __a = _mm256_shuffle_epi8(
+ __to_intrin((__vector_bitcast<_UShort>(_mm256_blend_epi16(
+ __i0, _mm256_slli_epi32(__i1, 16), 0xAA))
+ & 0xff)
+ | (__vector_bitcast<_UShort>(_mm256_blend_epi16(
+ __i2, _mm256_slli_epi32(__i3, 16), 0xAA))
+ << 8)),
+ _mm256_setr_epi8(0, 4, 8, 12, 2, 6, 10, 14, 1, 5, 9, 13, 3, 7,
+ 11, 15, 0, 4, 8, 12, 2, 6, 10, 14, 1, 5, 9, 13,
+ 3, 7, 11, 15));
+ return __intrin_bitcast<_To>(_mm256_permutevar8x32_epi32(
+ __a, _mm256_setr_epi32(0, 4, 1, 5, 2, 6, 3, 7)));
+ }
+ }
+ else if constexpr (__i64_to_f32)
+ { //{{{2
+ // this branch is only relevant with AVX and w/o AVX2 (i.e. no ymm
+ // integers)
+ if constexpr (__x_to_y)
+ {
+ return __make_wrapper<float>(__v0[0], __v0[1], __v1[0], __v1[1],
+ __v2[0], __v2[1], __v3[0], __v3[1]);
+ // NOTE: the remainder of this branch is unreachable after the return above
+ const auto __a = _mm_unpacklo_epi32(__i0, __i1); // acAC
+ const auto __b = _mm_unpackhi_epi32(__i0, __i1); // bdBD
+ const auto __c = _mm_unpacklo_epi32(__i2, __i3); // egEG
+ const auto __d = _mm_unpackhi_epi32(__i2, __i3); // fhFH
+ const auto __lo32a = _mm_unpacklo_epi32(__a, __b); // abcd
+ const auto __lo32b = _mm_unpacklo_epi32(__c, __d); // efgh
+ const auto __hi32
+ = __vector_bitcast<conditional_t<is_signed_v<_Tp>, int, _UInt>>(
+ __concat(_mm_unpackhi_epi32(__a, __b),
+ _mm_unpackhi_epi32(__c, __d))); // ABCD EFGH
+ const auto __hi
+ = 0x100000000LL
+ * __convert_x86<__vector_type_t<float, 8>>(__hi32);
+ const auto __mid
+ = 0x10000
+ * _mm256_cvtepi32_ps(__concat(_mm_srli_epi32(__lo32a, 16),
+ _mm_srli_epi32(__lo32b, 16)));
+ const auto __lo = _mm256_cvtepi32_ps(
+ __concat(_mm_set1_epi32(0x0000ffffu) & __lo32a,
+ _mm_set1_epi32(0x0000ffffu) & __lo32b));
+ return (__hi + __mid) + __lo;
+ }
+ }
+ else if constexpr (__f64_to_ibw)
+ { //{{{2
+ return __convert_x86<_To>(
+ __convert_x86<__vector_type_t<int, _Np * 2>>(__v0, __v1),
+ __convert_x86<__vector_type_t<int, _Np * 2>>(__v2, __v3));
+ }
+ else if constexpr (__f32_to_ibw)
+ { //{{{2
+ return __convert_x86<_To>(
+ __convert_x86<__vector_type_t<int, _Np>>(__v0),
+ __convert_x86<__vector_type_t<int, _Np>>(__v1),
+ __convert_x86<__vector_type_t<int, _Np>>(__v2),
+ __convert_x86<__vector_type_t<int, _Np>>(__v3));
+ } //}}}
+
+ // fallback: {{{2
+ if constexpr (sizeof(_To) >= 32)
+ // if _To is ymm or zmm, then __vector_type_t<_Up, _M / 2> is xmm or ymm
+ return __concat(__convert_x86<__vector_type_t<_Up, _M / 2>>(__v0, __v1),
+ __convert_x86<__vector_type_t<_Up, _M / 2>>(__v2,
+ __v3));
+ else if constexpr (sizeof(_To) == 16)
+ {
+ const auto __lo = __to_intrin(__convert_x86<_To>(__v0, __v1));
+ const auto __hi = __to_intrin(__convert_x86<_To>(__v2, __v3));
+ if constexpr (sizeof(_Up) * _Np * 2 == 8)
+ {
+ if constexpr (is_floating_point_v<_Up>)
+ return __auto_bitcast(_mm_unpacklo_pd(__lo, __hi));
+ else
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi64(__lo, __hi));
+ }
+ else if constexpr (sizeof(_Up) * _Np * 2 == 4)
+ {
+ if constexpr (is_floating_point_v<_Up>)
+ return __auto_bitcast(_mm_unpacklo_ps(__lo, __hi));
+ else
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi32(__lo, __hi));
+ }
+ else
+ __assert_unreachable<_Tp>();
+ }
+ else
+ return __vector_convert<_To>(__v0, __v1, __v2, __v3,
+ make_index_sequence<_Np>());
+ //}}}2
+ }
+} //}}}
+// 8-arg __convert_x86 {{{1
+template <typename _To, typename _V, typename _Traits>
+_GLIBCXX_SIMD_INTRINSIC _To
+__convert_x86(_V __v0, _V __v1, _V __v2, _V __v3, _V __v4, _V __v5, _V __v6,
+ _V __v7)
+{
+ static_assert(__is_vector_type_v<_V>);
+ using _Tp = typename _Traits::value_type;
+ constexpr size_t _Np = _Traits::_S_width;
+ [[maybe_unused]] const auto __i0 = __to_intrin(__v0);
+ [[maybe_unused]] const auto __i1 = __to_intrin(__v1);
+ [[maybe_unused]] const auto __i2 = __to_intrin(__v2);
+ [[maybe_unused]] const auto __i3 = __to_intrin(__v3);
+ [[maybe_unused]] const auto __i4 = __to_intrin(__v4);
+ [[maybe_unused]] const auto __i5 = __to_intrin(__v5);
+ [[maybe_unused]] const auto __i6 = __to_intrin(__v6);
+ [[maybe_unused]] const auto __i7 = __to_intrin(__v7);
+ using _Up = typename _VectorTraits<_To>::value_type;
+ constexpr size_t _M = _VectorTraits<_To>::_S_width;
+
+ static_assert(8 * _Np <= _M,
+ "__v4-__v7 would be discarded; use the four/two/one-argument "
+ "__convert_x86 overload instead");
+
+ // [xyz]_to_[xyz] {{{2
+ [[maybe_unused]] constexpr bool __x_to_x
+ = sizeof(__v0) <= 16 && sizeof(_To) <= 16;
+ [[maybe_unused]] constexpr bool __x_to_y
+ = sizeof(__v0) <= 16 && sizeof(_To) == 32;
+ [[maybe_unused]] constexpr bool __x_to_z
+ = sizeof(__v0) <= 16 && sizeof(_To) == 64;
+ [[maybe_unused]] constexpr bool __y_to_x
+ = sizeof(__v0) == 32 && sizeof(_To) <= 16;
+ [[maybe_unused]] constexpr bool __y_to_y
+ = sizeof(__v0) == 32 && sizeof(_To) == 32;
+ [[maybe_unused]] constexpr bool __y_to_z
+ = sizeof(__v0) == 32 && sizeof(_To) == 64;
+ [[maybe_unused]] constexpr bool __z_to_x
+ = sizeof(__v0) == 64 && sizeof(_To) <= 16;
+ [[maybe_unused]] constexpr bool __z_to_y
+ = sizeof(__v0) == 64 && sizeof(_To) == 32;
+ [[maybe_unused]] constexpr bool __z_to_z
+ = sizeof(__v0) == 64 && sizeof(_To) == 64;
+
+ // [if]X_to_i8 {{{2
+ [[maybe_unused]] constexpr bool __i_to_i
+ = std::is_integral_v<_Up> && std::is_integral_v<_Tp>;
+ [[maybe_unused]] constexpr bool __i64_to_i8
+ = __i_to_i && sizeof(_Tp) == 8 && sizeof(_Up) == 1;
+ [[maybe_unused]] constexpr bool __f64_to_i8
+ = is_integral_v<_Up> && sizeof(_Up) == 1
+ && is_floating_point_v<_Tp> && sizeof(_Tp) == 8;
+
+ if constexpr (__i_to_i) // assert ISA {{{2
+ {
+ static_assert(__x_to_x || __have_avx2,
+ "integral conversions with ymm registers require AVX2");
+ static_assert(__have_avx512bw
+ || ((sizeof(_Tp) >= 4 || sizeof(__v0) < 64)
+ && (sizeof(_Up) >= 4 || sizeof(_To) < 64)),
+ "8/16-bit integers in zmm registers require AVX512BW");
+ static_assert((sizeof(__v0) < 64 && sizeof(_To) < 64) || __have_avx512f,
+ "integral conversions with zmm registers require AVX512F");
+ }
+ // concat => use 4-arg __convert_x86 {{{2
+ if constexpr ((sizeof(__v0) == 16 && __have_avx2)
+ || (sizeof(__v0) == 16 && __have_avx
+ && std::is_floating_point_v<_Tp>)
+ || (sizeof(__v0) == 32 && __have_avx512f))
+ {
+ // The ISA can handle wider input registers, so concat and use four-arg
+ // implementation. This reduces code duplication considerably.
+ return __convert_x86<_To>(__concat(__v0, __v1), __concat(__v2, __v3),
+ __concat(__v4, __v5), __concat(__v6, __v7));
+ }
+ else //{{{2
+ {
+ // conversion using bit reinterpretation (or no conversion at all) should
+ // all go through the concat branch above:
+ static_assert(
+ !(std::is_floating_point_v<_Tp> == std::is_floating_point_v<_Up>
+ && sizeof(_Tp) == sizeof(_Up)));
+ static_assert(!(8 * _Np < _M && sizeof(_To) > 16),
+ "zero extension should be impossible");
+ if constexpr (__i64_to_i8) //{{{2
+ {
+ if constexpr (__x_to_x && __have_ssse3)
+ {
+ // unsure whether this is better than the variant below
+ return __intrin_bitcast<_To>(_mm_shuffle_epi8(
+ __to_intrin((((__v0 & 0xff) | ((__v1 & 0xff) << 8))
+ | (((__v2 & 0xff) << 16) | ((__v3 & 0xff) << 24)))
+ | ((((__v4 & 0xff) << 32) | ((__v5 & 0xff) << 40))
+ | (((__v6 & 0xff) << 48) | (__v7 << 56)))),
+ _mm_setr_epi8(0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7,
+ 15)));
+ }
+ else if constexpr (__x_to_x)
+ {
+ const auto __a = _mm_unpacklo_epi8(__i0, __i1); // ac
+ const auto __b = _mm_unpackhi_epi8(__i0, __i1); // bd
+ const auto __c = _mm_unpacklo_epi8(__i2, __i3); // eg
+ const auto __d = _mm_unpackhi_epi8(__i2, __i3); // fh
+ const auto __e = _mm_unpacklo_epi8(__i4, __i5); // ik
+ const auto __f = _mm_unpackhi_epi8(__i4, __i5); // jl
+ const auto __g = _mm_unpacklo_epi8(__i6, __i7); // mo
+ const auto __h = _mm_unpackhi_epi8(__i6, __i7); // np
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi64(
+ _mm_unpacklo_epi32(_mm_unpacklo_epi8(__a, __b), // abcd
+ _mm_unpacklo_epi8(__c, __d)), // efgh
+ _mm_unpacklo_epi32(_mm_unpacklo_epi8(__e, __f), // ijkl
+ _mm_unpacklo_epi8(__g, __h)) // mnop
+ ));
+ }
+ else if constexpr (__y_to_y)
+ {
+ auto __a = // 048C GKOS 159D HLPT 26AE IMQU 37BF JNRV
+ __to_intrin((((__v0 & 0xff) | ((__v1 & 0xff) << 8))
+ | (((__v2 & 0xff) << 16) | ((__v3 & 0xff) << 24)))
+ | ((((__v4 & 0xff) << 32) | ((__v5 & 0xff) << 40))
+ | (((__v6 & 0xff) << 48) | ((__v7 << 56)))));
+ /*
+ auto __b = _mm256_unpackhi_epi64(__a, __a);
+ // 159D HLPT 159D HLPT 37BF JNRV 37BF JNRV
+ auto __c = _mm256_unpacklo_epi8(__a, __b);
+ // 0145 89CD GHKL OPST 2367 ABEF IJMN QRUV
+ auto __d = __xzyw(__c); // 0145 89CD 2367 ABEF GHKL OPST IJMN QRUV
+ return _mm256_shuffle_epi8(
+ __d, _mm256_setr_epi8(0, 1, 8, 9, 2, 3, 10, 11, 4, 5, 12, 13, 6, 7, 14, 15,
+ 0, 1, 8, 9, 2, 3, 10, 11, 4, 5, 12, 13, 6, 7, 14, 15));
+ */
+ auto __b = _mm256_shuffle_epi8( // 0145 89CD GHKL OPST 2367 ABEF
+ // IJMN QRUV
+ __a, _mm256_setr_epi8(0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6,
+ 14, 7, 15, 0, 8, 1, 9, 2, 10, 3, 11, 4,
+ 12, 5, 13, 6, 14, 7, 15));
+ auto __c = __xzyw(__b); // 0145 89CD 2367 ABEF GHKL OPST IJMN QRUV
+ return __intrin_bitcast<_To>(_mm256_shuffle_epi8(
+ __c, _mm256_setr_epi8(0, 1, 8, 9, 2, 3, 10, 11, 4, 5, 12, 13, 6,
+ 7, 14, 15, 0, 1, 8, 9, 2, 3, 10, 11, 4, 5,
+ 12, 13, 6, 7, 14, 15)));
+ }
+ else if constexpr (__z_to_z)
+ {
+ return __concat(
+ __convert_x86<__vector_type_t<_Up, _M / 2>>(__v0, __v1, __v2,
+ __v3),
+ __convert_x86<__vector_type_t<_Up, _M / 2>>(__v4, __v5, __v6,
+ __v7));
+ }
+ }
+ else if constexpr (__f64_to_i8) //{{{2
+ {
+ return __convert_x86<_To>(
+ __convert_x86<__vector_type_t<int, _Np * 2>>(__v0, __v1),
+ __convert_x86<__vector_type_t<int, _Np * 2>>(__v2, __v3),
+ __convert_x86<__vector_type_t<int, _Np * 2>>(__v4, __v5),
+ __convert_x86<__vector_type_t<int, _Np * 2>>(__v6, __v7));
+ }
+ else // unreachable {{{2
+ __assert_unreachable<_Tp>();
+ //}}}
+
+ // fallback: {{{2
+ if constexpr (sizeof(_To) >= 32)
+ // if _To is ymm or zmm, then __vector_type_t<_Up, _M / 2> is xmm or ymm
+ return __concat(
+ __convert_x86<__vector_type_t<_Up, _M / 2>>(__v0, __v1, __v2, __v3),
+ __convert_x86<__vector_type_t<_Up, _M / 2>>(__v4, __v5, __v6, __v7));
+ else if constexpr (sizeof(_To) == 16)
+ {
+ const auto __lo
+ = __to_intrin(__convert_x86<_To>(__v0, __v1, __v2, __v3));
+ const auto __hi
+ = __to_intrin(__convert_x86<_To>(__v4, __v5, __v6, __v7));
+ static_assert(sizeof(_Up) == 1 && _Np == 2);
+ return __intrin_bitcast<_To>(_mm_unpacklo_epi64(__lo, __hi));
+ }
+ else
+ {
+ __assert_unreachable<_Tp>();
+ // return __vector_convert<_To>(__v0, __v1, __v2, __v3, __v4, __v5,
+ // __v6, __v7,
+ // make_index_sequence<_Np>());
+ } //}}}2
+ }
+} //}}}
+// 16-arg __convert_x86 {{{1
+template <typename _To, typename _V, typename _Traits>
+_GLIBCXX_SIMD_INTRINSIC _To
+__convert_x86(_V __v0, _V __v1, _V __v2, _V __v3, _V __v4, _V __v5, _V __v6,
+ _V __v7, _V __v8, _V __v9, _V __v10, _V __v11, _V __v12, _V __v13,
+ _V __v14, _V __v15)
+{
+ // concat => use 8-arg __convert_x86 {{{2
+ return __convert_x86<_To>(__concat(__v0, __v1), __concat(__v2, __v3),
+ __concat(__v4, __v5), __concat(__v6, __v7),
+ __concat(__v8, __v9), __concat(__v10, __v11),
+ __concat(__v12, __v13), __concat(__v14, __v15));
+} //}}}
+
+#endif // __cplusplus >= 201703L
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD_X86_CONVERSIONS_H
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/include/experimental/simd b/libstdc++-v3/include/experimental/simd
new file mode 100644
index 00000000000..cb875bd0e40
--- /dev/null
+++ b/libstdc++-v3/include/experimental/simd
@@ -0,0 +1,66 @@
+// Components for element-wise operations on data-parallel objects -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+// <http://www.gnu.org/licenses/>.
+
+/** @file experimental/simd
+ * This is a TS C++ Library header.
+ */
+
+//
+// N4773 §9 data-parallel types library
+//
+
+#ifndef _GLIBCXX_EXPERIMENTAL_SIMD
+#define _GLIBCXX_EXPERIMENTAL_SIMD
+
+#define __cpp_lib_experimental_parallel_simd 201803
+
+#pragma GCC diagnostic push
+// Many [[gnu::vector_size(N)]] types might lead to a -Wpsabi warning which is
+// irrelevant as those functions never appear on ABI borders
+#pragma GCC diagnostic ignored "-Wpsabi"
+
+// If __OPTIMIZE__ is not defined some intrinsics are defined as macros, making
+// use of C casts internally. This requires us to disable the warning as it
+// would otherwise yield many false positives.
+#ifndef __OPTIMIZE__
+#pragma GCC diagnostic ignored "-Wold-style-cast"
+#endif
+
+#include "bits/simd_detail.h"
+#include "bits/simd.h"
+#include "bits/simd_fixed_size.h"
+#include "bits/simd_scalar.h"
+#include "bits/simd_builtin.h"
+#include "bits/simd_converter.h"
+#if _GLIBCXX_SIMD_X86INTRIN
+#include "bits/simd_x86.h"
+#elif _GLIBCXX_SIMD_HAVE_NEON
+#include "bits/simd_neon.h"
+#endif
+#include "bits/simd_math.h"
+
+#pragma GCC diagnostic pop
+
+#endif // _GLIBCXX_EXPERIMENTAL_SIMD
+// vim: ft=cpp
diff --git a/libstdc++-v3/testsuite/Makefile.am b/libstdc++-v3/testsuite/Makefile.am
index e19509d2534..9cef1e65e1b 100644
--- a/libstdc++-v3/testsuite/Makefile.am
+++ b/libstdc++-v3/testsuite/Makefile.am
@@ -47,6 +47,7 @@ site.exp: Makefile
@echo '## these variables are automatically generated by make ##' >site.tmp
@echo '# Do not edit here. If you wish to override these values' >>site.tmp
@echo '# edit the last section' >>site.tmp
+ @echo 'set tool libstdc++' >>site.tmp
@echo 'set srcdir $(srcdir)' >>site.tmp
@echo "set objdir `pwd`" >>site.tmp
@echo 'set build_alias "$(build_alias)"' >>site.tmp
@@ -55,7 +56,6 @@ site.exp: Makefile
@echo 'set host_triplet $(host_triplet)' >>site.tmp
@echo 'set target_alias "$(target_alias)"' >>site.tmp
@echo 'set target_triplet $(target_triplet)' >>site.tmp
- @echo 'set target_triplet $(target_triplet)' >>site.tmp
@echo 'set libiconv "$(LIBICONV)"' >>site.tmp
@echo 'set baseline_dir "$(baseline_dir)"' >> site.tmp
@echo 'set baseline_subdir_switch "$(baseline_subdir_switch)"' >> site.tmp
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char-constexpr.cc
new file mode 100644
index 00000000000..ffff65ee130
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char-fixed_size.cc
new file mode 100644
index 00000000000..f8dd7d4ef82
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char.cc
new file mode 100644
index 00000000000..8b37d82caaa
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char16_t-constexpr.cc
new file mode 100644
index 00000000000..4c11f64ea4f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..ef375ce9451
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char16_t.cc
new file mode 100644
index 00000000000..6618460cc38
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char32_t-constexpr.cc
new file mode 100644
index 00000000000..e6c5ba261f3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..9b95c98421c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/abs-char32_t.cc
new file mode 100644
index 00000000000..47e8fc78e90
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-double-constexpr.cc
new file mode 100644
index 00000000000..4adce678c0f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-double-fixed_size.cc
new file mode 100644
index 00000000000..25bb1f8cb24
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-double.cc b/libstdc++-v3/testsuite/experimental/simd/abs-double.cc
new file mode 100644
index 00000000000..302389530c1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-float-constexpr.cc
new file mode 100644
index 00000000000..317d2b7db52
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-float-fixed_size.cc
new file mode 100644
index 00000000000..e69ea67bc71
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-float.cc b/libstdc++-v3/testsuite/experimental/simd/abs-float.cc
new file mode 100644
index 00000000000..2ba2178d2b0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-int-constexpr.cc
new file mode 100644
index 00000000000..b515b5cb4a1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-int-fixed_size.cc
new file mode 100644
index 00000000000..c41eeb52641
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-int.cc b/libstdc++-v3/testsuite/experimental/simd/abs-int.cc
new file mode 100644
index 00000000000..7299e4af93d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long-constexpr.cc
new file mode 100644
index 00000000000..5d3ac0a8217
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long-fixed_size.cc
new file mode 100644
index 00000000000..a5f27e384e7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long.cc
new file mode 100644
index 00000000000..64719277b7c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long_double-constexpr.cc
new file mode 100644
index 00000000000..bdd51845a38
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long_double-fixed_size.cc
new file mode 100644
index 00000000000..0454574b3db
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long_double.cc
new file mode 100644
index 00000000000..d18f45f8a45
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long_long-constexpr.cc
new file mode 100644
index 00000000000..736d0005a68
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long_long-fixed_size.cc
new file mode 100644
index 00000000000..2fdcfbb077c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/abs-long_long.cc
new file mode 100644
index 00000000000..f1dfe10f33f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-short-constexpr.cc
new file mode 100644
index 00000000000..5b95e5def75
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-short-fixed_size.cc
new file mode 100644
index 00000000000..a09d81da561
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-short.cc b/libstdc++-v3/testsuite/experimental/simd/abs-short.cc
new file mode 100644
index 00000000000..d772aea85e2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-signed_char-constexpr.cc
new file mode 100644
index 00000000000..e343396280b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..43146b7b6d5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/abs-signed_char.cc
new file mode 100644
index 00000000000..bfd89fd5a96
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..8d2eba06b13
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..f3afe7b4548
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char.cc
new file mode 100644
index 00000000000..1b00d8bfdd2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..05354330541
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..6aeec8b356b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int.cc
new file mode 100644
index 00000000000..d3581287a0a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..547a3f77c71
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..9bd90e8ea7e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long.cc
new file mode 100644
index 00000000000..79c2fd3f739
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..0976cbe1082
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..ed7adc17ac1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long.cc
new file mode 100644
index 00000000000..60369fe7d23
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..e4078abd7e2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..5d24b6bd858
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short.cc
new file mode 100644
index 00000000000..b7bde5c7487
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..7adee468f76
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..006e31a4a9c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t.cc
new file mode 100644
index 00000000000..1b837f3f005
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/abs-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/abs.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char-constexpr.cc
new file mode 100644
index 00000000000..453e5f6c644
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char-fixed_size.cc
new file mode 100644
index 00000000000..5c6ec25040d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char.cc
new file mode 100644
index 00000000000..0b4b81155f6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t-constexpr.cc
new file mode 100644
index 00000000000..f946ac69bed
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..a04511a6cf5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t.cc
new file mode 100644
index 00000000000..3a8dcf9acd3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t-constexpr.cc
new file mode 100644
index 00000000000..ba1cd1b9cbd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..797d60f7822
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t.cc
new file mode 100644
index 00000000000..874e27d70a1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-double-constexpr.cc
new file mode 100644
index 00000000000..3f3e06b4e17
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-double-fixed_size.cc
new file mode 100644
index 00000000000..c8690e22f1b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-double.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-double.cc
new file mode 100644
index 00000000000..bf4accfc59b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-float-constexpr.cc
new file mode 100644
index 00000000000..5b97d8a5a67
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-float-fixed_size.cc
new file mode 100644
index 00000000000..6c88f4e5289
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-float.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-float.cc
new file mode 100644
index 00000000000..e4b0eec5742
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-int-constexpr.cc
new file mode 100644
index 00000000000..2e40f05da5a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-int-fixed_size.cc
new file mode 100644
index 00000000000..801fa146563
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-int.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-int.cc
new file mode 100644
index 00000000000..9e5a74e8d0c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long-constexpr.cc
new file mode 100644
index 00000000000..959e7689c99
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long-fixed_size.cc
new file mode 100644
index 00000000000..1d997953e63
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long.cc
new file mode 100644
index 00000000000..67f04518350
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double-constexpr.cc
new file mode 100644
index 00000000000..284c48f9399
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double-fixed_size.cc
new file mode 100644
index 00000000000..0c2973a88c8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double.cc
new file mode 100644
index 00000000000..649f5dc5d42
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long-constexpr.cc
new file mode 100644
index 00000000000..9f454a9bda5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long-fixed_size.cc
new file mode 100644
index 00000000000..d1295fbd4e4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long.cc
new file mode 100644
index 00000000000..7e8a3f91b23
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-short-constexpr.cc
new file mode 100644
index 00000000000..fcc2d52a097
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-short-fixed_size.cc
new file mode 100644
index 00000000000..92e3fb0bdeb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-short.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-short.cc
new file mode 100644
index 00000000000..e294906388c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char-constexpr.cc
new file mode 100644
index 00000000000..a02e310f606
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..51545f8960d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char.cc
new file mode 100644
index 00000000000..67bb7e9493e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..a71a1cee92e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..8c32dd2a4fc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char.cc
new file mode 100644
index 00000000000..ce4e416091b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..bbafb7a5fba
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..63b6e61e0ef
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int.cc
new file mode 100644
index 00000000000..8704ef8bc48
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..7279d391ec5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..2bbd1e3cd7e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long.cc
new file mode 100644
index 00000000000..2ec36d041cb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..579ee3cb787
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..eb216b1daf2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long.cc
new file mode 100644
index 00000000000..9d0502d3e80
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..ea8f8d9b68d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..fb8650d4ddd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short.cc
new file mode 100644
index 00000000000..e5d45f12a58
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..52b4b70fcda
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..51f485c17e4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t.cc
new file mode 100644
index 00000000000..e5df7bc35ab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/algorithms-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/algorithms.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char-constexpr.cc
new file mode 100644
index 00000000000..2441ead5416
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char-fixed_size.cc
new file mode 100644
index 00000000000..b6a8850f988
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char.cc
new file mode 100644
index 00000000000..a7bcdbf8c26
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t-constexpr.cc
new file mode 100644
index 00000000000..7030a405433
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..1124d75c645
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t.cc
new file mode 100644
index 00000000000..cdb827041cc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t-constexpr.cc
new file mode 100644
index 00000000000..7d585299fb5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..0a09d2a27d5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t.cc
new file mode 100644
index 00000000000..6d127fed41e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-double-constexpr.cc
new file mode 100644
index 00000000000..38acf53cd86
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-double-fixed_size.cc
new file mode 100644
index 00000000000..5a8480383c6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-double.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-double.cc
new file mode 100644
index 00000000000..8a258106dec
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-float-constexpr.cc
new file mode 100644
index 00000000000..02bd74edb45
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-float-fixed_size.cc
new file mode 100644
index 00000000000..b0326aebeba
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-float.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-float.cc
new file mode 100644
index 00000000000..210d01aeeec
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-int-constexpr.cc
new file mode 100644
index 00000000000..e810f8a379d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-int-fixed_size.cc
new file mode 100644
index 00000000000..2199cae9850
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-int.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-int.cc
new file mode 100644
index 00000000000..d3945085de0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long-constexpr.cc
new file mode 100644
index 00000000000..af48c1de8eb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long-fixed_size.cc
new file mode 100644
index 00000000000..90531aa4c85
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long.cc
new file mode 100644
index 00000000000..cb839c32c75
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double-constexpr.cc
new file mode 100644
index 00000000000..e9fdce9bb0e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double-fixed_size.cc
new file mode 100644
index 00000000000..a237b259b93
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double.cc
new file mode 100644
index 00000000000..537a3fcb4c9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long-constexpr.cc
new file mode 100644
index 00000000000..8c0ca3413a5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long-fixed_size.cc
new file mode 100644
index 00000000000..586e5380a5d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long.cc
new file mode 100644
index 00000000000..6af141ac126
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-short-constexpr.cc
new file mode 100644
index 00000000000..ab2f19dca8c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-short-fixed_size.cc
new file mode 100644
index 00000000000..1d71a7328f9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-short.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-short.cc
new file mode 100644
index 00000000000..2f3a937715c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char-constexpr.cc
new file mode 100644
index 00000000000..a7a65fc4869
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..58f30eac548
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char.cc
new file mode 100644
index 00000000000..40e707496c7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..23ffa001a7d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..1dbc5313a22
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char.cc
new file mode 100644
index 00000000000..b266c3d0900
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..114c4d28258
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..063559e6a9b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int.cc
new file mode 100644
index 00000000000..234eb95a65c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..86b2fce8258
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..148134d7baf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long.cc
new file mode 100644
index 00000000000..a30a6e76162
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..4f10a2ad029
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..e04cb2b23b3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long.cc
new file mode 100644
index 00000000000..73824beead2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..c87091d3151
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..0f886eacf2e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short.cc
new file mode 100644
index 00000000000..69d786bdca5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..94604182e00
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..45aaf4689fd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t.cc
new file mode 100644
index 00000000000..319243c448d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/broadcast-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/broadcast.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char-constexpr.cc
new file mode 100644
index 00000000000..378e2fb2df7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char-fixed_size.cc
new file mode 100644
index 00000000000..a098d2e3f4e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char.cc
new file mode 100644
index 00000000000..f64a0024052
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char16_t-constexpr.cc
new file mode 100644
index 00000000000..1722271157b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..bd8faab2002
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char16_t.cc
new file mode 100644
index 00000000000..bd1268c3612
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char32_t-constexpr.cc
new file mode 100644
index 00000000000..938dfbb75d8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..85951c4320f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/casts-char32_t.cc
new file mode 100644
index 00000000000..8ef1e7ce7e5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-double-constexpr.cc
new file mode 100644
index 00000000000..0cf0b9e5c79
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-double-fixed_size.cc
new file mode 100644
index 00000000000..8b7f0c9aed0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-double.cc b/libstdc++-v3/testsuite/experimental/simd/casts-double.cc
new file mode 100644
index 00000000000..41be646beeb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-float-constexpr.cc
new file mode 100644
index 00000000000..f7a4eff264e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-float-fixed_size.cc
new file mode 100644
index 00000000000..b854f481ab6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-float.cc b/libstdc++-v3/testsuite/experimental/simd/casts-float.cc
new file mode 100644
index 00000000000..f766426a834
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-int-constexpr.cc
new file mode 100644
index 00000000000..7851a637593
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-int-fixed_size.cc
new file mode 100644
index 00000000000..2cdfc6e91b1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-int.cc b/libstdc++-v3/testsuite/experimental/simd/casts-int.cc
new file mode 100644
index 00000000000..97d288508b8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long-constexpr.cc
new file mode 100644
index 00000000000..0cd85e095d9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long-fixed_size.cc
new file mode 100644
index 00000000000..9d43269a0f2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long.cc
new file mode 100644
index 00000000000..88cb7299fb7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long_double-constexpr.cc
new file mode 100644
index 00000000000..d2b2ed1fa29
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long_double-fixed_size.cc
new file mode 100644
index 00000000000..1705e0b6fbc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long_double.cc
new file mode 100644
index 00000000000..3ca613c7eab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long_long-constexpr.cc
new file mode 100644
index 00000000000..cc135964ab0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long_long-fixed_size.cc
new file mode 100644
index 00000000000..cdc5a31cee6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/casts-long_long.cc
new file mode 100644
index 00000000000..3a671406607
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-short-constexpr.cc
new file mode 100644
index 00000000000..13b56956836
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-short-fixed_size.cc
new file mode 100644
index 00000000000..fe52ac7db75
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-short.cc b/libstdc++-v3/testsuite/experimental/simd/casts-short.cc
new file mode 100644
index 00000000000..6e5dbe83784
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-signed_char-constexpr.cc
new file mode 100644
index 00000000000..a1598ca85bc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..27b4d595db1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/casts-signed_char.cc
new file mode 100644
index 00000000000..2c5a99941ab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..661dce00177
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..0bafc8a31fb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char.cc
new file mode 100644
index 00000000000..611642504cd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..5289792e403
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..00f1eb0d41b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int.cc
new file mode 100644
index 00000000000..f7b202c27ae
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..fabb1d94dd4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..36070941276
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long.cc
new file mode 100644
index 00000000000..cf44cbc2c11
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..fee0dc4b452
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..bc0d32f9fd8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long.cc
new file mode 100644
index 00000000000..5779b69f410
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..86462e6ebb5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..fc7c4a81b15
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short.cc
new file mode 100644
index 00000000000..fbef0ec25f0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..ea61dc67a2c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..a2f7f8eb820
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t.cc
new file mode 100644
index 00000000000..492bd52db7c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/casts-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/casts.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-double-constexpr.cc
new file mode 100644
index 00000000000..0c5b3955c4c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/fpclassify.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-double-fixed_size.cc
new file mode 100644
index 00000000000..69fc7e9e28f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/fpclassify.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-double.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-double.cc
new file mode 100644
index 00000000000..25693b2082c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/fpclassify.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-float-constexpr.cc
new file mode 100644
index 00000000000..cb2ce60a75c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/fpclassify.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-float-fixed_size.cc
new file mode 100644
index 00000000000..80ca4a6043f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/fpclassify.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-float.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-float.cc
new file mode 100644
index 00000000000..886a4d0c83b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/fpclassify.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double-constexpr.cc
new file mode 100644
index 00000000000..3d3578f974e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/fpclassify.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double-fixed_size.cc
new file mode 100644
index 00000000000..6848e5db58c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/fpclassify.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double.cc
new file mode 100644
index 00000000000..66eb5aaa7be
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/fpclassify-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/fpclassify.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-double-constexpr.cc
new file mode 100644
index 00000000000..e86618b52fb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/frexp.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-double-fixed_size.cc
new file mode 100644
index 00000000000..0cf0d6f6de6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/frexp.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-double.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-double.cc
new file mode 100644
index 00000000000..ebfc0f1b738
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/frexp.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-float-constexpr.cc
new file mode 100644
index 00000000000..7c5a9838a41
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/frexp.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-float-fixed_size.cc
new file mode 100644
index 00000000000..b2d82af9f16
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/frexp.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-float.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-float.cc
new file mode 100644
index 00000000000..584ea43afe3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/frexp.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-long_double-constexpr.cc
new file mode 100644
index 00000000000..00e2a6c9dbc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/frexp.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-long_double-fixed_size.cc
new file mode 100644
index 00000000000..abdcd9d5126
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/frexp.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/frexp-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/frexp-long_double.cc
new file mode 100644
index 00000000000..00a3b61b33d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/frexp-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/frexp.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generate_testcases.sh b/libstdc++-v3/testsuite/experimental/simd/generate_testcases.sh
new file mode 100755
index 00000000000..7acf17c7eed
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generate_testcases.sh
@@ -0,0 +1,81 @@
+#!/bin/bash
+
+floattypes=(
+"long double"
+"double"
+"float"
+)
+alltypes=(
+"${floattypes[@]}"
+"long long"
+"unsigned long long"
+"unsigned long"
+"long"
+"int"
+"unsigned int"
+"short"
+"unsigned short"
+"char"
+"signed char"
+"unsigned char"
+"char32_t"
+"char16_t"
+"wchar_t"
+)
+
+cd "${0%/*}"
+for testcase in tests/*.h; do
+ if grep -q "test only floattypes" "$testcase"; then
+ typelist=("${floattypes[@]}")
+ else
+ typelist=("${alltypes[@]}")
+ fi
+ testcase=${testcase%.h}
+ testcase=${testcase##*/}
+ for type in "${typelist[@]}"; do
+ if [[ $testcase == sincos ]]; then
+ # The sincos test requires reference data to run
+ extra='// { dg-do compile }'
+ else
+ extra=''
+ fi
+ filename="${testcase}-${type// /_}"
+
+ cat > "${filename}.cc" <<EOF
+// { dg-options "-std=c++17" }
+${extra}
+#include "tests/${testcase}.h"
+
+int main()
+{
+ iterate_abis<${type}>();
+ return 0;
+}
+EOF
+ cat > "${filename}-constexpr.cc" <<EOF
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+${extra}
+#include "tests/${testcase}.h"
+
+int main()
+{
+ iterate_abis<${type}>();
+ return 0;
+}
+EOF
+ cat > "${filename}-fixed_size.cc" <<EOF
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+${extra}
+#define TESTFIXEDSIZE 1
+#include "tests/${testcase}.h"
+
+int main()
+{
+ iterate_abis<${type}>();
+ return 0;
+}
+EOF
+ done
+done
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char-constexpr.cc
new file mode 100644
index 00000000000..dceb9ab29b3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char-fixed_size.cc
new file mode 100644
index 00000000000..f20cc441d74
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char.cc
new file mode 100644
index 00000000000..790e4e3636a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char16_t-constexpr.cc
new file mode 100644
index 00000000000..59ea27c0802
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..19fa325ed51
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char16_t.cc
new file mode 100644
index 00000000000..897ee1c7a88
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char32_t-constexpr.cc
new file mode 100644
index 00000000000..4db121300fb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..62b5cd6c29f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/generator-char32_t.cc
new file mode 100644
index 00000000000..2b04c8bda75
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-double-constexpr.cc
new file mode 100644
index 00000000000..de491f79875
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-double-fixed_size.cc
new file mode 100644
index 00000000000..e7af2ed7082
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-double.cc b/libstdc++-v3/testsuite/experimental/simd/generator-double.cc
new file mode 100644
index 00000000000..09ac4bdc33d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-float-constexpr.cc
new file mode 100644
index 00000000000..edabab7d3e8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-float-fixed_size.cc
new file mode 100644
index 00000000000..75d18751c02
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-float.cc b/libstdc++-v3/testsuite/experimental/simd/generator-float.cc
new file mode 100644
index 00000000000..40f44fae4d7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-int-constexpr.cc
new file mode 100644
index 00000000000..643a071d7c2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-int-fixed_size.cc
new file mode 100644
index 00000000000..acd38d02921
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-int.cc b/libstdc++-v3/testsuite/experimental/simd/generator-int.cc
new file mode 100644
index 00000000000..2166ba8d480
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long-constexpr.cc
new file mode 100644
index 00000000000..25b994c26a0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long-fixed_size.cc
new file mode 100644
index 00000000000..a2d5ecfce3c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long.cc
new file mode 100644
index 00000000000..9529bcc37ab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long_double-constexpr.cc
new file mode 100644
index 00000000000..f96beaa690a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long_double-fixed_size.cc
new file mode 100644
index 00000000000..e60f903b48e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long_double.cc
new file mode 100644
index 00000000000..dbb5cac8e6b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long_long-constexpr.cc
new file mode 100644
index 00000000000..e6b9f93fea7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long_long-fixed_size.cc
new file mode 100644
index 00000000000..cb23b21fcc4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/generator-long_long.cc
new file mode 100644
index 00000000000..b1d1de2a2f1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-short-constexpr.cc
new file mode 100644
index 00000000000..84d3314be24
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-short-fixed_size.cc
new file mode 100644
index 00000000000..44a6764f7e3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-short.cc b/libstdc++-v3/testsuite/experimental/simd/generator-short.cc
new file mode 100644
index 00000000000..5343657320f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-signed_char-constexpr.cc
new file mode 100644
index 00000000000..fd35555d54a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..bdca8349c33
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/generator-signed_char.cc
new file mode 100644
index 00000000000..0c1f5bb6118
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..6802c31a3f8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..d990de8de5b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char.cc
new file mode 100644
index 00000000000..2c4a0c57404
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..daba85f07ef
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..6bdbebcdd24
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int.cc
new file mode 100644
index 00000000000..fed7b58d6ab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..da209e2b894
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..ab20c3f87ac
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long.cc
new file mode 100644
index 00000000000..66b330f2d5f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..047ff571237
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..6c96a68f2b3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long.cc
new file mode 100644
index 00000000000..609e23f5df3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..b24d0d9a60a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..456ece81cdc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short.cc
new file mode 100644
index 00000000000..cc7f8c3d287
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..5cf9521b7c3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..4f77cfe7a91
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t.cc
new file mode 100644
index 00000000000..6c775fdd0e9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/generator-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/generator.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double-constexpr.cc
new file mode 100644
index 00000000000..bd6936cb40f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double-fixed_size.cc
new file mode 100644
index 00000000000..eba5aa120ae
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double.cc
new file mode 100644
index 00000000000..442cec265eb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float-constexpr.cc
new file mode 100644
index 00000000000..43fab5b1b8e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float-fixed_size.cc
new file mode 100644
index 00000000000..e933dc8aea4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float.cc
new file mode 100644
index 00000000000..24132704a26
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double-constexpr.cc
new file mode 100644
index 00000000000..658c8a2fb6d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double-fixed_size.cc
new file mode 100644
index 00000000000..afed35e475f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double.cc
new file mode 100644
index 00000000000..78cd653f795
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/hypot3_fma-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/hypot3_fma.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char-constexpr.cc
new file mode 100644
index 00000000000..c3c0bd70f9d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char-fixed_size.cc
new file mode 100644
index 00000000000..c934dac6e65
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char.cc
new file mode 100644
index 00000000000..02c2324be0e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t-constexpr.cc
new file mode 100644
index 00000000000..16cd5b3e477
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..56914e866f7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t.cc
new file mode 100644
index 00000000000..708c36f3dd0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t-constexpr.cc
new file mode 100644
index 00000000000..fbabea41c66
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..50676c64bdf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t.cc
new file mode 100644
index 00000000000..64d23ab4b8d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-double-constexpr.cc
new file mode 100644
index 00000000000..c80490ddfa9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-double-fixed_size.cc
new file mode 100644
index 00000000000..65717f6a449
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-double.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-double.cc
new file mode 100644
index 00000000000..9caf5aadf4f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-float-constexpr.cc
new file mode 100644
index 00000000000..4d57562fef6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-float-fixed_size.cc
new file mode 100644
index 00000000000..3b7dd998e24
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-float.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-float.cc
new file mode 100644
index 00000000000..9a5219fd89e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-int-constexpr.cc
new file mode 100644
index 00000000000..d829d8bc842
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-int-fixed_size.cc
new file mode 100644
index 00000000000..72e1647d920
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-int.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-int.cc
new file mode 100644
index 00000000000..61b1970c831
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long-constexpr.cc
new file mode 100644
index 00000000000..fb74cbec9a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long-fixed_size.cc
new file mode 100644
index 00000000000..6f1892f30c5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long.cc
new file mode 100644
index 00000000000..d2ae50b8800
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double-constexpr.cc
new file mode 100644
index 00000000000..42884f0f483
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double-fixed_size.cc
new file mode 100644
index 00000000000..a617c0a0a8c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double.cc
new file mode 100644
index 00000000000..67ed81b7001
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long-constexpr.cc
new file mode 100644
index 00000000000..521a4ef7a0e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long-fixed_size.cc
new file mode 100644
index 00000000000..232f0942d2c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long.cc
new file mode 100644
index 00000000000..4768c4fadba
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-short-constexpr.cc
new file mode 100644
index 00000000000..aab5b19a523
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-short-fixed_size.cc
new file mode 100644
index 00000000000..ec7ed1c31c0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-short.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-short.cc
new file mode 100644
index 00000000000..ba0d08ef1f4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char-constexpr.cc
new file mode 100644
index 00000000000..4cd49cc02f8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..9f2da6b998d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char.cc
new file mode 100644
index 00000000000..76491b7ae17
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..33781182de0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..3a896e64381
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char.cc
new file mode 100644
index 00000000000..d2d8a877ea9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..40720b069a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..64d9ea41c8d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int.cc
new file mode 100644
index 00000000000..b0e397ddb73
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..e78204203ba
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..e39d053246b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long.cc
new file mode 100644
index 00000000000..1776a81b8da
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..dc83a1403a9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..620cc2f9f71
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long.cc
new file mode 100644
index 00000000000..ab18b1fadb9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..70c79f359d7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..74c6cad8d64
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short.cc
new file mode 100644
index 00000000000..a1cc484382a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..abaf5fe0184
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..a1457acb5d7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t.cc
new file mode 100644
index 00000000000..cd8fe35af87
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/integer_operators-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/integer_operators.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double-constexpr.cc
new file mode 100644
index 00000000000..f30cc134fc7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double-fixed_size.cc
new file mode 100644
index 00000000000..026689bb872
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double.cc
new file mode 100644
index 00000000000..04e5c8dcf16
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float-constexpr.cc
new file mode 100644
index 00000000000..be858a25c76
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float-fixed_size.cc
new file mode 100644
index 00000000000..5eb7970cd25
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float.cc
new file mode 100644
index 00000000000..5d4b0905dee
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double-constexpr.cc
new file mode 100644
index 00000000000..ad6ab7a5f32
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double-fixed_size.cc
new file mode 100644
index 00000000000..dae783e3054
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double.cc
new file mode 100644
index 00000000000..292a093e014
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/ldexp_scalbn_scalbln_modf-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/ldexp_scalbn_scalbln_modf.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char-constexpr.cc
new file mode 100644
index 00000000000..8f8c86fc723
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char-fixed_size.cc
new file mode 100644
index 00000000000..558bcf5a1ad
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char.cc
new file mode 100644
index 00000000000..d54a5484b38
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t-constexpr.cc
new file mode 100644
index 00000000000..89734584aaf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..09afc92c291
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t.cc
new file mode 100644
index 00000000000..13e490d5298
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t-constexpr.cc
new file mode 100644
index 00000000000..d4bb463c8b1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..8f6c0dd2633
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t.cc
new file mode 100644
index 00000000000..8287f749609
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-double-constexpr.cc
new file mode 100644
index 00000000000..2d2cc758dd2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-double-fixed_size.cc
new file mode 100644
index 00000000000..48a884f810b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-double.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-double.cc
new file mode 100644
index 00000000000..ff23a245932
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-float-constexpr.cc
new file mode 100644
index 00000000000..6e9aaaf804f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-float-fixed_size.cc
new file mode 100644
index 00000000000..3f78361528e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-float.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-float.cc
new file mode 100644
index 00000000000..ad07cd19450
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-int-constexpr.cc
new file mode 100644
index 00000000000..7f02b0491d6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-int-fixed_size.cc
new file mode 100644
index 00000000000..b64a1dacdbe
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-int.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-int.cc
new file mode 100644
index 00000000000..c18e83e1a6a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long-constexpr.cc
new file mode 100644
index 00000000000..8e1614d6f9d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long-fixed_size.cc
new file mode 100644
index 00000000000..c5b44d2f4f6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long.cc
new file mode 100644
index 00000000000..bfc96cec5aa
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double-constexpr.cc
new file mode 100644
index 00000000000..b56af2f577d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double-fixed_size.cc
new file mode 100644
index 00000000000..312bf635926
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double.cc
new file mode 100644
index 00000000000..21a40601f0e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long-constexpr.cc
new file mode 100644
index 00000000000..0c894b52df5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long-fixed_size.cc
new file mode 100644
index 00000000000..3dd727183f7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long.cc
new file mode 100644
index 00000000000..5ce328f75ed
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-short-constexpr.cc
new file mode 100644
index 00000000000..d54bb34bbf5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-short-fixed_size.cc
new file mode 100644
index 00000000000..4bbc320dd39
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-short.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-short.cc
new file mode 100644
index 00000000000..7e478bd4089
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char-constexpr.cc
new file mode 100644
index 00000000000..c964496e454
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..bb40925621d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char.cc
new file mode 100644
index 00000000000..5a58e97a07f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..0b84e78b2d9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..38d08864098
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char.cc
new file mode 100644
index 00000000000..5c3e91efa2f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..a1f7bde48a9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..fcfb4f6fe78
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int.cc
new file mode 100644
index 00000000000..8326899f1f4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..c56a5c92e00
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..d13dd683603
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long.cc
new file mode 100644
index 00000000000..9415472ea27
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..3cf44d29ebb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..ce108525c2c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long.cc
new file mode 100644
index 00000000000..3d811ad94ae
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..800689b49f4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..503f674e3f3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short.cc
new file mode 100644
index 00000000000..8a33738354b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..521839044eb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..4b7188655b6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t.cc
new file mode 100644
index 00000000000..ebfeef00910
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/loadstore-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/loadstore.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-double-constexpr.cc
new file mode 100644
index 00000000000..fc6c1d68f24
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/logarithm.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-double-fixed_size.cc
new file mode 100644
index 00000000000..fcc3e0c688a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/logarithm.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-double.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-double.cc
new file mode 100644
index 00000000000..5806393f619
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/logarithm.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-float-constexpr.cc
new file mode 100644
index 00000000000..5429cd72deb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/logarithm.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-float-fixed_size.cc
new file mode 100644
index 00000000000..dd8ae7a43d0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/logarithm.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-float.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-float.cc
new file mode 100644
index 00000000000..abb45db813a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/logarithm.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double-constexpr.cc
new file mode 100644
index 00000000000..f59216fe260
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/logarithm.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double-fixed_size.cc
new file mode 100644
index 00000000000..143d020ea39
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/logarithm.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double.cc
new file mode 100644
index 00000000000..00fc1a42d7b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/logarithm-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/logarithm.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char-constexpr.cc
new file mode 100644
index 00000000000..f11e3d34e64
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char-fixed_size.cc
new file mode 100644
index 00000000000..6c50d321c12
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char.cc
new file mode 100644
index 00000000000..67da72a4f8e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t-constexpr.cc
new file mode 100644
index 00000000000..edaa5da5819
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..967f63be80c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t.cc
new file mode 100644
index 00000000000..fd96e76b041
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t-constexpr.cc
new file mode 100644
index 00000000000..cbc4bbe77ce
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..c7c5754a279
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t.cc
new file mode 100644
index 00000000000..7fb0082e9b7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double-constexpr.cc
new file mode 100644
index 00000000000..d1972a66933
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double-fixed_size.cc
new file mode 100644
index 00000000000..57ec0817e81
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double.cc
new file mode 100644
index 00000000000..fb14dc2a93e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float-constexpr.cc
new file mode 100644
index 00000000000..e2e57cfdd36
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float-fixed_size.cc
new file mode 100644
index 00000000000..d1fb0844d32
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float.cc
new file mode 100644
index 00000000000..815a421b6e3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int-constexpr.cc
new file mode 100644
index 00000000000..1f49e480fed
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int-fixed_size.cc
new file mode 100644
index 00000000000..ed73ab27b39
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int.cc
new file mode 100644
index 00000000000..4bbc4a8357b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long-constexpr.cc
new file mode 100644
index 00000000000..c8993db6266
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long-fixed_size.cc
new file mode 100644
index 00000000000..8f7237f7565
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long.cc
new file mode 100644
index 00000000000..dad171eb5e5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double-constexpr.cc
new file mode 100644
index 00000000000..c6976640608
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double-fixed_size.cc
new file mode 100644
index 00000000000..7e3b49eeaf3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double.cc
new file mode 100644
index 00000000000..87083517140
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long-constexpr.cc
new file mode 100644
index 00000000000..9786519ecdd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long-fixed_size.cc
new file mode 100644
index 00000000000..69f68155b2c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long.cc
new file mode 100644
index 00000000000..42a3e5fdd2a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short-constexpr.cc
new file mode 100644
index 00000000000..b3c457e7207
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short-fixed_size.cc
new file mode 100644
index 00000000000..75410a7e1bd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short.cc
new file mode 100644
index 00000000000..4dedc6f8394
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char-constexpr.cc
new file mode 100644
index 00000000000..66cdd5458fc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..9f6a5e66da6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char.cc
new file mode 100644
index 00000000000..231236be4dd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..297ee8aa460
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..8cb3566e533
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char.cc
new file mode 100644
index 00000000000..85fbd99081a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..8da8192a063
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..68fba792469
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int.cc
new file mode 100644
index 00000000000..de905c0fb5f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..9d3b1bc299d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..c1f89bc4831
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long.cc
new file mode 100644
index 00000000000..824f254f89a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..403deff73a0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..94e079c9032
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long.cc
new file mode 100644
index 00000000000..a0e8b14c11e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..f2187c85f5f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..e7c6695034f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short.cc
new file mode 100644
index 00000000000..97fb4819951
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..fde03ab1777
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..3076fe9967a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t.cc
new file mode 100644
index 00000000000..33cfb2796df
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_broadcast-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_broadcast.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char-constexpr.cc
new file mode 100644
index 00000000000..565a723bd10
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char-fixed_size.cc
new file mode 100644
index 00000000000..8b18da853f6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char.cc
new file mode 100644
index 00000000000..506e9daf930
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t-constexpr.cc
new file mode 100644
index 00000000000..e08ff6d8759
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..32b6c88b409
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t.cc
new file mode 100644
index 00000000000..1879792e37c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t-constexpr.cc
new file mode 100644
index 00000000000..63cb5c1efae
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..1ab9b0047d5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t.cc
new file mode 100644
index 00000000000..4bc48f38d1c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double-constexpr.cc
new file mode 100644
index 00000000000..a4b515ae6e5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double-fixed_size.cc
new file mode 100644
index 00000000000..b16b3618d79
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double.cc
new file mode 100644
index 00000000000..ad5f7d97d00
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float-constexpr.cc
new file mode 100644
index 00000000000..372eba52a7a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float-fixed_size.cc
new file mode 100644
index 00000000000..316bc781f94
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float.cc
new file mode 100644
index 00000000000..054b7de4cc1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int-constexpr.cc
new file mode 100644
index 00000000000..39dc1f063bd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int-fixed_size.cc
new file mode 100644
index 00000000000..ec4dbfddde4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int.cc
new file mode 100644
index 00000000000..337d73f2222
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long-constexpr.cc
new file mode 100644
index 00000000000..bdff42ea69c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long-fixed_size.cc
new file mode 100644
index 00000000000..cfd18904811
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long.cc
new file mode 100644
index 00000000000..29380dee2d4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double-constexpr.cc
new file mode 100644
index 00000000000..6438d2e128d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double-fixed_size.cc
new file mode 100644
index 00000000000..83538a716dd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double.cc
new file mode 100644
index 00000000000..8a5385eb018
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long-constexpr.cc
new file mode 100644
index 00000000000..cfe1a5863e9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long-fixed_size.cc
new file mode 100644
index 00000000000..c0d6f5d6e77
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long.cc
new file mode 100644
index 00000000000..a1a6ab41b10
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short-constexpr.cc
new file mode 100644
index 00000000000..16f59d2c10c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short-fixed_size.cc
new file mode 100644
index 00000000000..bc4c4c02d86
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short.cc
new file mode 100644
index 00000000000..eb5a05907dd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char-constexpr.cc
new file mode 100644
index 00000000000..52e468ad65d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..3dcf542c3e1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char.cc
new file mode 100644
index 00000000000..07f26470e31
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..c49e1400916
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..e3b9896aebf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char.cc
new file mode 100644
index 00000000000..93e302cfff7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..a811aebfe48
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..d80e9ec27a1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int.cc
new file mode 100644
index 00000000000..b54ead172a4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..64435a00af8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..e180ab25175
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long.cc
new file mode 100644
index 00000000000..913a8e39d93
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..6c0d786355f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..ea85ae5b532
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long.cc
new file mode 100644
index 00000000000..e2d96ad0775
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..ebf6efe8192
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..864241bf103
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short.cc
new file mode 100644
index 00000000000..b5f77babd16
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..d285f886712
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..c94ed844589
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t.cc
new file mode 100644
index 00000000000..58f6454f5eb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_conversions-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_conversions.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char-constexpr.cc
new file mode 100644
index 00000000000..c765ccee423
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char-fixed_size.cc
new file mode 100644
index 00000000000..19e870c48ae
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char.cc
new file mode 100644
index 00000000000..70b4150fdf1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t-constexpr.cc
new file mode 100644
index 00000000000..d9da969ca73
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..57bb37af5f6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t.cc
new file mode 100644
index 00000000000..ad5159bbabd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t-constexpr.cc
new file mode 100644
index 00000000000..c9ee39d0114
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..a30a9d4395f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t.cc
new file mode 100644
index 00000000000..42757de6ea8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double-constexpr.cc
new file mode 100644
index 00000000000..321441421ac
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double-fixed_size.cc
new file mode 100644
index 00000000000..bc5bfd9e141
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double.cc
new file mode 100644
index 00000000000..f1c85076865
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float-constexpr.cc
new file mode 100644
index 00000000000..0e547c1b56d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float-fixed_size.cc
new file mode 100644
index 00000000000..1465aa38b9a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float.cc
new file mode 100644
index 00000000000..7ab3b192531
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int-constexpr.cc
new file mode 100644
index 00000000000..54f158bb721
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int-fixed_size.cc
new file mode 100644
index 00000000000..174b7f9b1a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int.cc
new file mode 100644
index 00000000000..d0c2d723c8a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long-constexpr.cc
new file mode 100644
index 00000000000..a74018f4e32
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long-fixed_size.cc
new file mode 100644
index 00000000000..4353cdcca72
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long.cc
new file mode 100644
index 00000000000..ab3c4247929
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double-constexpr.cc
new file mode 100644
index 00000000000..2459c58a9d2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double-fixed_size.cc
new file mode 100644
index 00000000000..c29a31539a1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double.cc
new file mode 100644
index 00000000000..c0f1954e7b1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long-constexpr.cc
new file mode 100644
index 00000000000..033c316c5e7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long-fixed_size.cc
new file mode 100644
index 00000000000..12adce10f3b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long.cc
new file mode 100644
index 00000000000..508309fca1c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short-constexpr.cc
new file mode 100644
index 00000000000..91cdf1bfa2e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short-fixed_size.cc
new file mode 100644
index 00000000000..c520be67867
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short.cc
new file mode 100644
index 00000000000..35f230c112d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char-constexpr.cc
new file mode 100644
index 00000000000..94a5c86f13b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..2408ee12bc2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char.cc
new file mode 100644
index 00000000000..1de188a59bb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..6a502930aa7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..d24f52132c0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char.cc
new file mode 100644
index 00000000000..a6c215f9be6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..07a59faad93
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..827f8f20d3d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int.cc
new file mode 100644
index 00000000000..f55e5b31510
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..917328fff97
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..bc4e2c1ee1a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long.cc
new file mode 100644
index 00000000000..53c5a43538c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..9bb7d41b20f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..c837776083e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long.cc
new file mode 100644
index 00000000000..4b224cf2255
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..04d35b01f47
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..e2f0fd409b7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short.cc
new file mode 100644
index 00000000000..deda07a1170
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..e2dc688f3b3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..ceb192b6f0a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t.cc
new file mode 100644
index 00000000000..6c709aba793
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_implicit_cvt-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_implicit_cvt.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char-constexpr.cc
new file mode 100644
index 00000000000..a432ad4e2dc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char-fixed_size.cc
new file mode 100644
index 00000000000..ca36d61a0c8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char.cc
new file mode 100644
index 00000000000..25aec9fd2ba
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t-constexpr.cc
new file mode 100644
index 00000000000..9207f203e4a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..0d2d5bbf8cf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t.cc
new file mode 100644
index 00000000000..aa9e5cbe614
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t-constexpr.cc
new file mode 100644
index 00000000000..2f8767ac51f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..5a191a1d1f6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t.cc
new file mode 100644
index 00000000000..eb2e790b34f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double-constexpr.cc
new file mode 100644
index 00000000000..1f655c3b5b2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double-fixed_size.cc
new file mode 100644
index 00000000000..24f9a5c2cca
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double.cc
new file mode 100644
index 00000000000..86940d1bc79
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float-constexpr.cc
new file mode 100644
index 00000000000..b9cd9c2c2e8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float-fixed_size.cc
new file mode 100644
index 00000000000..e07bd30af26
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float.cc
new file mode 100644
index 00000000000..bf811a459f9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int-constexpr.cc
new file mode 100644
index 00000000000..006e84eef6e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int-fixed_size.cc
new file mode 100644
index 00000000000..5bc6f3940ce
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int.cc
new file mode 100644
index 00000000000..69f0e1f8ec4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long-constexpr.cc
new file mode 100644
index 00000000000..473eba397bf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long-fixed_size.cc
new file mode 100644
index 00000000000..53cfce6231f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long.cc
new file mode 100644
index 00000000000..493b74dba1b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double-constexpr.cc
new file mode 100644
index 00000000000..40c32ce983b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double-fixed_size.cc
new file mode 100644
index 00000000000..41edd207ef9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double.cc
new file mode 100644
index 00000000000..4315608c710
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long-constexpr.cc
new file mode 100644
index 00000000000..626b31c29ba
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long-fixed_size.cc
new file mode 100644
index 00000000000..b69d9998c3b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long.cc
new file mode 100644
index 00000000000..2d5f7a7bdab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short-constexpr.cc
new file mode 100644
index 00000000000..5d5b00b1af9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short-fixed_size.cc
new file mode 100644
index 00000000000..d8fa0da4832
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short.cc
new file mode 100644
index 00000000000..32223bd3ca5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char-constexpr.cc
new file mode 100644
index 00000000000..b93b884fe3f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..a68917e8ead
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char.cc
new file mode 100644
index 00000000000..103e0536af9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..8c4f129f868
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..bd4f76b00e9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char.cc
new file mode 100644
index 00000000000..0d0795d7628
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..a511307e11e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..87ad711269d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int.cc
new file mode 100644
index 00000000000..750e63914be
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..abb103eb633
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..d5c8db6669d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long.cc
new file mode 100644
index 00000000000..10d2aa2acd8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..3663b729bc5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..e34f45e3dfd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long.cc
new file mode 100644
index 00000000000..bc5419eb290
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..1b40c7d531a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..d43ee3743fe
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short.cc
new file mode 100644
index 00000000000..eb7dd8ae4cf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..7e29b05f001
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..73379fc9336
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t.cc
new file mode 100644
index 00000000000..ad80e3c76e4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_loadstore-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_loadstore.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char-constexpr.cc
new file mode 100644
index 00000000000..da2323c3890
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char-fixed_size.cc
new file mode 100644
index 00000000000..2b7bf71329a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char.cc
new file mode 100644
index 00000000000..90f59f80b14
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t-constexpr.cc
new file mode 100644
index 00000000000..239e4b74d0c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..919175a1304
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t.cc
new file mode 100644
index 00000000000..67546406ccc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t-constexpr.cc
new file mode 100644
index 00000000000..00930432f3c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..3ccfc8205c4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t.cc
new file mode 100644
index 00000000000..d048692b595
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double-constexpr.cc
new file mode 100644
index 00000000000..1a381b73078
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double-fixed_size.cc
new file mode 100644
index 00000000000..cc9637e8d02
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double.cc
new file mode 100644
index 00000000000..1833537abab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float-constexpr.cc
new file mode 100644
index 00000000000..853be149e87
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float-fixed_size.cc
new file mode 100644
index 00000000000..ab935f81066
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float.cc
new file mode 100644
index 00000000000..44a076b3b1b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int-constexpr.cc
new file mode 100644
index 00000000000..6401f73f090
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int-fixed_size.cc
new file mode 100644
index 00000000000..5e31458026e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int.cc
new file mode 100644
index 00000000000..5fd1b352b39
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long-constexpr.cc
new file mode 100644
index 00000000000..5b218f5fb46
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long-fixed_size.cc
new file mode 100644
index 00000000000..52feda74541
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long.cc
new file mode 100644
index 00000000000..540be86d5cb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double-constexpr.cc
new file mode 100644
index 00000000000..44c4b69e914
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double-fixed_size.cc
new file mode 100644
index 00000000000..2bd1c8dc38b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double.cc
new file mode 100644
index 00000000000..d156bdbcfd4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long-constexpr.cc
new file mode 100644
index 00000000000..bf37f7697ef
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long-fixed_size.cc
new file mode 100644
index 00000000000..1249170c3d6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long.cc
new file mode 100644
index 00000000000..364a0758efd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short-constexpr.cc
new file mode 100644
index 00000000000..b4757e68257
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short-fixed_size.cc
new file mode 100644
index 00000000000..9b4dea095c0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short.cc
new file mode 100644
index 00000000000..1e7c6056e88
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char-constexpr.cc
new file mode 100644
index 00000000000..db74da66b44
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..cc2224e0de4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char.cc
new file mode 100644
index 00000000000..6c35351e603
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..f9f0e7d00cc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..fef0cb50bb7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char.cc
new file mode 100644
index 00000000000..4a170d1c072
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..d7e6f995534
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..cf98ed3e2bb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int.cc
new file mode 100644
index 00000000000..be0a0826857
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..0a7473296fd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..36a8047c0f7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long.cc
new file mode 100644
index 00000000000..9b711f0ec2f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..d6a23bf065e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..7a85877582d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long.cc
new file mode 100644
index 00000000000..9ce0d5563da
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..7d38a5afa60
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..83fb8d5c754
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short.cc
new file mode 100644
index 00000000000..174586782fd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..3ded73de66b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..5c3c03bf98f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t.cc
new file mode 100644
index 00000000000..84d81226ebc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operator_cvt-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operator_cvt.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char-constexpr.cc
new file mode 100644
index 00000000000..e8a8e548f73
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char-fixed_size.cc
new file mode 100644
index 00000000000..f05128952f7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char.cc
new file mode 100644
index 00000000000..f7122cd211b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t-constexpr.cc
new file mode 100644
index 00000000000..4c58ba4f49e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..1020726c2c2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t.cc
new file mode 100644
index 00000000000..5a3818ec668
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t-constexpr.cc
new file mode 100644
index 00000000000..d369c84ac4c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..9ccf64efc07
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t.cc
new file mode 100644
index 00000000000..9b8fcc1b10a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-double-constexpr.cc
new file mode 100644
index 00000000000..db98fe95d72
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-double-fixed_size.cc
new file mode 100644
index 00000000000..2add295d16c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-double.cc
new file mode 100644
index 00000000000..5f2e35edbb1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-float-constexpr.cc
new file mode 100644
index 00000000000..b1327432bbd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-float-fixed_size.cc
new file mode 100644
index 00000000000..ff4df352830
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-float.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-float.cc
new file mode 100644
index 00000000000..db0db36fc31
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-int-constexpr.cc
new file mode 100644
index 00000000000..dbda11bb3bb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-int-fixed_size.cc
new file mode 100644
index 00000000000..85b5fd792e2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-int.cc
new file mode 100644
index 00000000000..f18d44c3479
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long-constexpr.cc
new file mode 100644
index 00000000000..75bee8fad61
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long-fixed_size.cc
new file mode 100644
index 00000000000..38501b26ae1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long.cc
new file mode 100644
index 00000000000..5702dfe3b17
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double-constexpr.cc
new file mode 100644
index 00000000000..e40e2ffe1b7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double-fixed_size.cc
new file mode 100644
index 00000000000..8883052e3a5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double.cc
new file mode 100644
index 00000000000..95456f74539
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long-constexpr.cc
new file mode 100644
index 00000000000..20c7cb3a19d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long-fixed_size.cc
new file mode 100644
index 00000000000..b2ca775a178
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long.cc
new file mode 100644
index 00000000000..930e52678c0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-short-constexpr.cc
new file mode 100644
index 00000000000..4bed4ff1e04
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-short-fixed_size.cc
new file mode 100644
index 00000000000..6509df3c534
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-short.cc
new file mode 100644
index 00000000000..2e398c863d5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char-constexpr.cc
new file mode 100644
index 00000000000..1a60835a33e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..d96abe5803b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char.cc
new file mode 100644
index 00000000000..bbc24e801a0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..caeb8ccf67e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..6fedbf1c29d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char.cc
new file mode 100644
index 00000000000..26f47443239
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..de08a525490
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..be930358fdb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int.cc
new file mode 100644
index 00000000000..57ff6ae61af
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..52a0357f05b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..acb3ecb39eb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long.cc
new file mode 100644
index 00000000000..fa003a2ec80
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..622070bb537
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..235c5b2d1ce
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long.cc
new file mode 100644
index 00000000000..88f2761ff5d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..3cb868c01d2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..e95ac57ece6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short.cc
new file mode 100644
index 00000000000..9b6fe582036
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..abbd4528f8f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..d30d9d8883a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t.cc
new file mode 100644
index 00000000000..a1466c6507e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_operators-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_operators.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char-constexpr.cc
new file mode 100644
index 00000000000..a88da8d1de9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char-fixed_size.cc
new file mode 100644
index 00000000000..5dded0b55aa
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char.cc
new file mode 100644
index 00000000000..1a364735198
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t-constexpr.cc
new file mode 100644
index 00000000000..aebd547bc3a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..6afbf82c5dd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t.cc
new file mode 100644
index 00000000000..cbc8f919d46
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t-constexpr.cc
new file mode 100644
index 00000000000..c867a8de9fe
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..0084d8d7078
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t.cc
new file mode 100644
index 00000000000..62797f670d3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double-constexpr.cc
new file mode 100644
index 00000000000..f40e6b04ab4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double-fixed_size.cc
new file mode 100644
index 00000000000..a81bb65f3f1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double.cc
new file mode 100644
index 00000000000..3470a9c9159
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float-constexpr.cc
new file mode 100644
index 00000000000..409c254442b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float-fixed_size.cc
new file mode 100644
index 00000000000..e335ec76c93
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float.cc
new file mode 100644
index 00000000000..2db4dea0b3d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int-constexpr.cc
new file mode 100644
index 00000000000..0d447176e25
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int-fixed_size.cc
new file mode 100644
index 00000000000..239a7a6692d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int.cc
new file mode 100644
index 00000000000..9d82f1d1172
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long-constexpr.cc
new file mode 100644
index 00000000000..3b360555852
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long-fixed_size.cc
new file mode 100644
index 00000000000..fa00db7f4ee
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long.cc
new file mode 100644
index 00000000000..f809a67ac00
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double-constexpr.cc
new file mode 100644
index 00000000000..6792557a8a4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double-fixed_size.cc
new file mode 100644
index 00000000000..b140a33cc8e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double.cc
new file mode 100644
index 00000000000..2d00ae3c934
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long-constexpr.cc
new file mode 100644
index 00000000000..2d879f88b97
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long-fixed_size.cc
new file mode 100644
index 00000000000..4c5b4039503
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long.cc
new file mode 100644
index 00000000000..d3f63cecf00
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short-constexpr.cc
new file mode 100644
index 00000000000..c42ac91aa6c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short-fixed_size.cc
new file mode 100644
index 00000000000..3ce7bf0a493
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short.cc
new file mode 100644
index 00000000000..f700fa0a398
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char-constexpr.cc
new file mode 100644
index 00000000000..988f5e723ca
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..bbe37fade46
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char.cc
new file mode 100644
index 00000000000..d6aa66ac88b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..330634647c8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..12478a3f4e9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char.cc
new file mode 100644
index 00000000000..8c41bcecea6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..b53eb7c64e8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..564bf132849
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int.cc
new file mode 100644
index 00000000000..17b8714eceb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..5970d009e21
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..d62ffbea7d3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long.cc
new file mode 100644
index 00000000000..1f56d2d4968
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..a0ce1d47684
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..bcbbdaf78d5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long.cc
new file mode 100644
index 00000000000..96f48b836a8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..9ed5311ab37
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..e55707d5077
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short.cc
new file mode 100644
index 00000000000..5cd0ea38f44
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..cd3a2bf5273
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..8139c345a81
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t.cc
new file mode 100644
index 00000000000..e5184dd5a50
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/mask_reductions-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/mask_reductions.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-double-constexpr.cc
new file mode 100644
index 00000000000..89c7b9d5db7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/math_1arg.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-double-fixed_size.cc
new file mode 100644
index 00000000000..540e66bf038
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/math_1arg.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-double.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-double.cc
new file mode 100644
index 00000000000..c92cd794d76
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/math_1arg.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-float-constexpr.cc
new file mode 100644
index 00000000000..4b3a8f89b92
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/math_1arg.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-float-fixed_size.cc
new file mode 100644
index 00000000000..0caaeaac8f9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/math_1arg.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-float.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-float.cc
new file mode 100644
index 00000000000..07ee6a1e619
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/math_1arg.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double-constexpr.cc
new file mode 100644
index 00000000000..c5a24f463f3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/math_1arg.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double-fixed_size.cc
new file mode 100644
index 00000000000..bd67831e0a3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/math_1arg.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double.cc
new file mode 100644
index 00000000000..f03c6cc86e6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_1arg-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/math_1arg.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-double-constexpr.cc
new file mode 100644
index 00000000000..56bb3c2c6c6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/math_2arg.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-double-fixed_size.cc
new file mode 100644
index 00000000000..fb742c73c20
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/math_2arg.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-double.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-double.cc
new file mode 100644
index 00000000000..1a03db95e4c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/math_2arg.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-float-constexpr.cc
new file mode 100644
index 00000000000..348355ad4b1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/math_2arg.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-float-fixed_size.cc
new file mode 100644
index 00000000000..0b775643a78
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/math_2arg.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-float.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-float.cc
new file mode 100644
index 00000000000..0325569e8b8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/math_2arg.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double-constexpr.cc
new file mode 100644
index 00000000000..3ebc0e5eef3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/math_2arg.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double-fixed_size.cc
new file mode 100644
index 00000000000..b3970109140
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/math_2arg.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double.cc
new file mode 100644
index 00000000000..dd1660bac18
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/math_2arg-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/math_2arg.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char-constexpr.cc
new file mode 100644
index 00000000000..525d39b0e05
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char-fixed_size.cc
new file mode 100644
index 00000000000..ca07cd1eb15
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char.cc
new file mode 100644
index 00000000000..18e2d574150
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t-constexpr.cc
new file mode 100644
index 00000000000..bb8c03a5d56
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..cd62bf3a279
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t.cc
new file mode 100644
index 00000000000..8021e3965b7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t-constexpr.cc
new file mode 100644
index 00000000000..ebdb78599d7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..968fe783144
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t.cc
new file mode 100644
index 00000000000..14e565bcf33
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double-constexpr.cc
new file mode 100644
index 00000000000..be62012f2be
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double-fixed_size.cc
new file mode 100644
index 00000000000..f97188fcceb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double.cc
new file mode 100644
index 00000000000..047c01cef90
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float-constexpr.cc
new file mode 100644
index 00000000000..8b8dfc0f097
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float-fixed_size.cc
new file mode 100644
index 00000000000..982e3901c05
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float.cc
new file mode 100644
index 00000000000..c4a8a59b623
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int-constexpr.cc
new file mode 100644
index 00000000000..f9ceb8b56be
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int-fixed_size.cc
new file mode 100644
index 00000000000..f07524c129a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int.cc
new file mode 100644
index 00000000000..ab0be546457
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long-constexpr.cc
new file mode 100644
index 00000000000..6655d90326b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long-fixed_size.cc
new file mode 100644
index 00000000000..959a368cc43
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long.cc
new file mode 100644
index 00000000000..5593ff46d08
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double-constexpr.cc
new file mode 100644
index 00000000000..2073875046e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double-fixed_size.cc
new file mode 100644
index 00000000000..1b7465aaae4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double.cc
new file mode 100644
index 00000000000..e5eddab9dcf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long-constexpr.cc
new file mode 100644
index 00000000000..6ed1deca404
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long-fixed_size.cc
new file mode 100644
index 00000000000..95ec2baa936
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long.cc
new file mode 100644
index 00000000000..4de27d782fa
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short-constexpr.cc
new file mode 100644
index 00000000000..44c42ce9791
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short-fixed_size.cc
new file mode 100644
index 00000000000..dce845b714b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short.cc
new file mode 100644
index 00000000000..402e6fa555c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char-constexpr.cc
new file mode 100644
index 00000000000..bc1f652cfe6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..3270b07460f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char.cc
new file mode 100644
index 00000000000..a4a84c1438b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..10e303fcbb1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..dd5cb5f6b27
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char.cc
new file mode 100644
index 00000000000..12dbfa81102
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..0c951b1e39a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..58a81f027e7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int.cc
new file mode 100644
index 00000000000..6655cdae33f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..b919dc872e5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..d1c2f2edc75
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long.cc
new file mode 100644
index 00000000000..a9825e90832
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..9837223badd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..b466d3f1827
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long.cc
new file mode 100644
index 00000000000..eb1d51acba5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..f228b70237b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..83b33288b42
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short.cc
new file mode 100644
index 00000000000..b29605032db
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..18c168306a5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..187273c581a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t.cc
new file mode 100644
index 00000000000..6525f575d67
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operator_cvt-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operator_cvt.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char-constexpr.cc
new file mode 100644
index 00000000000..b279c4d6e2c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char-fixed_size.cc
new file mode 100644
index 00000000000..02495cb3b82
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char.cc
new file mode 100644
index 00000000000..c5044c6b99b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char16_t-constexpr.cc
new file mode 100644
index 00000000000..794872c0833
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..ee87fbfb637
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char16_t.cc
new file mode 100644
index 00000000000..b25efd91e37
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char32_t-constexpr.cc
new file mode 100644
index 00000000000..433dee6758e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..f4b44a44cb3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/operators-char32_t.cc
new file mode 100644
index 00000000000..f6479061ad7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-double-constexpr.cc
new file mode 100644
index 00000000000..2d046a21fb4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-double-fixed_size.cc
new file mode 100644
index 00000000000..919969f8bab
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-double.cc b/libstdc++-v3/testsuite/experimental/simd/operators-double.cc
new file mode 100644
index 00000000000..e22e22dd8d8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-float-constexpr.cc
new file mode 100644
index 00000000000..dbd318cd74a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-float-fixed_size.cc
new file mode 100644
index 00000000000..2e401bb0263
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-float.cc b/libstdc++-v3/testsuite/experimental/simd/operators-float.cc
new file mode 100644
index 00000000000..074eba229d9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-int-constexpr.cc
new file mode 100644
index 00000000000..abd78d194c8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-int-fixed_size.cc
new file mode 100644
index 00000000000..bf06696dc5e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-int.cc b/libstdc++-v3/testsuite/experimental/simd/operators-int.cc
new file mode 100644
index 00000000000..00eb40405fc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long-constexpr.cc
new file mode 100644
index 00000000000..8746c9c550e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long-fixed_size.cc
new file mode 100644
index 00000000000..f30884ec6a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long.cc
new file mode 100644
index 00000000000..2610dd74481
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long_double-constexpr.cc
new file mode 100644
index 00000000000..efd8f2f307d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long_double-fixed_size.cc
new file mode 100644
index 00000000000..3b184d8f4a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long_double.cc
new file mode 100644
index 00000000000..8d4e204a558
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long_long-constexpr.cc
new file mode 100644
index 00000000000..dbc8d820951
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long_long-fixed_size.cc
new file mode 100644
index 00000000000..23ade5883c9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/operators-long_long.cc
new file mode 100644
index 00000000000..b318368c4a5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-short-constexpr.cc
new file mode 100644
index 00000000000..a03fe143069
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-short-fixed_size.cc
new file mode 100644
index 00000000000..b455eebdd89
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-short.cc b/libstdc++-v3/testsuite/experimental/simd/operators-short.cc
new file mode 100644
index 00000000000..a3ad21df595
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-signed_char-constexpr.cc
new file mode 100644
index 00000000000..0442070b9f9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..1974f910812
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/operators-signed_char.cc
new file mode 100644
index 00000000000..637c25a5c1b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..c12b4aba4a5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..da9b4e86b3e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char.cc
new file mode 100644
index 00000000000..25e3a0a7705
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..83d20644082
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..42cf8f63ad2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int.cc
new file mode 100644
index 00000000000..3196bdc84f2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..21b883d0642
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..2b20cf03502
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long.cc
new file mode 100644
index 00000000000..59ab58edf2f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..41131bd5f2d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..b8748fe1bf9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long.cc
new file mode 100644
index 00000000000..09b40f12e08
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..9e4d6445a8e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..47f6572c21d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short.cc
new file mode 100644
index 00000000000..aad63de33c5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..8880fc0e8e0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..b9d3ca35272
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t.cc
new file mode 100644
index 00000000000..c88dcebda30
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/operators-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/operators.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char-constexpr.cc
new file mode 100644
index 00000000000..c6c5b258153
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char-fixed_size.cc
new file mode 100644
index 00000000000..1fdb568b0a2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char.cc
new file mode 100644
index 00000000000..66092cdc40b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t-constexpr.cc
new file mode 100644
index 00000000000..2b5b7a89b50
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..9ff0b2c911c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t.cc
new file mode 100644
index 00000000000..277ff5cf799
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t-constexpr.cc
new file mode 100644
index 00000000000..e42d1adb9a4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..dee15db677f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t.cc
new file mode 100644
index 00000000000..8b173d1ff5e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-double-constexpr.cc
new file mode 100644
index 00000000000..6df4d82726c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-double-fixed_size.cc
new file mode 100644
index 00000000000..538936d9ec0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-double.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-double.cc
new file mode 100644
index 00000000000..1d8f787a517
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-float-constexpr.cc
new file mode 100644
index 00000000000..a535b554801
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-float-fixed_size.cc
new file mode 100644
index 00000000000..a9a923b39e4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-float.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-float.cc
new file mode 100644
index 00000000000..983ccd569df
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-int-constexpr.cc
new file mode 100644
index 00000000000..3d9aca0c2cf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-int-fixed_size.cc
new file mode 100644
index 00000000000..d5a60b21f6f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-int.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-int.cc
new file mode 100644
index 00000000000..d067bdc064d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long-constexpr.cc
new file mode 100644
index 00000000000..0c0494443fb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long-fixed_size.cc
new file mode 100644
index 00000000000..bf40cfc4769
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long.cc
new file mode 100644
index 00000000000..c39e9988c3e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long_double-constexpr.cc
new file mode 100644
index 00000000000..685c9892f0a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long_double-fixed_size.cc
new file mode 100644
index 00000000000..ff85cd56ddb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long_double.cc
new file mode 100644
index 00000000000..c2f39fe67ad
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long_long-constexpr.cc
new file mode 100644
index 00000000000..6d6d33e93c2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long_long-fixed_size.cc
new file mode 100644
index 00000000000..040d809d4f8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-long_long.cc
new file mode 100644
index 00000000000..6223592a039
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-short-constexpr.cc
new file mode 100644
index 00000000000..2175dfbe172
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-short-fixed_size.cc
new file mode 100644
index 00000000000..aaef87a2a3b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-short.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-short.cc
new file mode 100644
index 00000000000..b9af0f6d195
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char-constexpr.cc
new file mode 100644
index 00000000000..98b6acad3a0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..41961d26b52
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char.cc
new file mode 100644
index 00000000000..4d9414f368c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..9e54f69605d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..53bfac1fa11
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char.cc
new file mode 100644
index 00000000000..bc57dc2b24f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..ff5fce0d845
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..422f8a704e7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int.cc
new file mode 100644
index 00000000000..d8521e699b7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..7d967629035
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..6c08d3087ff
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long.cc
new file mode 100644
index 00000000000..b605891903a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..df70fb5e234
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..0d133b07a02
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long.cc
new file mode 100644
index 00000000000..70fce75c309
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..b33657d8790
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..3e7666d25b5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short.cc
new file mode 100644
index 00000000000..731ee35e9f4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..bba4697efb5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..4dc5a4d9352
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t.cc
new file mode 100644
index 00000000000..d726e391412
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/reductions-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/reductions.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-double-constexpr.cc
new file mode 100644
index 00000000000..9a391af162b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/remqo.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-double-fixed_size.cc
new file mode 100644
index 00000000000..76bf3d0fdb4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/remqo.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-double.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-double.cc
new file mode 100644
index 00000000000..c3d22e3fe51
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/remqo.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-float-constexpr.cc
new file mode 100644
index 00000000000..b8cd91df1a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/remqo.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-float-fixed_size.cc
new file mode 100644
index 00000000000..7ec6c9ea47d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/remqo.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-float.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-float.cc
new file mode 100644
index 00000000000..1780e2dd258
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/remqo.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-long_double-constexpr.cc
new file mode 100644
index 00000000000..34f987165f4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/remqo.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-long_double-fixed_size.cc
new file mode 100644
index 00000000000..61fb5f28794
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/remqo.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/remqo-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/remqo-long_double.cc
new file mode 100644
index 00000000000..5d488071143
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/remqo-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/remqo.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char-constexpr.cc
new file mode 100644
index 00000000000..d89006bc16d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char-fixed_size.cc
new file mode 100644
index 00000000000..2885e825ecf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char.cc
new file mode 100644
index 00000000000..418c0afde04
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char16_t-constexpr.cc
new file mode 100644
index 00000000000..8e03105545d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..fad6ab5ac1e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char16_t.cc
new file mode 100644
index 00000000000..8bb564337ef
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char32_t-constexpr.cc
new file mode 100644
index 00000000000..c86bf41329a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..bb86cb1122e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/simd-char32_t.cc
new file mode 100644
index 00000000000..3bde89f1ef4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-double-constexpr.cc
new file mode 100644
index 00000000000..56df57d63e6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-double-fixed_size.cc
new file mode 100644
index 00000000000..54569e8f2d7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-double.cc b/libstdc++-v3/testsuite/experimental/simd/simd-double.cc
new file mode 100644
index 00000000000..bd9af0c8901
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-float-constexpr.cc
new file mode 100644
index 00000000000..f513909e8ef
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-float-fixed_size.cc
new file mode 100644
index 00000000000..ecfdb179758
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-float.cc b/libstdc++-v3/testsuite/experimental/simd/simd-float.cc
new file mode 100644
index 00000000000..4b2bd1c6613
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-int-constexpr.cc
new file mode 100644
index 00000000000..2d758d5eb50
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-int-fixed_size.cc
new file mode 100644
index 00000000000..d55e3a13751
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-int.cc b/libstdc++-v3/testsuite/experimental/simd/simd-int.cc
new file mode 100644
index 00000000000..14c02ac49cc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long-constexpr.cc
new file mode 100644
index 00000000000..732890cc136
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long-fixed_size.cc
new file mode 100644
index 00000000000..0898e26fd12
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long.cc
new file mode 100644
index 00000000000..882a2bd5e52
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long_double-constexpr.cc
new file mode 100644
index 00000000000..b607fe81fe0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long_double-fixed_size.cc
new file mode 100644
index 00000000000..05581dc5a0d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long_double.cc
new file mode 100644
index 00000000000..cf741d54b2b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long_long-constexpr.cc
new file mode 100644
index 00000000000..0e24adfe874
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long_long-fixed_size.cc
new file mode 100644
index 00000000000..575575286cd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/simd-long_long.cc
new file mode 100644
index 00000000000..49896a5e1c9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-short-constexpr.cc
new file mode 100644
index 00000000000..cdf2bcd0805
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-short-fixed_size.cc
new file mode 100644
index 00000000000..1eacae08dee
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-short.cc b/libstdc++-v3/testsuite/experimental/simd/simd-short.cc
new file mode 100644
index 00000000000..9afec6a6f66
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-signed_char-constexpr.cc
new file mode 100644
index 00000000000..26abe7185f3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..798fe3b90a1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/simd-signed_char.cc
new file mode 100644
index 00000000000..b1ff461462d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..2cb9489ab8e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..1ea3ab4c80e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char.cc
new file mode 100644
index 00000000000..c3d0a898ac0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..5711322b7c8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..c6ab76b7bd8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int.cc
new file mode 100644
index 00000000000..3068d4ca7aa
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..f640f2e6da2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..ce454db5cf9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long.cc
new file mode 100644
index 00000000000..433ae996eb6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..a25540981c8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..e5a2be2a6f0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long.cc
new file mode 100644
index 00000000000..9735360d999
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..8597525567e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..e08dab57a4e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short.cc
new file mode 100644
index 00000000000..c98a565773c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..a5d37a9949c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..ba02727f6a9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t.cc
new file mode 100644
index 00000000000..07c313833f1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/simd-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/simd.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-double-constexpr.cc
new file mode 100644
index 00000000000..e142e79c84e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+// { dg-do compile }
+#include "tests/sincos.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-double-fixed_size.cc
new file mode 100644
index 00000000000..834c2c3df68
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+// { dg-do compile }
+#define TESTFIXEDSIZE 1
+#include "tests/sincos.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-double.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-double.cc
new file mode 100644
index 00000000000..fefc41e3822
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+// { dg-do compile }
+#include "tests/sincos.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-float-constexpr.cc
new file mode 100644
index 00000000000..88376e09ee9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+// { dg-do compile }
+#include "tests/sincos.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-float-fixed_size.cc
new file mode 100644
index 00000000000..565e225997d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+// { dg-do compile }
+#define TESTFIXEDSIZE 1
+#include "tests/sincos.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-float.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-float.cc
new file mode 100644
index 00000000000..25a71653eb6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+// { dg-do compile }
+#include "tests/sincos.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-long_double-constexpr.cc
new file mode 100644
index 00000000000..096c122d151
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+// { dg-do compile }
+#include "tests/sincos.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-long_double-fixed_size.cc
new file mode 100644
index 00000000000..3dbb43ce5c7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+// { dg-do compile }
+#define TESTFIXEDSIZE 1
+#include "tests/sincos.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/sincos-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/sincos-long_double.cc
new file mode 100644
index 00000000000..e70d0624fe8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/sincos-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+// { dg-do compile }
+#include "tests/sincos.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char-constexpr.cc
new file mode 100644
index 00000000000..a4c9a607644
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char-fixed_size.cc
new file mode 100644
index 00000000000..ab6fe04efad
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char.cc
new file mode 100644
index 00000000000..3bcb2b145fc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t-constexpr.cc
new file mode 100644
index 00000000000..a3f4ec408d2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..e0136767bb6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t.cc
new file mode 100644
index 00000000000..68c00bd6483
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t-constexpr.cc
new file mode 100644
index 00000000000..5ea4ee445f6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..9e682c5249d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t.cc
new file mode 100644
index 00000000000..c90f1bb9d54
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-double-constexpr.cc
new file mode 100644
index 00000000000..c7dbd5156ea
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-double-fixed_size.cc
new file mode 100644
index 00000000000..2e4eea69f11
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-double.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-double.cc
new file mode 100644
index 00000000000..49feda56304
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-float-constexpr.cc
new file mode 100644
index 00000000000..8631b832bd9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-float-fixed_size.cc
new file mode 100644
index 00000000000..ff9d8573ac2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-float.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-float.cc
new file mode 100644
index 00000000000..54b889cf83b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-int-constexpr.cc
new file mode 100644
index 00000000000..08915acc8b7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-int-fixed_size.cc
new file mode 100644
index 00000000000..4d3cdefd393
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-int.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-int.cc
new file mode 100644
index 00000000000..e600bcbab3b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long-constexpr.cc
new file mode 100644
index 00000000000..a6c4208cb99
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long-fixed_size.cc
new file mode 100644
index 00000000000..e2ee03eb9bf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long.cc
new file mode 100644
index 00000000000..962dfacf1c3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double-constexpr.cc
new file mode 100644
index 00000000000..86acb97528f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double-fixed_size.cc
new file mode 100644
index 00000000000..13b6178425e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double.cc
new file mode 100644
index 00000000000..0ca90a052d1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long-constexpr.cc
new file mode 100644
index 00000000000..0e91b631975
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long-fixed_size.cc
new file mode 100644
index 00000000000..004d3e3f5f0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long.cc
new file mode 100644
index 00000000000..2fbfcd80747
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-short-constexpr.cc
new file mode 100644
index 00000000000..01f0e4c18b2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-short-fixed_size.cc
new file mode 100644
index 00000000000..34d90771dc2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-short.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-short.cc
new file mode 100644
index 00000000000..19375502263
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char-constexpr.cc
new file mode 100644
index 00000000000..34654adc06a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..3b9e2f8d286
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char.cc
new file mode 100644
index 00000000000..167f9cc4f4f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..4cd4990fd6a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..7db0ab8e1ea
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char.cc
new file mode 100644
index 00000000000..726683dbb7c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..3195c2f4cf9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..7418dfc973c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int.cc
new file mode 100644
index 00000000000..6527c4dd66d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..da8dbb92f28
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..d9473d3731c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long.cc
new file mode 100644
index 00000000000..aabffd9b45e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..f5d933f600e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..da250a2ae75
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long.cc
new file mode 100644
index 00000000000..fb110076926
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..4377025dea5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..19549a6c7c3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short.cc
new file mode 100644
index 00000000000..c8d79627abc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..e02e923bc50
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..cc4f3b6df19
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t.cc
new file mode 100644
index 00000000000..e8cdb8a87a0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/split_concat-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/split_concat.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char-constexpr.cc
new file mode 100644
index 00000000000..b34e52e7ce9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char-fixed_size.cc
new file mode 100644
index 00000000000..5027f4731fc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char.cc
new file mode 100644
index 00000000000..45963f1c3bd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char16_t-constexpr.cc
new file mode 100644
index 00000000000..7e8d9bc485b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..f8a0d0f423a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char16_t.cc
new file mode 100644
index 00000000000..4a312aafc1c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char32_t-constexpr.cc
new file mode 100644
index 00000000000..0c4dc08ede6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..f4f72e0079a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/splits-char32_t.cc
new file mode 100644
index 00000000000..771f162def2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-double-constexpr.cc
new file mode 100644
index 00000000000..2ce1fba820b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-double-fixed_size.cc
new file mode 100644
index 00000000000..4e37962fa98
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-double.cc b/libstdc++-v3/testsuite/experimental/simd/splits-double.cc
new file mode 100644
index 00000000000..a46b26ad82c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-float-constexpr.cc
new file mode 100644
index 00000000000..38bd2b87841
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-float-fixed_size.cc
new file mode 100644
index 00000000000..9bd0353a0b3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-float.cc b/libstdc++-v3/testsuite/experimental/simd/splits-float.cc
new file mode 100644
index 00000000000..9f69bb92434
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-int-constexpr.cc
new file mode 100644
index 00000000000..cf34de20fd3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-int-fixed_size.cc
new file mode 100644
index 00000000000..723ffa29b7d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-int.cc b/libstdc++-v3/testsuite/experimental/simd/splits-int.cc
new file mode 100644
index 00000000000..3c2f4599d1f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long-constexpr.cc
new file mode 100644
index 00000000000..ab892ec4882
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long-fixed_size.cc
new file mode 100644
index 00000000000..2034a6dead9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long.cc
new file mode 100644
index 00000000000..c0ea634cb78
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long_double-constexpr.cc
new file mode 100644
index 00000000000..cbcd8981926
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long_double-fixed_size.cc
new file mode 100644
index 00000000000..dcc65a34cd8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long_double.cc
new file mode 100644
index 00000000000..f07bf55b066
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long_long-constexpr.cc
new file mode 100644
index 00000000000..19ce4612cb5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long_long-fixed_size.cc
new file mode 100644
index 00000000000..d7ca3bebbe9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/splits-long_long.cc
new file mode 100644
index 00000000000..a1d1a91fa19
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-short-constexpr.cc
new file mode 100644
index 00000000000..a5e0352d9ae
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-short-fixed_size.cc
new file mode 100644
index 00000000000..de7b69b8d36
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-short.cc b/libstdc++-v3/testsuite/experimental/simd/splits-short.cc
new file mode 100644
index 00000000000..d5c3ed05d43
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-signed_char-constexpr.cc
new file mode 100644
index 00000000000..b9242b69551
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..f69adef42c2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/splits-signed_char.cc
new file mode 100644
index 00000000000..3d44ee57712
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..72d15dabf41
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..52011535c7b
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char.cc
new file mode 100644
index 00000000000..49167f61bdf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..bd955b7e72e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..4840ffd74fd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int.cc
new file mode 100644
index 00000000000..a725ce1b2a0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..ac94a36f0cc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..a27a5a7cb72
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long.cc
new file mode 100644
index 00000000000..4b9d6d9d0a4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..b51f48ef4a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..c3e4386453f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long.cc
new file mode 100644
index 00000000000..c42d302d75f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..29b38f9d276
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..96b6d9403ee
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short.cc
new file mode 100644
index 00000000000..53b0ce0d9e3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..d38710ef118
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..a90234fb0aa
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t.cc
new file mode 100644
index 00000000000..0429dd527d8
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/splits-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/splits.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/abs.h b/libstdc++-v3/testsuite/experimental/simd/tests/abs.h
new file mode 100644
index 00000000000..8769aa0ac20
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/abs.h
@@ -0,0 +1,22 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include <cmath> // abs & sqrt
+#include <cstdlib> // integer abs
+#include "bits/test_values.h"
+
+template <typename V> void test()
+{
+ if constexpr (std::is_signed_v<typename V::value_type>)
+ {
+ using std::abs;
+ using T = typename V::value_type;
+ using L = std::numeric_limits<T>;
+ test_values<V>({L::max(), L::lowest(), L::min(), -L::max() / 2, T(), -T(),
+ T(-1), T(-2)},
+ {1000}, [](V input) {
+ const V expected(
+ [&](auto i) { return T(std::abs(T(input[i]))); });
+ COMPARE(abs(input), expected) << "input: " << input;
+ });
+ }
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/algorithms.h b/libstdc++-v3/testsuite/experimental/simd/tests/algorithms.h
new file mode 100644
index 00000000000..088646838ef
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/algorithms.h
@@ -0,0 +1,13 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+template <typename V> void test()
+{
+ using T = typename V::value_type;
+ V a{[](auto i) -> T { return i & 1u; }};
+ V b{[](auto i) -> T { return (i + 1u) & 1u; }};
+ COMPARE(min(a, b), V{0});
+ COMPARE(max(a, b), V{1});
+}
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/conversions.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/conversions.h
new file mode 100644
index 00000000000..f4e7b3b6f13
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/conversions.h
@@ -0,0 +1,145 @@
+#include <array>
+
+// is_conversion_undefined {{{1
+/* implementation-defined
+ * ======================
+ * [conv.integral]/3 (integral conversions)
+ * If the destination type is signed, the value is unchanged if it can be represented in the
+ * destination type (and bit-field width); otherwise, the value is implementation-defined.
+ *
+ * undefined
+ * =========
+ * [conv.double]/1 (floating-point conversions)
+ * If the source value is neither exactly representable in the destination type nor between
+ * two adjacent destination values, the behavior is undefined.
+ *
+ * [conv.fpint]/1 (floating-integral conversions)
+ * A floating-point value can be converted to an integer type.
+ * The behavior is undefined if the truncated value cannot be
+ * represented in the destination type.
+ *
+ * [conv.fpint]/2
+ * An integer value can be converted to a floating-point type.
+ * If the value being converted is outside the range of values that can be represented, the
+ * behavior is undefined.
+ */
+template <typename To, typename From>
+constexpr bool is_conversion_undefined_impl(From x, std::true_type)
+{
+ return x > static_cast<long double>(std::numeric_limits<To>::max()) ||
+ x < static_cast<long double>(std::numeric_limits<To>::min());
+}
+
+template <typename To, typename From>
+constexpr bool is_conversion_undefined_impl(From, std::false_type)
+{
+ return false;
+}
+
+template <typename To, typename From> constexpr bool is_conversion_undefined(From x)
+{
+ static_assert(std::is_arithmetic<From>::value,
+ "this overload is only meant for builtin arithmetic types");
+ return is_conversion_undefined_impl<To, From>(
+ x, std::integral_constant<bool, (std::is_floating_point<From>::value &&
+ (std::is_integral<To>::value ||
+ (std::is_floating_point<To>::value &&
+ sizeof(From) > sizeof(To))))>());
+}
+
+static_assert(is_conversion_undefined<uint>(float(0x100000000LL)),
+ "testing my expectations of is_conversion_undefined");
+static_assert(!is_conversion_undefined<float>(0x100000000LL),
+ "testing my expectations of is_conversion_undefined");
+
+template <typename To, typename T, typename A>
+inline std::experimental::simd_mask<T, A> is_conversion_undefined(const std::experimental::simd<T, A> &x)
+{
+ std::experimental::simd_mask<T, A> k = false;
+ for (std::size_t i = 0; i < x.size(); ++i) {
+ k[i] = is_conversion_undefined<To>(x[i]);
+ }
+ return k;
+}
+
+//operators helpers //{{{1
+template <class T> constexpr T genHalfBits()
+{
+ return std::numeric_limits<T>::max() >> (std::numeric_limits<T>::digits / 2);
+}
+template <> constexpr long double genHalfBits<long double>() { return 0; }
+template <> constexpr double genHalfBits<double>() { return 0; }
+template <> constexpr float genHalfBits<float>() { return 0; }
+
+template <class U, class T, class UU> constexpr U avoid_ub(UU x)
+{
+ return is_conversion_undefined<T>(U(x)) ? U(0) : U(x);
+}
+
+template <class U, class T, class UU> constexpr U avoid_ub2(UU x)
+{
+ return is_conversion_undefined<U>(x) ? U(0) : avoid_ub<U, T>(x);
+}
+
+// conversion test input data //{{{1
+template <class U, class T>
+static const std::array<U, 53> cvt_input_data = {{
+ avoid_ub<U, T>(0xc0000080U),
+ avoid_ub<U, T>(0xc0000081U),
+ avoid_ub<U, T>(0xc0000082U),
+ avoid_ub<U, T>(0xc0000084U),
+ avoid_ub<U, T>(0xc0000088U),
+ avoid_ub<U, T>(0xc0000090U),
+ avoid_ub<U, T>(0xc00000A0U),
+ avoid_ub<U, T>(0xc00000C0U),
+ avoid_ub<U, T>(0xc000017fU),
+ avoid_ub<U, T>(0xc0000180U),
+ avoid_ub<U, T>(0x100000001LL),
+ avoid_ub<U, T>(0x100000011LL),
+ avoid_ub<U, T>(0x100000111LL),
+ avoid_ub<U, T>(0x100001111LL),
+ avoid_ub<U, T>(0x100011111LL),
+ avoid_ub<U, T>(0x100111111LL),
+ avoid_ub<U, T>(0x101111111LL),
+ avoid_ub<U, T>(-0x100000001LL),
+ avoid_ub<U, T>(-0x100000011LL),
+ avoid_ub<U, T>(-0x100000111LL),
+ avoid_ub<U, T>(-0x100001111LL),
+ avoid_ub<U, T>(-0x100011111LL),
+ avoid_ub<U, T>(-0x100111111LL),
+ avoid_ub<U, T>(-0x101111111LL),
+ avoid_ub<U, T>(std::numeric_limits<U>::min()),
+ avoid_ub<U, T>(std::numeric_limits<U>::min() + 1),
+ avoid_ub<U, T>(std::numeric_limits<U>::lowest()),
+ avoid_ub<U, T>(std::numeric_limits<U>::lowest() + 1),
+ avoid_ub<U, T>(-1),
+ avoid_ub<U, T>(-10),
+ avoid_ub<U, T>(-100),
+ avoid_ub<U, T>(-1000),
+ avoid_ub<U, T>(-10000),
+ avoid_ub<U, T>(0),
+ avoid_ub<U, T>(1),
+ avoid_ub<U, T>(genHalfBits<U>() - 1),
+ avoid_ub<U, T>(genHalfBits<U>()),
+ avoid_ub<U, T>(genHalfBits<U>() + 1),
+ avoid_ub<U, T>(std::numeric_limits<U>::max() - 1),
+ avoid_ub<U, T>(std::numeric_limits<U>::max()),
+ avoid_ub<U, T>(std::numeric_limits<U>::max() - 0xff),
+ avoid_ub<U, T>(std::numeric_limits<U>::max() - 0xff),
+ avoid_ub<U, T>(std::numeric_limits<U>::max() - 0x55),
+ avoid_ub<U, T>(-(std::numeric_limits<U>::min() + 1)),
+ avoid_ub<U, T>(-std::numeric_limits<U>::max()),
+ avoid_ub<U, T>(std::numeric_limits<U>::max() / std::pow(2., sizeof(T) * 6 - 1)),
+ avoid_ub2<U, T>(-std::numeric_limits<U>::max() / std::pow(2., sizeof(T) * 6 - 1)),
+ avoid_ub<U, T>(std::numeric_limits<U>::max() / std::pow(2., sizeof(T) * 4 - 1)),
+ avoid_ub2<U, T>(-std::numeric_limits<U>::max() / std::pow(2., sizeof(T) * 4 - 1)),
+ avoid_ub<U, T>(std::numeric_limits<U>::max() / std::pow(2., sizeof(T) * 2 - 1)),
+ avoid_ub2<U, T>(-std::numeric_limits<U>::max() / std::pow(2., sizeof(T) * 2 - 1)),
+ avoid_ub<U, T>(std::numeric_limits<T>::max() - 1),
+ avoid_ub<U, T>(std::numeric_limits<T>::max() * 0.75),
+}};
+
+template <class T, class U> struct cvt_inputs {
+ static constexpr size_t size() { return cvt_input_data<U, T>.size(); }
+ U operator[](size_t i) const { return cvt_input_data<U, T>[i]; }
+};
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/make_vec.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/make_vec.h
new file mode 100644
index 00000000000..931b36edb61
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/make_vec.h
@@ -0,0 +1,62 @@
+/* This file is part of the Vc library. {{{
+Copyright © 2017 Matthias Kretz <kretz@kde.org>
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+ * Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+ * Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+ * Neither the names of contributing organizations nor the
+ names of its contributors may be used to endorse or promote products
+ derived from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR ANY
+DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+}}}*/
+
+#include <experimental/simd>
+
+template <class M> inline M make_mask(const std::initializer_list<bool> &init)
+{
+ std::size_t i = 0;
+ M r = {};
+ for (;;) {
+ for (bool x : init) {
+ r[i] = x;
+ if (++i == M::size()) {
+ return r;
+ }
+ }
+ }
+}
+
+template <class V>
+inline V make_vec(const std::initializer_list<typename V::value_type> &init,
+ typename V::value_type inc = 0)
+{
+ std::size_t i = 0;
+ V r = {};
+ typename V::value_type base = 0;
+ for (;;) {
+ for (auto x : init) {
+ r[i] = base + x;
+ if (++i == V::size()) {
+ return r;
+ }
+ }
+ base += inc;
+ }
+}
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/mathreference.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/mathreference.h
new file mode 100644
index 00000000000..ebf58cd0d32
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/mathreference.h
@@ -0,0 +1,115 @@
+#include <tuple>
+#include <utility>
+#include <cstdio>
+#include <limits>
+#include <string>
+#include <type_traits>
+
+template <typename T> struct SincosReference //{{{1
+{
+ T x, s, c;
+
+ std::tuple<const T &, const T &, const T &> as_tuple() const
+ {
+ return std::tie(x, s, c);
+ }
+};
+
+template <typename T> struct Reference {
+ T x, ref;
+
+ std::tuple<const T &, const T &> as_tuple() const { return std::tie(x, ref); }
+};
+
+template <typename T> struct Array
+{
+ std::size_t size_;
+ const T *data_;
+ Array() : size_(0), data_(nullptr) {}
+ Array(size_t s, const T *p) : size_(s), data_(p) {}
+ const T *begin() const { return data_; }
+ const T *end() const { return data_ + size_; }
+ std::size_t size() const { return size_; }
+};
+
+namespace function {
+struct sincos{ static constexpr const char *const str = "sincos"; };
+struct atan { static constexpr const char *const str = "atan"; };
+struct asin { static constexpr const char *const str = "asin"; };
+struct acos { static constexpr const char *const str = "acos"; };
+struct log { static constexpr const char *const str = "ln"; };
+struct log2 { static constexpr const char *const str = "log2"; };
+struct log10 { static constexpr const char *const str = "log10"; };
+}
+
+template <class F> struct testdatatype_for_function {
+ template <class T> using type = Reference<T>;
+};
+template <> struct testdatatype_for_function<function::sincos> {
+ template <class T> using type = SincosReference<T>;
+};
+template <class F, class T>
+using testdatatype_for_function_t =
+ typename testdatatype_for_function<F>::template type<T>;
+
+template<typename T> struct StaticDeleter
+{
+ const T *ptr;
+ StaticDeleter(const T *p) : ptr(p) {}
+ ~StaticDeleter() { delete[] ptr; }
+};
+
+template <class F, class T> inline std::string filename()
+{
+ static_assert(std::is_floating_point<T>::value, "");
+ using Lim = std::numeric_limits<T>;
+ static const auto cache =
+ std::string("reference-") + F::str +
+ (sizeof(T) == 4 && Lim::digits == 24 && Lim::max_exponent == 128
+ ? "-sp"
+ : (sizeof(T) == 8 && Lim::digits == 53 && Lim::max_exponent == 1024
+ ? "-dp"
+ : (sizeof(T) == 16 && Lim::digits == 64 &&
+ Lim::max_exponent == 16384
+ ? "-dep"
+ : (sizeof(T) == 16 && Lim::digits == 113 &&
+ Lim::max_exponent == 16384
+ ? "-qp"
+ : "-unknown")))) +
+ ".dat";
+ return cache;
+}
+
+template <class Fun, class T, class Ref = testdatatype_for_function_t<Fun, T>>
+Array<Ref> referenceData()
+{
+ static Array<Ref> data;
+ if (data.data_ == nullptr)
+ {
+ FILE* file = std::fopen(filename<Fun, T>().c_str(), "rb");
+ if (file)
+ {
+ std::fseek(file, 0, SEEK_END);
+ const size_t size = std::ftell(file) / sizeof(Ref);
+ std::rewind(file);
+ auto mem = new Ref[size];
+ static StaticDeleter<Ref> _cleanup(mem);
+ data.size_ = std::fread(mem, sizeof(Ref), size, file);
+ data.data_ = mem;
+ std::fclose(file);
+ }
+ else
+ {
+ __builtin_fprintf(
+ stderr,
+ "%s:%d: the reference data %s does not exist in the current "
+ "working directory.\n",
+ __FILE__, __LINE__, filename<Fun, T>().c_str());
+ __builtin_abort();
+ }
+ }
+ return data;
+}
+
+//}}}1
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/metahelpers.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/metahelpers.h
new file mode 100644
index 00000000000..1eb1b0d1681
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/metahelpers.h
@@ -0,0 +1,170 @@
+#ifndef VC_TESTS_METAHELPERS_H_
+#define VC_TESTS_METAHELPERS_H_
+
+#include <functional>
+#include <limits>
+#include <type_traits>
+#include <utility>
+
+namespace vir
+{
+namespace test
+{
+// operator_is_substitution_failure {{{1
+template <class A, class B, class Op>
+constexpr bool operator_is_substitution_failure_impl(float)
+{
+ return true;
+}
+
+template <class A, class B, class Op>
+constexpr
+ typename std::conditional<true, bool,
+ decltype(Op()(std::declval<A>(), std::declval<B>()))>::type
+ operator_is_substitution_failure_impl(int)
+{
+ return false;
+}
+
+template <class... Ts> constexpr bool operator_is_substitution_failure()
+{
+ return operator_is_substitution_failure_impl<Ts...>(int());
+}
+
+// sfinae_is_callable{{{1
+#ifdef Vc_CLANG
+#pragma clang diagnostic push
+#pragma clang diagnostic ignored "-Wundefined-inline"
+#endif
+template <class... Args, class F>
+constexpr auto sfinae_is_callable_impl(int, F &&f) -> typename std::conditional<
+ true, std::true_type, decltype(std::forward<F>(f)(std::declval<Args>()...))>::type;
+template <class... Args, class F> constexpr std::false_type sfinae_is_callable_impl(float, const F &);
+template <class... Args, class F> constexpr bool sfinae_is_callable(F &&)
+{
+ return decltype(sfinae_is_callable_impl<Args...>(int(), std::declval<F>()))::value;
+}
+template <class... Args, class F>
+constexpr auto sfinae_is_callable_t(F &&f)
+ -> decltype(sfinae_is_callable_impl<Args...>(int(), std::declval<F>()));
+
+#ifdef Vc_CLANG
+#pragma clang diagnostic pop
+#endif
+
+// traits {{{1
+template <class A, class B> constexpr bool has_less_bits()
+{
+ return std::numeric_limits<A>::digits < std::numeric_limits<B>::digits;
+}
+
+//}}}1
+} // namespace test
+} // namespace vir
+
+// more operator objects {{{1
+struct assignment {
+ template <class A, class B>
+ constexpr decltype(std::declval<A>() = std::declval<B>()) operator()(A &&a,
+ B &&b) const
+ noexcept(noexcept(std::forward<A>(a) = std::forward<B>(b)))
+ {
+ return std::forward<A>(a) = std::forward<B>(b);
+ }
+};
+
+struct bit_shift_left {
+ template <class A, class B>
+ constexpr decltype(std::declval<A>() << std::declval<B>()) operator()(A &&a,
+ B &&b) const
+ noexcept(noexcept(std::forward<A>(a) << std::forward<B>(b)))
+ {
+ return std::forward<A>(a) << std::forward<B>(b);
+ }
+};
+
+struct bit_shift_right {
+ template <class A, class B>
+ constexpr decltype(std::declval<A>() >> std::declval<B>()) operator()(A &&a,
+ B &&b) const
+ noexcept(noexcept(std::forward<A>(a) >> std::forward<B>(b)))
+ {
+ return std::forward<A>(a) >> std::forward<B>(b);
+ }
+};
+
+struct assign_modulus {
+ template <class A, class B>
+ constexpr decltype(std::declval<A>() %= std::declval<B>()) operator()(A &&a,
+ B &&b) const
+ noexcept(noexcept(std::forward<A>(a) %= std::forward<B>(b)))
+ {
+ return std::forward<A>(a) %= std::forward<B>(b);
+ }
+};
+
+struct assign_bit_and {
+ template <class A, class B>
+ constexpr decltype(std::declval<A>() &= std::declval<B>()) operator()(A &&a,
+ B &&b) const
+ noexcept(noexcept(std::forward<A>(a) &= std::forward<B>(b)))
+ {
+ return std::forward<A>(a) &= std::forward<B>(b);
+ }
+};
+
+struct assign_bit_or {
+ template <class A, class B>
+ constexpr decltype(std::declval<A>() |= std::declval<B>()) operator()(A &&a,
+ B &&b) const
+ noexcept(noexcept(std::forward<A>(a) |= std::forward<B>(b)))
+ {
+ return std::forward<A>(a) |= std::forward<B>(b);
+ }
+};
+
+struct assign_bit_xor {
+ template <class A, class B>
+ constexpr decltype(std::declval<A>() ^= std::declval<B>()) operator()(A &&a,
+ B &&b) const
+ noexcept(noexcept(std::forward<A>(a) ^= std::forward<B>(b)))
+ {
+ return std::forward<A>(a) ^= std::forward<B>(b);
+ }
+};
+
+struct assign_bit_shift_left {
+ template <class A, class B>
+ constexpr decltype(std::declval<A>() <<= std::declval<B>()) operator()(A &&a,
+ B &&b) const
+ noexcept(noexcept(std::forward<A>(a) <<= std::forward<B>(b)))
+ {
+ return std::forward<A>(a) <<= std::forward<B>(b);
+ }
+};
+
+struct assign_bit_shift_right {
+ template <class A, class B>
+ constexpr decltype(std::declval<A>() >>= std::declval<B>()) operator()(A &&a,
+ B &&b) const
+ noexcept(noexcept(std::forward<A>(a) >>= std::forward<B>(b)))
+ {
+ return std::forward<A>(a) >>= std::forward<B>(b);
+ }
+};
+
+// operator_is_substitution_failure {{{1
+template <class A, class B, class Op = std::plus<>>
+constexpr bool is_substitution_failure =
+ vir::test::operator_is_substitution_failure<A, B, Op>();
+
+// sfinae_is_callable{{{1
+using vir::test::sfinae_is_callable;
+
+// traits {{{1
+using vir::test::has_less_bits;
+
+//}}}1
+
+#endif // VC_TESTS_METAHELPERS_H_
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/simd_view.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/simd_view.h
new file mode 100644
index 00000000000..1b611c56b1d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/simd_view.h
@@ -0,0 +1,112 @@
+/* This file is part of the Vc library. {{{
+Copyright © 2018 Matthias Kretz <kretz@kde.org>
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+ * Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+ * Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+ * Neither the names of contributing organizations nor the
+ names of its contributors may be used to endorse or promote products
+ derived from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR ANY
+DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+}}}*/
+
+#ifndef VC_TESTS_SIMD_VIEW_H_
+#define VC_TESTS_SIMD_VIEW_H_
+
+#include <experimental/simd>
+
+_GLIBCXX_SIMD_BEGIN_NAMESPACE
+namespace experimental
+{
+namespace imported_begin_end
+{
+ using std::begin;
+ using std::end;
+ template <class T> using begin_type = decltype(begin(std::declval<T>()));
+ template <class T> using end_type = decltype(end(std::declval<T>()));
+} // namespace imported_begin_end
+
+template <class V, class It, class End> class viewer
+{
+ It it;
+ const End end;
+
+ template <class F> void for_each_impl(F &&fun, std::index_sequence<0, 1, 2>)
+ {
+ for (; it + V::size() <= end; it += V::size()) {
+ fun(V([&](auto i) { return std::get<0>(it[i].as_tuple()); }),
+ V([&](auto i) { return std::get<1>(it[i].as_tuple()); }),
+ V([&](auto i) { return std::get<2>(it[i].as_tuple()); }));
+ }
+ if (it != end) {
+ fun(V([&](auto i) {
+ auto ii = it + i < end ? i + 0 : 0;
+ return std::get<0>(it[ii].as_tuple());
+ }),
+ V([&](auto i) {
+ auto ii = it + i < end ? i + 0 : 0;
+ return std::get<1>(it[ii].as_tuple());
+ }),
+ V([&](auto i) {
+ auto ii = it + i < end ? i + 0 : 0;
+ return std::get<2>(it[ii].as_tuple());
+ }));
+ }
+ }
+
+ template <class F> void for_each_impl(F &&fun, std::index_sequence<0, 1>)
+ {
+ for (; it + V::size() <= end; it += V::size()) {
+ fun(V([&](auto i) { return std::get<0>(it[i].as_tuple()); }),
+ V([&](auto i) { return std::get<1>(it[i].as_tuple()); }));
+ }
+ if (it != end) {
+ fun(V([&](auto i) {
+ auto ii = it + i < end ? i + 0 : 0;
+ return std::get<0>(it[ii].as_tuple());
+ }),
+ V([&](auto i) {
+ auto ii = it + i < end ? i + 0 : 0;
+ return std::get<1>(it[ii].as_tuple());
+ }));
+ }
+ }
+
+public:
+ viewer(It _it, End _end) : it(_it), end(_end) {}
+
+ template <class F> void for_each(F &&fun) {
+ constexpr size_t N =
+ std::tuple_size<std::decay_t<decltype(it->as_tuple())>>::value;
+ for_each_impl(std::forward<F>(fun), std::make_index_sequence<N>());
+ }
+};
+
+template <class V, class Cont>
+viewer<V, imported_begin_end::begin_type<const Cont &>,
+ imported_begin_end::end_type<const Cont &>>
+simd_view(const Cont &data)
+{
+ using std::begin;
+ using std::end;
+ return {begin(data), end(data)};
+}
+} // namespace experimental
+_GLIBCXX_SIMD_END_NAMESPACE
+
+#endif // VC_TESTS_SIMD_VIEW_H_
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/test_values.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/test_values.h
new file mode 100644
index 00000000000..1327b814290
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/test_values.h
@@ -0,0 +1,227 @@
+#include <experimental/simd>
+#include <initializer_list>
+#include <random>
+#include <cfenv>
+
+template <class T, class A>
+std::experimental::simd<T, A> iif(std::experimental::simd_mask<T, A> k,
+ const typename std::experimental::simd_mask<T, A>::simd_type &t,
+ const std::experimental::simd<T, A> &f)
+{
+ auto r = f;
+ where(k, r) = t;
+ return r;
+}
+
+template <class V>
+V epilogue_load(const typename V::value_type *mem, const std::size_t size)
+{
+ const int rem = size % V::size();
+ return where(V([](int i) { return i; }) < rem, V(0))
+ .copy_from(mem + size / V::size() * V::size(), std::experimental::element_aligned);
+}
+
+template <class V, class... F>
+void test_values(const std::initializer_list<typename V::value_type> &inputs,
+ F &&... fun_pack)
+{
+ for (auto it = inputs.begin(); it + V::size() <= inputs.end(); it += V::size()) {
+ [](auto...) {}((fun_pack(V(&it[0], std::experimental::element_aligned)), 0)...);
+ }
+ [](auto...) {}((fun_pack(epilogue_load<V>(inputs.begin(), inputs.size())), 0)...);
+}
+
+template <class V> struct RandomValues {
+ using T = typename V::value_type;
+ using L = std::numeric_limits<T>;
+ static constexpr bool isfp = std::is_floating_point_v<T>;
+ const std::size_t count;
+ std::conditional_t<std::is_floating_point_v<T>,
+ std::uniform_real_distribution<T>,
+ std::uniform_int_distribution<T>>
+ dist;
+ const bool uniform;
+
+ RandomValues(std::size_t count_, T min, T max)
+ : count(count_), dist(min, max), uniform(true)
+ {
+ if constexpr (std::is_floating_point_v<T>)
+ VERIFY(max - min <= L::max());
+ }
+
+ RandomValues(std::size_t count_)
+ : count(count_), dist(isfp ? 1 : L::lowest(), isfp ? 2 : L::max()),
+ uniform(!isfp)
+ {
+ }
+
+ template <typename URBG> V operator()(URBG& gen)
+ {
+ if (uniform)
+ return V([&](int) { return dist(gen); });
+ else
+ {
+ auto exp_dist
+ = std::normal_distribution<float>(0.f, L::max_exponent * .5f);
+ return V([&](int) {
+ const T mant = dist(gen);
+ T fp = 0;
+ do
+ {
+ const int exp = exp_dist(gen);
+ fp = std::ldexp(mant, exp);
+ }
+ while (fp >= L::max() || fp <= L::denorm_min());
+ fp = gen() & 0x4 ? fp : -fp;
+ return fp;
+ });
+ }
+ }
+};
+
+static std::mt19937 g_mt_gen{0};
+
+template <class V, class... F>
+void
+test_values(const std::initializer_list<typename V::value_type>& inputs,
+ RandomValues<V> random, F&&... fun_pack)
+{
+ test_values<V>(inputs, fun_pack...);
+ for (size_t i = 0; i < (random.count + V::size() - 1) / V::size(); ++i)
+ {
+ [](auto...) {}((fun_pack(random(g_mt_gen)), 0)...);
+ }
+}
+
+template <class V, class... F>
+void test_values_2arg(const std::initializer_list<typename V::value_type> &inputs,
+ F &&... fun_pack)
+{
+ for (auto scalar_it = inputs.begin(); scalar_it != inputs.end(); ++scalar_it) {
+ for (auto it = inputs.begin(); it + V::size() <= inputs.end(); it += V::size()) {
+ [](auto...) {
+ }((fun_pack(V(&it[0], std::experimental::element_aligned), V(*scalar_it)), 0)...);
+ }
+ [](auto...) {
+ }((fun_pack(epilogue_load<V>(inputs.begin(), inputs.size()), V(*scalar_it)),
+ 0)...);
+ }
+}
+
+template <class V, class... F>
+void
+test_values_2arg(const std::initializer_list<typename V::value_type>& inputs,
+ RandomValues<V> random, F&&... fun_pack)
+{
+ test_values_2arg<V>(inputs, fun_pack...);
+ for (size_t i = 0; i < (random.count + V::size() - 1) / V::size(); ++i)
+ {
+ [](auto...) {}((fun_pack(random(g_mt_gen), random(g_mt_gen)), 0)...);
+ }
+}
+
+template <class V, class... F>
+void test_values_3arg(const std::initializer_list<typename V::value_type> &inputs,
+ F &&... fun_pack)
+{
+ for (auto scalar_it1 = inputs.begin(); scalar_it1 != inputs.end(); ++scalar_it1) {
+ for (auto scalar_it2 = inputs.begin(); scalar_it2 != inputs.end(); ++scalar_it2) {
+ for (auto it = inputs.begin(); it + V::size() <= inputs.end();
+ it += V::size()) {
+ [](auto...) {}((fun_pack(V(&it[0], std::experimental::element_aligned), V(*scalar_it1),
+ V(*scalar_it2)),
+ 0)...);
+ }
+ [](auto...) {}((fun_pack(epilogue_load<V>(inputs.begin(), inputs.size()),
+ V(*scalar_it1), V(*scalar_it2)),
+ 0)...);
+ }
+ }
+}
+
+template <class V, class... F>
+void
+test_values_3arg(const std::initializer_list<typename V::value_type>& inputs,
+ RandomValues<V> random, F&&... fun_pack)
+{
+ test_values_3arg<V>(inputs, fun_pack...);
+ for (size_t i = 0; i < (random.count + V::size() - 1) / V::size(); ++i)
+ {
+ [](auto...) {
+ }((fun_pack(random(g_mt_gen), random(g_mt_gen), random(g_mt_gen)), 0)...);
+ }
+}
+
+#define MAKE_TESTER_2(name_, reference_) \
+ [&](const auto... inputs) { \
+ const auto totest = name_(inputs...); \
+ using R = std::remove_const_t<decltype(totest)>; \
+ auto&& expected = [&](const auto&... vs) -> const R { \
+ R tmp = {}; \
+ for (std::size_t i = 0; i < R::size(); ++i) \
+ { \
+ tmp[i] = reference_(vs[i]...); \
+ } \
+ return tmp; \
+ }; \
+ const R expect1 = expected(inputs...); \
+ if constexpr (std::is_floating_point_v<typename R::value_type>) \
+ { \
+ ((COMPARE(isnan(totest), isnan(expect1)) << #name_ "(") \
+ << ... << inputs) \
+ << ") = " << totest << " != " << expect1; \
+ const R expect2 = expected(iif(isnan(expect1), 0, inputs)...); \
+ ((FUZZY_COMPARE(name_(iif(isnan(expect1), 0, inputs)...), expect2) \
+ << "\nclean = ") \
+ << ... << iif(isnan(expect1), 0, inputs)); \
+ } \
+ else \
+ { \
+ ((COMPARE(name_(inputs...), expect1) << "\n" #name_ "(") \
+ << ... << inputs) \
+ << ")"; \
+ } \
+ }
+
+#define MAKE_TESTER(name_) MAKE_TESTER_2(name_, std::name_)
+
+#define MAKE_TESTER_NOFPEXCEPT(name_) \
+ [&](const auto... inputs) { \
+ std::feclearexcept(FE_ALL_EXCEPT); \
+ auto totest = name_(inputs...); \
+ ((COMPARE(std::fetestexcept(FE_ALL_EXCEPT), 0) << "\n" #name_ "(") \
+ << ... << inputs) \
+ << ")"; \
+ using R = std::remove_const_t<decltype(totest)>; \
+ auto&& expected = [&](const auto&... vs) -> const R { \
+ R tmp = {}; \
+ for (std::size_t i = 0; i < R::size(); ++i) \
+ { \
+ tmp[i] = std::name_(vs[i]...); \
+ } \
+ return tmp; \
+ }; \
+ const R expect1 = expected(inputs...); \
+ if constexpr (std::is_floating_point_v<typename R::value_type>) \
+ { \
+ ((COMPARE(isnan(totest), isnan(expect1)) << #name_ "(") \
+ << ... << inputs) \
+ << ") = " << totest << " != " << expect1; \
+ const R expect2 = expected(iif(isnan(expect1), 0, inputs)...); \
+ std::feclearexcept(FE_ALL_EXCEPT); \
+ asm volatile(""); \
+ totest = name_(iif(isnan(expect1), 0, inputs)...); \
+ asm volatile(""); \
+ ((COMPARE(std::fetestexcept(FE_ALL_EXCEPT), 0) << "\n" #name_ "(") \
+ << ... << inputs) \
+ << ")"; \
+ FUZZY_COMPARE(totest, expect2); \
+ } \
+ else \
+ { \
+ ((COMPARE(totest, expect1) << "\n" #name_ "(") << ... << inputs) \
+ << ")"; \
+ } \
+ }
+
+// vim: foldmethod=marker ts=8 sw=2 noet sts=2
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/ulp.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/ulp.h
new file mode 100644
index 00000000000..0c61255d381
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/ulp.h
@@ -0,0 +1,113 @@
+/*{{{
+Copyright © 2011-2018 Matthias Kretz <kretz@kde.org>
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+ * Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+ * Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+ * Neither the names of contributing organizations nor the
+ names of its contributors may be used to endorse or promote products
+ derived from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR ANY
+DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+}}}*/
+
+#ifndef ULP_H
+#define ULP_H
+
+#include <cmath>
+#include <experimental/simd>
+#include <limits>
+#include <type_traits>
+#include <cfenv>
+
+namespace vir {
+namespace test {
+template <typename T, typename R = typename T::value_type>
+R
+value_type_impl(int);
+
+template <typename T>
+T
+value_type_impl(float);
+
+template <typename T> using value_type_t = decltype(value_type_impl<T>(int()));
+
+template <typename T>
+inline T
+ulp_distance(const T& val_, const T& ref_)
+{
+ if constexpr (std::is_floating_point_v<value_type_t<T>>)
+ {
+ const int fp_exceptions = std::fetestexcept(FE_ALL_EXCEPT);
+ T val = val_;
+ T ref = ref_;
+
+ T diff = T();
+
+ using std::abs;
+ using std::fpclassify;
+ using std::frexp;
+ using std::isnan;
+ using std::isinf;
+ using std::ldexp;
+ using std::max;
+ using std::experimental::where;
+ using limits = std::numeric_limits<value_type_t<T>>;
+
+ where(ref == 0, val) = abs(val);
+ where(ref == 0, diff) = 1;
+ where(ref == 0, ref) = limits::min();
+ where(isinf(ref) && ref == val, ref)
+ = 0; // where(val_ == ref_) = 0 below will fix it up
+
+ where(val == 0, ref) = abs(ref);
+ where(val == 0, diff) += 1;
+ where(val == 0, val) = limits::min();
+
+ using I = decltype(fpclassify(std::declval<T>()));
+ I exp = {};
+ frexp(ref, &exp);
+ // lower bound for exp must be min_exponent to scale the resulting
+ // difference from a denormal correctly
+ exp = max(exp, I(limits::min_exponent));
+ diff += ldexp(abs(ref - val), limits::digits - exp);
+ where(val_ == ref_ || (isnan(val_) && isnan(ref_)), diff) = T();
+ std::feclearexcept(FE_ALL_EXCEPT ^ fp_exceptions);
+ return diff;
+ }
+ else
+ {
+ if (val_ > ref_)
+ return val_ - ref_;
+ else
+ return ref_ - val_;
+ }
+}
+
+template <typename T>
+inline T
+ulp_distance_signed(const T& _val, const T& _ref)
+{
+ using std::copysign;
+ return copysign(ulp_distance(_val, _ref), _val - _ref);
+}
+} // namespace test
+} // namespace vir
+
+#endif // ULP_H
+
+// vim: sw=2 et sts=2 foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h
new file mode 100644
index 00000000000..eca79417e09
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h
@@ -0,0 +1,250 @@
+#ifndef TESTS_BITS_VERIFY_H_
+#define TESTS_BITS_VERIFY_H_
+
+#include <experimental/simd>
+#include <sstream>
+#include <iomanip>
+#include "ulp.h"
+
+#ifdef _GLIBCXX_SIMD_HAVE_NEON
+// work around PR89357:
+#define alignas(...) __attribute__((aligned(__VA_ARGS__)))
+#endif
+
+using schar = signed char;
+using uchar = unsigned char;
+using ushort = unsigned short;
+using uint = unsigned int;
+using ulong = unsigned long;
+using llong = long long;
+using ullong = unsigned long long;
+using ldouble = long double;
+using wchar = wchar_t;
+using char16 = char16_t;
+using char32 = char32_t;
+
+template <class T>
+T
+make_value_unknown(const T& x)
+{
+ if constexpr (std::is_constructible_v<T, const volatile T&>)
+ {
+ const volatile T& y = x;
+ return y;
+ }
+ else
+ {
+ T y = x;
+ asm("" : "+m"(y));
+ return y;
+ }
+}
+
+class verify
+{
+ const bool m_failed = false;
+
+ template <typename T,
+ typename = decltype(std::declval<std::stringstream&>()
+ << std::declval<const T&>())>
+ void print(const T& x, int) const
+ {
+ std::stringstream ss;
+ ss << x;
+ __builtin_fprintf(stderr, "%s", ss.str().c_str());
+ }
+
+ template <typename T>
+ void print(const T& x, ...) const
+ {
+ if constexpr (std::experimental::is_simd_v<T>)
+ {
+ std::stringstream ss;
+ if constexpr (std::is_floating_point_v<typename T::value_type>)
+ {
+ ss << "\n(" << x[0] << " == " << std::hexfloat << x[0]
+ << std::defaultfloat << ')';
+ for (unsigned i = 1; i < x.size(); ++i)
+ {
+ ss << (i % 4 == 0 ? ",\n(" : ", (") << x[i]
+ << " == " << std::hexfloat << x[i] << std::defaultfloat
+ << ')';
+ }
+ }
+ else
+ {
+ ss << +x[0];
+ for (unsigned i = 1; i < x.size(); ++i)
+ {
+ ss << ", " << +x[i];
+ }
+ }
+ __builtin_fprintf(stderr, "%s", ss.str().c_str());
+ }
+ else if constexpr (std::experimental::is_simd_mask_v<T>)
+ {
+ __builtin_fprintf(stderr, (x[0] ? "[1" : "[0"));
+ for (unsigned i = 1; i < x.size(); ++i)
+ {
+ __builtin_fprintf(stderr, (x[i] ? "1" : "0"));
+ }
+ __builtin_fprintf(stderr, "]");
+ }
+ else
+ {
+ print_hex(&x, sizeof(T));
+ }
+ }
+
+ void print_hex(const void* x, std::size_t n) const
+ {
+ __builtin_fprintf(stderr, "0x");
+ const auto* bytes = static_cast<const unsigned char*>(x);
+ for (std::size_t i = 0; i < n; ++i)
+ {
+ __builtin_fprintf(stderr, (i && i % 4 == 0) ? "'%02x" : "%02x",
+ bytes[i]);
+ }
+ }
+
+public:
+ template <typename... Ts>
+ verify(bool ok,
+ const char* file,
+ const int line,
+ const char* func,
+ const char* cond,
+ const Ts&... extra_info)
+ : m_failed(!ok)
+ {
+ if (m_failed)
+ {
+ __builtin_fprintf(stderr, "%s:%d: (%s): Assertion '%s' failed.\n", file,
+ line, func, cond);
+ auto &&unused [[maybe_unused]] = {0, (print(extra_info, int()), 0)...};
+ }
+ }
+
+ ~verify()
+ {
+ if (m_failed)
+ {
+ __builtin_fprintf(stderr, "\n");
+ __builtin_abort();
+ }
+ }
+
+ template <typename T>
+ const verify& operator<<(const T& x) const
+ {
+ if (m_failed)
+ {
+ print(x, int());
+ }
+ return *this;
+ }
+};
+
+#define COMPARE(_a, _b) \
+ [&](auto&& _aa, auto&& _bb) { \
+ return verify(std::experimental::all_of(_aa == _bb), __FILE__, __LINE__, \
+ __PRETTY_FUNCTION__, "all_of(" #_a " == " #_b ")", \
+ #_a " = ", _aa, "\n" #_b " = ", _bb); \
+ }((_a), (_b))
+
+#define VERIFY(_test) \
+ verify(_test, __FILE__, __LINE__, __PRETTY_FUNCTION__, #_test)
+
+// ulp_distance_signed can raise FP exceptions, so it is evaluated only when
+// the comparison fails
+#define ULP_COMPARE(_a, _b, _allowed_distance) \
+ [&](auto&& _aa, auto&& _bb) { \
+ const bool success = std::experimental::all_of( \
+ vir::test::ulp_distance(_aa, _bb) <= (_allowed_distance)); \
+ return verify(success, __FILE__, __LINE__, __PRETTY_FUNCTION__, \
+ "all_of(" #_a " ~~ " #_b ")", #_a " = ", _aa, \
+ "\n" #_b " = ", _bb, "\ndistance = ", \
+ success ? 0 : vir::test::ulp_distance_signed(_aa, _bb)); \
+ }((_a), (_b))
+
+namespace vir
+{
+namespace test
+{
+ template <typename T>
+ inline T _S_fuzzyness = 0;
+ template <typename T>
+ void setFuzzyness(T x)
+ {
+ _S_fuzzyness<T> = x;
+ }
+} // namespace test
+} // namespace vir
+
+#define FUZZY_COMPARE(_a, _b) \
+ ULP_COMPARE( \
+ _a, _b, \
+ vir::test::_S_fuzzyness<vir::test::value_type_t<decltype((_a) + (_b))>>)
+
+template <typename V>
+void test();
+template <typename V>
+void invoke_test(...)
+{
+}
+template <typename V, typename = decltype(V())>
+void invoke_test(int)
+{
+ test<V>();
+ __builtin_fprintf(stderr, "PASS: %s\n", __PRETTY_FUNCTION__);
+}
+
+template <class T> void iterate_abis()/*{{{*/
+{
+ using namespace std::experimental::parallelism_v2;
+#ifndef TESTFIXEDSIZE
+ invoke_test<simd<T, simd_abi::scalar>>(int());
+ invoke_test<simd<T, simd_abi::_VecBuiltin<16>>>(int());
+ invoke_test<simd<T, simd_abi::_VecBltnBtmsk<64>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<3>>>(int());
+#ifdef STRESSTEST
+ invoke_test<simd<T, simd_abi::_VecBuiltin<12>>>(int());
+ invoke_test<simd<T, simd_abi::_VecBuiltin<32>>>(int());
+ invoke_test<simd<T, simd_abi::_VecBltnBtmsk<56>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<4>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<12>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<24>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<28>>>(int());
+#endif
+#else
+ invoke_test<simd<T, simd_abi::fixed_size<1>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<2>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<5>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<6>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<7>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<8>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<9>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<10>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<11>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<13>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<14>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<15>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<16>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<17>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<18>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<19>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<20>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<21>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<22>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<23>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<25>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<26>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<27>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<29>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<30>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<31>>>(int());
+ invoke_test<simd<T, simd_abi::fixed_size<32>>>(int());
+#endif
+}/*}}}*/
+
+#endif // TESTS_BITS_VERIFY_H_
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/broadcast.h b/libstdc++-v3/testsuite/experimental/simd/tests/broadcast.h
new file mode 100644
index 00000000000..76ac8143fe1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/broadcast.h
@@ -0,0 +1,87 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+enum unscoped_enum
+{
+ foo
+};
+enum class scoped_enum
+{
+ bar
+};
+struct convertible
+{
+ operator int();
+ operator float();
+};
+
+template <typename V> void test()
+{
+ using T = typename V::value_type;
+ VERIFY(std::experimental::is_simd_v<V>);
+ VERIFY(std::experimental::is_abi_tag_v<typename V::abi_type>);
+
+ {
+ V x; // not initialized
+ x = V{}; // default broadcasts 0
+ COMPARE(x, V(0));
+ COMPARE(x, V());
+ COMPARE(x, V{});
+ x = V(); // default broadcasts 0
+ COMPARE(x, V(0));
+ COMPARE(x, V());
+ COMPARE(x, V{});
+ x = 0;
+ COMPARE(x, V(0));
+ COMPARE(x, V());
+ COMPARE(x, V{});
+
+ for (std::size_t i = 0; i < V::size(); ++i)
+ {
+ COMPARE(T(x[i]), T(0)) << "i = " << i;
+ COMPARE(x[i], T(0)) << "i = " << i;
+ }
+ }
+
+ V x = 3;
+ V y = T(0);
+ for (std::size_t i = 0; i < V::size(); ++i)
+ {
+ COMPARE(x[i], T(3)) << "i = " << i;
+ COMPARE(y[i], T(0)) << "i = " << i;
+ }
+ y = 3;
+ COMPARE(x, y);
+
+ VERIFY(!(is_substitution_failure<V&, unscoped_enum, assignment>) );
+ VERIFY((is_substitution_failure<V&, scoped_enum, assignment>) );
+ COMPARE((is_substitution_failure<V&, convertible, assignment>),
+ (!std::is_convertible<convertible, T>::value));
+ COMPARE((is_substitution_failure<V&, long double, assignment>),
+ (sizeof(long double) > sizeof(T) || std::is_integral<T>::value));
+ COMPARE((is_substitution_failure<V&, double, assignment>),
+ (sizeof(double) > sizeof(T) || std::is_integral<T>::value));
+ COMPARE((is_substitution_failure<V&, float, assignment>),
+ (sizeof(float) > sizeof(T) || std::is_integral<T>::value));
+ COMPARE((is_substitution_failure<V&, long long, assignment>),
+ (has_less_bits<T, long long>() || std::is_unsigned<T>::value));
+ COMPARE((is_substitution_failure<V&, unsigned long long, assignment>),
+ (has_less_bits<T, unsigned long long>()));
+ COMPARE((is_substitution_failure<V&, long, assignment>),
+ (has_less_bits<T, long>() || std::is_unsigned<T>::value));
+ COMPARE((is_substitution_failure<V&, unsigned long, assignment>),
+ (has_less_bits<T, unsigned long>()));
+ // int broadcast *always* works:
+ VERIFY(!(is_substitution_failure<V&, int, assignment>) );
+ // uint broadcast works for any unsigned T:
+ COMPARE((is_substitution_failure<V&, unsigned int, assignment>),
+ (!std::is_unsigned<T>::value && has_less_bits<T, unsigned int>()));
+ COMPARE((is_substitution_failure<V&, short, assignment>),
+ (has_less_bits<T, short>() || std::is_unsigned<T>::value));
+ COMPARE((is_substitution_failure<V&, unsigned short, assignment>),
+ (has_less_bits<T, unsigned short>()));
+ COMPARE((is_substitution_failure<V&, signed char, assignment>),
+ (has_less_bits<T, signed char>() || std::is_unsigned<T>::value));
+ COMPARE((is_substitution_failure<V&, unsigned char, assignment>),
+ (has_less_bits<T, unsigned char>()));
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/casts.h b/libstdc++-v3/testsuite/experimental/simd/tests/casts.h
new file mode 100644
index 00000000000..ecf757a4fc7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/casts.h
@@ -0,0 +1,132 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/conversions.h"
+
+using std::experimental::simd_cast;
+using std::experimental::static_simd_cast;
+
+template <class T, size_t N> struct gen_cast
+{
+ std::array<T, N> data;
+ template <class V> gen_cast(const V& v)
+ {
+ for (size_t i = 0; i < V::size(); ++i)
+ {
+ data[i] = static_cast<T>(v[i]);
+ }
+ }
+ template <class I> constexpr T operator()(I) { return data[I::value]; }
+};
+
+template <class V, class To> struct gen_seq_t
+{
+ using From = typename V::value_type;
+ const size_t N = cvt_input_data<From, To>.size();
+ size_t offset = 0;
+ constexpr void operator++() { offset += V::size(); }
+ explicit constexpr operator bool() const { return offset < N; }
+ template <class I> constexpr From operator()(I) const
+ {
+ size_t i = I::value + offset;
+ return i < N ? cvt_input_data<From, To>[i] : From(i);
+ }
+};
+
+template <class To> struct foo
+{
+ template <class T> auto operator()(const T& v) -> decltype(simd_cast<To>(v));
+};
+
+template <typename V, typename To>
+void
+casts()
+{
+ using From = typename V::value_type;
+ constexpr auto N = V::size();
+ if constexpr (N <= std::experimental::simd_abi::max_fixed_size<To>)
+ {
+ using W = std::experimental::fixed_size_simd<To, N>;
+
+ if constexpr (std::is_integral_v<From>)
+ {
+ using A = typename V::abi_type;
+ using TU = std::make_unsigned_t<From>;
+ using TS = std::make_signed_t<From>;
+ COMPARE(typeid(static_simd_cast<TU>(V())),
+ typeid(std::experimental::simd<TU, A>));
+ COMPARE(typeid(static_simd_cast<TS>(V())),
+ typeid(std::experimental::simd<TS, A>));
+ }
+
+ using is_simd_cast_allowed
+ = decltype(vir::test::sfinae_is_callable_t<const V&>(foo<To>()));
+
+ COMPARE(
+ is_simd_cast_allowed::value,
+ std::numeric_limits<From>::digits <= std::numeric_limits<To>::digits
+ && std::numeric_limits<From>::max() <= std::numeric_limits<To>::max()
+ && !(std::is_signed<From>::value && std::is_unsigned<To>::value));
+
+ if constexpr (is_simd_cast_allowed::value)
+ {
+ for (gen_seq_t<V, To> gen_seq; gen_seq; ++gen_seq)
+ {
+ const V seq(gen_seq);
+ COMPARE(simd_cast<V>(seq), seq);
+ COMPARE(simd_cast<W>(seq), W(gen_cast<To, N>(seq)))
+ << "seq = " << seq;
+ auto test = simd_cast<To>(seq);
+ // decltype(test) is not W if
+ // a) V::abi_type is not fixed_size and
+ // b.1) V::value_type and To are integral and of equal rank or
+ // b.2) V::value_type and To are equal
+ COMPARE(test, decltype(test)(gen_cast<To, N>(seq)));
+ if (std::is_same<To, From>::value)
+ {
+ COMPARE(typeid(decltype(test)), typeid(V));
+ }
+ }
+ }
+
+ for (gen_seq_t<V, To> gen_seq; gen_seq; ++gen_seq)
+ {
+ const V seq(gen_seq);
+ COMPARE(static_simd_cast<V>(seq), seq);
+ COMPARE(static_simd_cast<W>(seq), W(gen_cast<To, N>(seq))) << '\n'
+ << seq;
+ auto test = static_simd_cast<To>(seq);
+ // decltype(test) is not W if
+ // a) V::abi_type is not fixed_size and
+ // b.1) V::value_type and To are integral and of equal rank or
+ // b.2) V::value_type and To are equal
+ COMPARE(test, decltype(test)(gen_cast<To, N>(seq)));
+ if (std::is_same<To, From>::value)
+ {
+ COMPARE(typeid(decltype(test)), typeid(V));
+ }
+ }
+ }
+}
+
+template <typename V>
+void
+test()
+{
+ casts<V, long double>();
+ casts<V, double>();
+ casts<V, float>();
+ casts<V, long long>();
+ casts<V, unsigned long long>();
+ casts<V, unsigned long>();
+ casts<V, long>();
+ casts<V, int>();
+ casts<V, unsigned int>();
+ casts<V, short>();
+ casts<V, unsigned short>();
+ casts<V, char>();
+ casts<V, signed char>();
+ casts<V, unsigned char>();
+ casts<V, char32_t>();
+ casts<V, char16_t>();
+ casts<V, wchar_t>();
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/fpclassify.h b/libstdc++-v3/testsuite/experimental/simd/tests/fpclassify.h
new file mode 100644
index 00000000000..bd9aff386ad
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/fpclassify.h
@@ -0,0 +1,64 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/test_values.h"
+#include <cfenv>
+
+template <typename F>
+auto
+verify_no_fp_exceptions(F&& fun)
+{
+ std::feclearexcept(FE_ALL_EXCEPT);
+ auto r = fun();
+ COMPARE(std::fetestexcept(FE_ALL_EXCEPT), 0);
+ return r;
+}
+
+#define NOFPEXCEPT(...) verify_no_fp_exceptions([&]() { return __VA_ARGS__; })
+
+template <typename V>
+void
+test()
+{
+ using limits = std::numeric_limits<typename V::value_type>;
+ test_values<V>(
+ {
+ 0., 1., -1.,
+#if __GCC_IEC_559 >= 2
+ -0., limits::infinity(), -limits::infinity(), limits::denorm_min(),
+ -limits::denorm_min(), limits::quiet_NaN(),
+#ifdef __SUPPORT_SNAN__
+ limits::signaling_NaN(),
+#endif
+#endif
+ limits::max(), -limits::max(), limits::min(), limits::min() * 0.9,
+ -limits::min(), -limits::min() * 0.9
+ },
+ [](const V input) {
+ using intv = std::experimental::fixed_size_simd<int, V::size()>;
+ COMPARE(NOFPEXCEPT(isfinite(input)),
+ !V([&](auto i) { return std::isfinite(input[i]) ? 0 : 1; }))
+ << input;
+ COMPARE(NOFPEXCEPT(isinf(input)),
+ !V([&](auto i) { return std::isinf(input[i]) ? 0 : 1; }))
+ << input;
+ COMPARE(NOFPEXCEPT(isnan(input)),
+ !V([&](auto i) { return std::isnan(input[i]) ? 0 : 1; }))
+ << input;
+ COMPARE(NOFPEXCEPT(isnormal(input)),
+ !V([&](auto i) { return std::isnormal(input[i]) ? 0 : 1; }))
+ << input;
+ COMPARE(NOFPEXCEPT(signbit(input)),
+ !V([&](auto i) { return std::signbit(input[i]) ? 0 : 1; }))
+ << input;
+ COMPARE(NOFPEXCEPT(isunordered(input, V())),
+ !V([&](auto i) { return std::isunordered(input[i], 0) ? 0 : 1; }))
+ << input;
+ COMPARE(NOFPEXCEPT(isunordered(V(), input)),
+ !V([&](auto i) { return std::isunordered(0, input[i]) ? 0 : 1; }))
+ << input;
+ COMPARE(NOFPEXCEPT(fpclassify(input)),
+ intv([&](auto i) { return std::fpclassify(input[i]); }))
+ << input;
+ });
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/frexp.h b/libstdc++-v3/testsuite/experimental/simd/tests/frexp.h
new file mode 100644
index 00000000000..deafa2f1296
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/frexp.h
@@ -0,0 +1,88 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+ using int_v = std::experimental::fixed_size_simd<int, V::size()>;
+ using limits = std::numeric_limits<typename V::value_type>;
+ test_values<V>(
+ {
+ 0, 0.25, 0.5, 1, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
+ 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 32, 31, -0., -0.25, -0.5, -1,
+ -3, -4, -6, -7, -8, -9, -10, -11, -12, -13, -14, -15, -16, -17, -18,
+ -19, -20, -21, -22, -23, -24, -25, -26, -27, -28, -29, -32, -31,
+#if __GCC_IEC_559 >= 2
+ limits::denorm_min(), -limits::denorm_min(), limits::min() / 2,
+ -limits::min() / 2,
+#endif
+ limits::max(), -limits::max(), limits::max() * 0.123f,
+ -limits::max() * 0.123f
+ },
+ [](const V input) {
+ V expectedFraction;
+ const int_v expectedExponent([&](auto i) {
+ int exp;
+ expectedFraction[i] = std::frexp(input[i], &exp);
+ return exp;
+ });
+ int_v exponent = {};
+ const V fraction = frexp(input, &exponent);
+ COMPARE(fraction, expectedFraction)
+ << ", input = " << input << ", delta: " << fraction - expectedFraction;
+ COMPARE(exponent, expectedExponent)
+ << "\ninput: " << input << ", fraction: " << fraction;
+ });
+#ifdef __STDC_IEC_559__
+ test_values<V>(
+ // If x is a NaN, a NaN is returned, and the value of *exp is unspecified.
+ //
+ // If x is positive infinity (negative infinity), positive infinity
+ // (negative infinity) is returned, and the value of *exp is unspecified.
+ // This behavior is only guaranteed with C's Annex F when __STDC_IEC_559__
+ // is defined.
+ {limits::quiet_NaN(),
+ limits::infinity(),
+ -limits::infinity(),
+ limits::quiet_NaN(),
+ limits::infinity(),
+ -limits::infinity(),
+ limits::quiet_NaN(),
+ limits::infinity(),
+ -limits::infinity(),
+ limits::quiet_NaN(),
+ limits::infinity(),
+ -limits::infinity(),
+ limits::quiet_NaN(),
+ limits::infinity(),
+ -limits::infinity(),
+ limits::denorm_min(),
+ limits::denorm_min() * 1.72,
+ -limits::denorm_min(),
+ -limits::denorm_min() * 1.72,
+ 0.,
+ -0.,
+ 1,
+ -1},
+ [](const V input) {
+ const V expectedFraction([&](auto i) {
+ int exp;
+ return std::frexp(input[i], &exp);
+ });
+ int_v exponent = {};
+ const V fraction = frexp(input, &exponent);
+ COMPARE(isnan(fraction), isnan(expectedFraction))
+ << fraction << ", input = " << input
+ << ", delta: " << fraction - expectedFraction;
+ COMPARE(isinf(fraction), isinf(expectedFraction))
+ << fraction << ", input = " << input
+ << ", delta: " << fraction - expectedFraction;
+ COMPARE(signbit(fraction), signbit(expectedFraction))
+ << fraction << ", input = " << input
+ << ", delta: " << fraction - expectedFraction;
+ });
+#endif
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/generator.h b/libstdc++-v3/testsuite/experimental/simd/tests/generator.h
new file mode 100644
index 00000000000..5b824772962
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/generator.h
@@ -0,0 +1,39 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+template <class V> struct call_generator
+{
+ template <class F> auto operator()(const F& f) -> decltype(V(f));
+};
+
+using schar = signed char;
+using uchar = unsigned char;
+using ullong = unsigned long long;
+
+template <typename V>
+void
+test()
+{
+ using T = typename V::value_type;
+ V x([](int) { return T(1); });
+ COMPARE(x, V(1));
+ x = V(
+ [](int) { return 1; }); // unconditionally returns int from generator lambda
+ COMPARE(x, V(1));
+ x = V([](auto i) { return T(i); });
+ COMPARE(x, V([](T i) { return i; }));
+
+ VERIFY((
+ sfinae_is_callable<int (&)(int)>(call_generator<V>()))); // int always works
+ COMPARE(sfinae_is_callable<schar (&)(int)>(call_generator<V>()),
+ std::is_signed<T>::value);
+ COMPARE(sfinae_is_callable<uchar (&)(int)>(call_generator<V>()),
+ !(std::is_signed_v<T> && sizeof(T) <= sizeof(uchar)));
+ COMPARE(sfinae_is_callable<float (&)(int)>(call_generator<V>()),
+ (std::is_floating_point<T>::value));
+
+ COMPARE(sfinae_is_callable<ullong (&)(int)>(call_generator<V>()),
+ std::numeric_limits<T>::max() >= std::numeric_limits<ullong>::max()
+ && std::numeric_limits<T>::digits
+ >= std::numeric_limits<ullong>::digits);
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.h b/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.h
new file mode 100644
index 00000000000..b06abe31b46
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.h
@@ -0,0 +1,131 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+ vir::test::setFuzzyness<float>(1);
+ vir::test::setFuzzyness<double>(1);
+ vir::test::setFuzzyness<long double>(2); // because of the bad reference
+
+ using T = typename V::value_type;
+ using limits = std::numeric_limits<T>;
+ // 3-arg std::hypot needs to be fixed, this is a better reference:
+ auto&& hypot3 = [](T x, T y, T z) -> T {
+ x = std::abs(x);
+ y = std::abs(y);
+ z = std::abs(z);
+ if (std::isinf(x) || std::isinf(y) || std::isinf(z))
+ {
+ return limits::infinity();
+ }
+ else if (std::isnan(x) || std::isnan(y) || std::isnan(z))
+ {
+ return limits::quiet_NaN();
+ }
+ else if (x == y && y == z)
+ {
+ return x * std::sqrt(T(3));
+ }
+ else if (z == 0 && y == 0)
+ return x;
+ else if (x == 0 && z == 0)
+ return y;
+ else if (x == 0 && y == 0)
+ return z;
+ else if (x == 0)
+ return std::hypot(y, z);
+ else if (y == 0)
+ return std::hypot(x, z);
+ else if (z == 0)
+ return std::hypot(x, y);
+ else
+ {
+ long double hi = std::max(std::max(x, y), z);
+ long double lo0 = std::min(std::max(x, y), z);
+ long double lo1 = std::min(x, y);
+ if (std::isinf(x * x + y * y + z * z) || 0 == (lo0 * lo0 + lo1 * lo1))
+ {
+ lo0 /= hi;
+ lo1 /= hi;
+ return std::abs(hi) * std::sqrt(1 + (lo0 * lo0 + lo1 * lo1));
+ }
+ else
+ {
+ return std::sqrt(hi * hi + (lo0 * lo0 + lo1 * lo1));
+ }
+ }
+ };
+ test_values_3arg<V>(
+ {
+#ifdef __STDC_IEC_559__
+ limits::quiet_NaN(), limits::infinity(), -limits::infinity(),
+ limits::min() / 3, -0., limits::denorm_min(),
+#endif
+ 0., 1., -1., limits::min(), limits::max(), -limits::max()},
+ {100000}, MAKE_TESTER_2(hypot, hypot3));
+ COMPARE(hypot(V(limits::max()), V(limits::max()), V()),
+ V(limits::infinity()));
+ COMPARE(hypot(V(limits::max()), V(), V(limits::max())),
+ V(limits::infinity()));
+ COMPARE(hypot(V(), V(limits::max()), V(limits::max())),
+ V(limits::infinity()));
+ COMPARE(hypot(V(limits::min()), V(limits::min()), V(limits::min())),
+ V(limits::min() * std::sqrt(T(3))));
+ VERIFY((sfinae_is_callable<V, V, V>(
+ [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<T, T, V>(
+ [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<V, T, T>(
+ [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<T, V, T>(
+ [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<T, V, V>(
+ [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<V, T, V>(
+ [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<V, V, T>(
+ [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<int, int, V>(
+ [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<int, V, int>(
+ [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<V, T, int>(
+ [](auto a, auto b, auto c) -> decltype(hypot(a, b, c)) { return {}; })));
+
+ vir::test::setFuzzyness<float>(0);
+ vir::test::setFuzzyness<double>(0);
+ test_values_3arg<V>(
+ {
+#ifdef __STDC_IEC_559__
+ limits::quiet_NaN(), limits::infinity(), -limits::infinity(), -0.,
+ limits::min() / 3, limits::denorm_min(),
+#endif
+ 0., limits::min(), limits::max()},
+ {10000, -limits::max() / 2, limits::max() / 2}, MAKE_TESTER(fma));
+ VERIFY((sfinae_is_callable<V, V, V>(
+ [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<T, T, V>(
+ [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<V, T, T>(
+ [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<T, V, T>(
+ [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<T, V, V>(
+ [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<V, T, V>(
+ [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<V, V, T>(
+ [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<int, int, V>(
+ [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<int, V, int>(
+ [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+ VERIFY((sfinae_is_callable<V, T, int>(
+ [](auto a, auto b, auto c) -> decltype(fma(a, b, c)) { return {}; })));
+}
+
+// vim: ts=8 noet sw=2 sts=2
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/integer_operators.h b/libstdc++-v3/testsuite/experimental/simd/tests/integer_operators.h
new file mode 100644
index 00000000000..384bfa4d897
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/integer_operators.h
@@ -0,0 +1,220 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3. If not see
+// <http://www.gnu.org/licenses/>.
+
+#include "bits/verify.h"
+#include "bits/make_vec.h"
+#include "bits/metahelpers.h"
+
+// for_constexpr {{{1
+template <typename T, T Begin, T End, T Stride = 1, typename F>
+void
+for_constexpr(F&& fun)
+{
+ if constexpr (Begin <= End)
+ {
+ fun(std::integral_constant<T, Begin>());
+ if constexpr (Begin < End)
+ {
+ for_constexpr<T, Begin + Stride, End, Stride>(static_cast<F&&>(fun));
+ }
+ }
+}
+
+template <typename V>
+void
+test() //{{{1
+{
+ using T = typename V::value_type;
+ if constexpr (std::is_integral_v<T>)
+ {
+ constexpr int nbits(sizeof(T) * CHAR_BIT);
+ constexpr int n_promo_bits = std::max(nbits, int(sizeof(int) * CHAR_BIT));
+
+ // complement{{{2
+ COMPARE(~V(), V(~T()));
+ COMPARE(~V(~T()), V());
+
+ { // modulus{{{2
+ V x = make_vec<V>({3, 4}, 2);
+ COMPARE(x % x, V(0));
+ V y = x - 1;
+ COMPARE(x % y, V(1));
+ y = x + 1;
+ COMPARE(x % y, x);
+ if (std::is_signed<T>::value)
+ {
+ x = -x;
+ COMPARE(x % y, x);
+ x = -y;
+ COMPARE(x % y, V(0));
+ x = x - 1;
+ COMPARE(x % y, V(-1));
+ x %= y;
+ COMPARE(x, V(-1));
+ }
+ }
+
+ { // bit_and{{{2
+ V x = make_vec<V>({3, 4, 5}, 8);
+ COMPARE(x & x, x);
+ COMPARE(x & ~x, V());
+ COMPARE(x & V(), V());
+ COMPARE(V() & x, V());
+ V y = make_vec<V>({1, 5, 3}, 8);
+ COMPARE(x & y, make_vec<V>({1, 4, 1}, 8));
+ x &= y;
+ COMPARE(x, make_vec<V>({1, 4, 1}, 8));
+ }
+
+ { // bit_or{{{2
+ V x = make_vec<V>({3, 4, 5}, 8);
+ COMPARE(x | x, x);
+ COMPARE(x | ~x, ~V());
+ COMPARE(x | V(), x);
+ COMPARE(V() | x, x);
+ V y = make_vec<V>({1, 5, 3}, 8);
+ COMPARE(x | y, make_vec<V>({3, 5, 7}, 8));
+ x |= y;
+ COMPARE(x, make_vec<V>({3, 5, 7}, 8));
+ }
+
+ { // bit_xor{{{2
+ V x = make_vec<V>({3, 4, 5}, 8);
+ COMPARE(x ^ x, V());
+ COMPARE(x ^ ~x, ~V());
+ COMPARE(x ^ V(), x);
+ COMPARE(V() ^ x, x);
+ V y = make_vec<V>({1, 5, 3}, 8);
+ COMPARE(x ^ y, make_vec<V>({2, 1, 6}, 0));
+ x ^= y;
+ COMPARE(x, make_vec<V>({2, 1, 6}, 0));
+ }
+
+ { // bit_shift_left{{{2
+ // Note:
+ // - negative RHS or RHS >= max(#bits(T), #bits(int)) is UB
+ // - negative LHS is UB
+ // - shifting into (or over) the sign bit is UB
+ // - unsigned LHS overflow is modulo arithmetic
+ COMPARE(V() << 1, V());
+ for (int i = 0; i < nbits - 1; ++i)
+ {
+ COMPARE(V(1) << i, V(T(1) << i)) << "i: " << i;
+ }
+ for_constexpr<int, 0, n_promo_bits - 1>([](auto shift_ic) {
+ constexpr int shift = shift_ic;
+ const V seq = make_value_unknown(V([&](T i) {
+ if constexpr (std::is_signed_v<T>)
+ {
+ const T max = std::numeric_limits<T>::max() >> shift;
+ return max == 0 ? 1 : (std::abs(max - i) % max) + 1;
+ }
+ else
+ {
+ return ~T() - i;
+ }
+ }));
+ const V ref([&](T i) { return T(seq[i] << shift); });
+ COMPARE(seq << shift, ref) << "seq: " << seq << ", shift: " << shift;
+ COMPARE(seq << make_value_unknown(shift), ref)
+ << "seq: " << seq << ", shift: " << shift;
+ });
+ {
+ V seq = make_vec<V>({0, 1}, nbits - 2);
+ seq %= nbits - 1;
+ COMPARE(make_vec<V>({0, 1}, 0) << seq,
+ V([&](auto i) { return T(T(i & 1) << seq[i]); }))
+ << "seq = " << seq;
+ COMPARE(make_vec<V>({1, 0}, 0) << seq,
+ V([&](auto i) { return T(T(~i & 1) << seq[i]); }));
+ COMPARE(V(1) << seq, V([&](auto i) { return T(T(1) << seq[i]); }));
+ }
+ if (std::is_unsigned<T>::value)
+ {
+ constexpr int shift_count = nbits - 1;
+ COMPARE(V(1) << shift_count, V(T(1) << shift_count));
+ constexpr T max = // avoid overflow warning in the last COMPARE
+ std::is_unsigned<T>::value ? std::numeric_limits<T>::max() : T(1);
+ COMPARE(V(max) << shift_count, V(max << shift_count))
+ << "shift_count: " << shift_count;
+ }
+ }
+
+ { // bit_shift_right{{{2
+ // Note:
+ // - negative LHS is implementation-defined
+ // - negative RHS or RHS >= max(#bits(T), #bits(int)) is UB
+ // - no other UB
+ COMPARE(V(~T()) >> V(0), V(~T()));
+ COMPARE(V(~T()) >> V(make_value_unknown(0)), V(~T()));
+ for (int s = 1; s < nbits; ++s)
+ {
+ COMPARE(V(~T()) >> V(s), V(T(~T()) >> s)) << "s: " << s;
+ }
+ for (int s = 1; s < nbits; ++s)
+ {
+ COMPARE(V(~T(1)) >> V(s), V(T(~T(1)) >> s)) << "s: " << s;
+ }
+ COMPARE(V(0) >> V(1), V(0));
+ COMPARE(V(1) >> V(1), V(0));
+ COMPARE(V(2) >> V(1), V(1));
+ COMPARE(V(3) >> V(1), V(1));
+ COMPARE(V(7) >> V(2), V(1));
+ for (int j = 0; j < 100; ++j)
+ {
+ const V seq([&](auto i) -> T { return (j + i) % n_promo_bits; });
+ COMPARE(V(1) >> seq, V([&](auto i) { return T(T(1) >> seq[i]); }))
+ << "seq = " << seq;
+ COMPARE(make_value_unknown(V(1)) >> make_value_unknown(seq),
+ V([&](auto i) { return T(T(1) >> seq[i]); }))
+ << "seq = " << seq;
+ }
+ for_constexpr<int, 0, n_promo_bits - 1>([](auto shift_ic) {
+ constexpr int shift = shift_ic;
+ const V seq = make_value_unknown(V([&](int i) {
+ using U = std::make_unsigned_t<T>;
+ return T(~U() >> (i % 32));
+ }));
+ const V ref([&](T i) { return T(seq[i] >> shift); });
+ COMPARE(seq >> shift, ref) << "seq: " << seq << ", shift: " << shift;
+ COMPARE(seq >> make_value_unknown(shift), ref)
+ << "seq: " << seq << ", shift: " << shift;
+ });
+ }
+
+ //}}}2
+ }
+ else
+ {
+ VERIFY((is_substitution_failure<V, V, std::modulus<>>) );
+ VERIFY((is_substitution_failure<V, V, std::bit_and<>>) );
+ VERIFY((is_substitution_failure<V, V, std::bit_or<>>) );
+ VERIFY((is_substitution_failure<V, V, std::bit_xor<>>) );
+ VERIFY((is_substitution_failure<V, V, bit_shift_left>) );
+ VERIFY((is_substitution_failure<V, V, bit_shift_right>) );
+
+ VERIFY((is_substitution_failure<V&, V, assign_modulus>) );
+ VERIFY((is_substitution_failure<V&, V, assign_bit_and>) );
+ VERIFY((is_substitution_failure<V&, V, assign_bit_or>) );
+ VERIFY((is_substitution_failure<V&, V, assign_bit_xor>) );
+ VERIFY((is_substitution_failure<V&, V, assign_bit_shift_left>) );
+ VERIFY((is_substitution_failure<V&, V, assign_bit_shift_right>) );
+ }
+}
+// }}}1
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.h b/libstdc++-v3/testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.h
new file mode 100644
index 00000000000..8947d483a09
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.h
@@ -0,0 +1,135 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+ vir::test::setFuzzyness<float>(0);
+ vir::test::setFuzzyness<double>(0);
+
+ using limits = std::numeric_limits<typename V::value_type>;
+ test_values<V>(
+ {
+#ifdef __STDC_IEC_559__
+ limits::quiet_NaN(),
+ limits::infinity(),
+ -limits::infinity(),
+ -0.,
+ limits::denorm_min(),
+ limits::min() / 3,
+ -limits::denorm_min(),
+ -limits::min() / 3,
+#endif
+ +0.,
+ +1.3,
+ -1.3,
+ 2.1,
+ -2.1,
+ 0.99,
+ 0.9,
+ -0.9,
+ -0.99,
+ limits::min(),
+ limits::max(),
+ -limits::min(),
+ -limits::max()},
+ {10000, -limits::max() / 2, limits::max() / 2},
+ [](const V input) {
+ for (int exp : {-10000, -100, -10, -1, 0, 1, 10, 100, 10000})
+ {
+ const auto totest = ldexp(input, exp);
+ using R = std::remove_const_t<decltype(totest)>;
+ auto&& expected = [&](const auto& v) -> const R {
+ R tmp = {};
+ using std::ldexp;
+ for (std::size_t i = 0; i < R::size(); ++i)
+ {
+ tmp[i] = ldexp(v[i], exp);
+ }
+ return tmp;
+ };
+ const R expect1 = expected(input);
+ COMPARE(isnan(totest), isnan(expect1))
+ << "ldexp(" << input << ", " << exp << ") = " << totest
+ << " != " << expect1;
+ FUZZY_COMPARE(ldexp(iif(isnan(expect1), 0, input), exp),
+ expected(iif(isnan(expect1), 0, input)))
+ << "\nclean = " << iif(isnan(expect1), 0, input);
+ }
+ },
+ [](const V input) {
+ for (int exp : {-10000, -100, -10, -1, 0, 1, 10, 100, 10000})
+ {
+ const auto totest = scalbn(input, exp);
+ using R = std::remove_const_t<decltype(totest)>;
+ auto&& expected = [&](const auto& v) -> const R {
+ R tmp = {};
+ using std::scalbn;
+ for (std::size_t i = 0; i < R::size(); ++i)
+ {
+ tmp[i] = scalbn(v[i], exp);
+ }
+ return tmp;
+ };
+ const R expect1 = expected(input);
+ COMPARE(isnan(totest), isnan(expect1))
+ << "scalbn(" << input << ", " << exp << ") = " << totest
+ << " != " << expect1;
+ FUZZY_COMPARE(scalbn(iif(isnan(expect1), 0, input), exp),
+ expected(iif(isnan(expect1), 0, input)))
+ << "\nclean = " << iif(isnan(expect1), 0, input);
+ }
+ },
+ [](const V input) {
+ for (long exp : {-10000, -100, -10, -1, 0, 1, 10, 100, 10000})
+ {
+ const auto totest = scalbln(input, exp);
+ using R = std::remove_const_t<decltype(totest)>;
+ auto&& expected = [&](const auto& v) -> const R {
+ R tmp = {};
+ using std::scalbln;
+ for (std::size_t i = 0; i < R::size(); ++i)
+ {
+ tmp[i] = scalbln(v[i], exp);
+ }
+ return tmp;
+ };
+ const R expect1 = expected(input);
+ COMPARE(isnan(totest), isnan(expect1))
+ << "scalbln(" << input << ", " << exp << ") = " << totest
+ << " != " << expect1;
+ FUZZY_COMPARE(scalbln(iif(isnan(expect1), 0, input), exp),
+ expected(iif(isnan(expect1), 0, input)))
+ << "\nclean = " << iif(isnan(expect1), 0, input);
+ }
+ },
+ [](const V input) {
+ V integral = {};
+ const V totest = modf(input, &integral);
+ auto&& expected = [&](const auto& v) -> std::pair<const V, const V> {
+ std::pair<V, V> tmp = {};
+ using std::modf;
+ for (std::size_t i = 0; i < V::size(); ++i)
+ {
+ typename V::value_type tmp2;
+ tmp.first[i] = modf(v[i], &tmp2);
+ tmp.second[i] = tmp2;
+ }
+ return tmp;
+ };
+ const auto expect1 = expected(input);
+ COMPARE(isnan(totest), isnan(expect1.first))
+ << "modf(" << input << ", iptr) = " << totest << " != " << expect1;
+ COMPARE(isnan(integral), isnan(expect1.second))
+ << "modf(" << input << ", iptr) = " << totest << " != " << expect1;
+ COMPARE(isnan(totest), isnan(integral))
+ << "modf(" << input << ", iptr) = " << totest << " != " << expect1;
+ const V clean = iif(isnan(totest), 0, input);
+ const auto expect2 = expected(clean);
+ COMPARE(modf(clean, &integral), expect2.first) << "\nclean = " << clean;
+ COMPARE(integral, expect2.second);
+ });
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.h b/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.h
new file mode 100644
index 00000000000..74945532734
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.h
@@ -0,0 +1,209 @@
+#include "bits/verify.h"
+#include "bits/make_vec.h"
+#include "bits/conversions.h"
+
+template <typename V, typename U>
+void
+load_store()
+{
+ // types, tags, and constants {{{2
+ using T = typename V::value_type;
+ auto&& gen = make_vec<V>;
+ using std::experimental::element_aligned;
+ using std::experimental::vector_aligned;
+
+ // stride_alignment: consider V::size() == 6. The only reliable alignment is
+ // 2 * sizeof(U). I.e. if the first address is aligned to 8 * sizeof(U), then
+ // the next address is 6 * sizeof(U) larger, thus only aligned to 2 *
+ // sizeof(U).
+ // => the lowest set bit of V::size() determines the stride alignment
+ constexpr size_t stride_alignment = size_t(1) << __builtin_ctz(V::size());
+ using stride_aligned_t = std::conditional_t<
+ V::size() == stride_alignment, decltype(vector_aligned),
+ std::experimental::overaligned_tag<stride_alignment * sizeof(U)>>;
+ constexpr stride_aligned_t stride_aligned = {};
+ constexpr size_t alignment = 2 * std::experimental::memory_alignment_v<V, U>;
+ constexpr auto overaligned = std::experimental::overaligned<alignment>;
+ const V indexes_from_0([](auto i) { return i; });
+ for (std::size_t i = 0; i < V::size(); ++i)
+ {
+ COMPARE(indexes_from_0[i], T(i));
+ }
+
+ // loads {{{2
+ cvt_inputs<T, U> test_values;
+
+ constexpr auto mem_size
+ = test_values.size() > 3 * V::size() ? test_values.size() : 3 * V::size();
+ alignas(std::experimental::memory_alignment_v<V, U> * 2) U mem[mem_size] = {};
+ alignas(std::experimental::memory_alignment_v<V, T> * 2) T reference[mem_size]
+ = {};
+ for (std::size_t i = 0; i < test_values.size(); ++i)
+ {
+ const U value = test_values[i];
+ mem[i] = value;
+ reference[i] = static_cast<T>(value);
+ }
+ for (std::size_t i = test_values.size(); i < mem_size; ++i)
+ {
+ mem[i] = U(i);
+ reference[i] = mem[i];
+ }
+
+ V x(&mem[V::size()], stride_aligned);
+ auto&& compare = [&](const std::size_t offset) {
+ static int n = 0;
+ const V ref(&reference[offset], element_aligned);
+ for (auto i = 0ul; i < V::size(); ++i)
+ {
+ if (is_conversion_undefined<T>(mem[i + offset]))
+ {
+ continue;
+ }
+ COMPARE(x[i], reference[i + offset])
+ << "\nbefore conversion: " << mem[i + offset]
+ << "\n offset = " << offset << "\n x = " << x
+ << "\nreference = " << ref << "\nx == ref = " << (x == ref)
+ << "\ncall no. " << n;
+ }
+ ++n;
+ };
+ compare(V::size());
+ x = V{mem, overaligned};
+ compare(0);
+ x = {&mem[1], element_aligned};
+ compare(1);
+
+ x.copy_from(&mem[V::size()], stride_aligned);
+ compare(V::size());
+ x.copy_from(&mem[1], element_aligned);
+ compare(1);
+ x.copy_from(mem, vector_aligned);
+ compare(0);
+
+ for (std::size_t i = 0; i < mem_size - V::size(); ++i)
+ {
+ x.copy_from(&mem[i], element_aligned);
+ compare(i);
+ }
+
+ for (std::size_t i = 0; i < test_values.size(); ++i)
+ {
+ mem[i] = U(i);
+ }
+ x = indexes_from_0;
+ using M = typename V::mask_type;
+ const M alternating_mask = make_mask<M>({0, 1});
+ where(alternating_mask, x).copy_from(&mem[V::size()], stride_aligned);
+
+ const V indexes_from_size = gen({T(V::size())}, 1);
+ COMPARE(x == indexes_from_size, alternating_mask)
+ << "x: " << x << "\nindexes_from_size: " << indexes_from_size;
+ COMPARE(x == indexes_from_0, !alternating_mask);
+ where(alternating_mask, x).copy_from(&mem[1], element_aligned);
+
+ const V indexes_from_1 = gen({1, 2, 3, 4}, 4);
+ COMPARE(x == indexes_from_1, alternating_mask);
+ COMPARE(x == indexes_from_0, !alternating_mask);
+ where(!alternating_mask, x).copy_from(mem, overaligned);
+ COMPARE(x == indexes_from_0, !alternating_mask);
+ COMPARE(x == indexes_from_1, alternating_mask);
+
+ x = where(alternating_mask, V()).copy_from(&mem[V::size()], stride_aligned);
+ COMPARE(x == indexes_from_size, alternating_mask);
+ COMPARE(x == 0, !alternating_mask);
+
+ x = where(!alternating_mask, V()).copy_from(&mem[1], element_aligned);
+ COMPARE(x == indexes_from_1, !alternating_mask);
+ COMPARE(x == 0, alternating_mask);
+
+ // stores {{{2
+ auto&& init_mem = [&mem](U init) {
+ for (auto i = mem_size; i; --i)
+ {
+ mem[i - 1] = init;
+ }
+ };
+ init_mem(-1);
+ x = indexes_from_1;
+ x.copy_to(&mem[V::size()], stride_aligned);
+ std::size_t i = 0;
+ for (; i < V::size(); ++i)
+ {
+ COMPARE(mem[i], U(-1)) << "i: " << i;
+ }
+ for (; i < 2 * V::size(); ++i)
+ {
+ COMPARE(mem[i], U(i - V::size() + 1)) << "i: " << i;
+ }
+ for (; i < 3 * V::size(); ++i)
+ {
+ COMPARE(mem[i], U(-1)) << "i: " << i;
+ }
+
+ init_mem(-1);
+ x.copy_to(&mem[1], element_aligned);
+ COMPARE(mem[0], U(-1));
+ for (i = 1; i <= V::size(); ++i)
+ {
+ COMPARE(mem[i], U(i));
+ }
+ for (; i < 3 * V::size(); ++i)
+ {
+ COMPARE(mem[i], U(-1));
+ }
+
+ init_mem(-1);
+ x.copy_to(mem, vector_aligned);
+ for (i = 0; i < V::size(); ++i)
+ {
+ COMPARE(mem[i], U(i + 1));
+ }
+ for (; i < 3 * V::size(); ++i)
+ {
+ COMPARE(mem[i], U(-1));
+ }
+
+ init_mem(-1);
+ where(alternating_mask, indexes_from_0)
+ .copy_to(&mem[V::size()], stride_aligned);
+ for (i = 0; i < V::size() + 1; ++i)
+ {
+ COMPARE(mem[i], U(-1));
+ }
+ for (; i < 2 * V::size(); i += 2)
+ {
+ COMPARE(mem[i], U(i - V::size()));
+ }
+ for (i = V::size() + 2; i < 2 * V::size(); i += 2)
+ {
+ COMPARE(mem[i], U(-1));
+ }
+ for (; i < 3 * V::size(); ++i)
+ {
+ COMPARE(mem[i], U(-1));
+ }
+}
+
+template <typename V>
+void
+test()
+{
+ load_store<V, long double>();
+ load_store<V, double>();
+ load_store<V, float>();
+ load_store<V, long long>();
+ load_store<V, unsigned long long>();
+ load_store<V, unsigned long>();
+ load_store<V, long>();
+ load_store<V, int>();
+ load_store<V, unsigned int>();
+ load_store<V, short>();
+ load_store<V, unsigned short>();
+ load_store<V, char>();
+ load_store<V, signed char>();
+ load_store<V, unsigned char>();
+ load_store<V, char32_t>();
+ load_store<V, char16_t>();
+ load_store<V, wchar_t>();
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/logarithm.h b/libstdc++-v3/testsuite/experimental/simd/tests/logarithm.h
new file mode 100644
index 00000000000..159edbf34e3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/logarithm.h
@@ -0,0 +1,54 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/mathreference.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+ vir::test::setFuzzyness<float>(1);
+ vir::test::setFuzzyness<double>(1);
+
+ using limits = std::numeric_limits<typename V::value_type>;
+ test_values<V>({1,
+ 2,
+ 4,
+ 8,
+ 16,
+ 32,
+ 64,
+ 128,
+ 256,
+ 512,
+ 1024,
+ 2048,
+ 3,
+ 5,
+ 7,
+ 15,
+ 17,
+ 31,
+ 33,
+ 63,
+ 65,
+#ifdef __STDC_IEC_559__
+ limits::quiet_NaN(),
+ limits::infinity(),
+ -limits::infinity(),
+ limits::denorm_min(),
+ -limits::denorm_min(),
+ limits::min() / 3,
+ -limits::min() / 3,
+ -0.,
+#endif
+ +0.,
+ limits::min(),
+ limits::max(),
+ -limits::min(),
+ -limits::max()},
+ {10000, -limits::max() / 2, limits::max() / 2},
+ MAKE_TESTER(log), MAKE_TESTER(log10), MAKE_TESTER(log1p),
+ MAKE_TESTER(log2), MAKE_TESTER(logb));
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/mask_broadcast.h b/libstdc++-v3/testsuite/experimental/simd/tests/mask_broadcast.h
new file mode 100644
index 00000000000..dc9af0a1ac4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/mask_broadcast.h
@@ -0,0 +1,50 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+template <typename V>
+void
+test()
+{
+ using M = typename V::mask_type;
+ static_assert(std::is_convertible<typename M::reference, bool>::value,
+ "A smart_reference<simd_mask> must be convertible to bool.");
+ static_assert(
+ std::is_same<bool, decltype(std::declval<const typename M::reference&>()
+ == true)>::value,
+ "A smart_reference<simd_mask> must be comparable against bool.");
+ static_assert(
+ vir::test::sfinae_is_callable<typename M::reference&&, bool>(
+ [](auto&& a, auto&& b) -> decltype(std::declval<decltype(a)>()
+ == std::declval<decltype(b)>()) {
+ return {};
+ }),
+ "A smart_reference<simd_mask> must be comparable against bool.");
+ VERIFY(std::experimental::is_simd_mask_v<M>);
+
+ {
+ M x; // uninitialized
+ x = M{}; // default broadcasts 0
+ COMPARE(x, M(false));
+ COMPARE(x, M());
+ COMPARE(x, M{});
+ x = M(); // default broadcasts 0
+ COMPARE(x, M(false));
+ COMPARE(x, M());
+ COMPARE(x, M{});
+ x = x;
+ for (std::size_t i = 0; i < M::size(); ++i)
+ {
+ COMPARE(x[i], false);
+ }
+ }
+
+ M x(true);
+ M y(false);
+ for (std::size_t i = 0; i < M::size(); ++i)
+ {
+ COMPARE(x[i], true);
+ COMPARE(y[i], false);
+ }
+ y = M(true);
+ COMPARE(x, y);
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/mask_conversions.h b/libstdc++-v3/testsuite/experimental/simd/tests/mask_conversions.h
new file mode 100644
index 00000000000..cc2cdcac7af
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/mask_conversions.h
@@ -0,0 +1,94 @@
+#include "bits/verify.h"
+
+namespace stdx = std::experimental;
+
+template <typename From, typename To>
+void
+conversions()
+{
+ using ToV = typename To::simd_type;
+
+ using stdx::simd_cast;
+ using stdx::static_simd_cast;
+ using stdx::__proposed::resizing_simd_cast;
+
+ auto x = resizing_simd_cast<To>(From());
+ COMPARE(typeid(x), typeid(To));
+ COMPARE(x, To());
+
+ x = resizing_simd_cast<To>(From(true));
+ const To ref = ToV([](auto i) { return i; }) < int(From::size());
+ COMPARE(x, ref) << "converted from: " << From(true);
+
+ const ullong all_bits = ~ullong() >> (64 - From::size());
+ for (ullong bit_pos = 1; bit_pos /*until overflow*/; bit_pos *= 2)
+ {
+ for (ullong bits : {bit_pos & all_bits, ~bit_pos & all_bits})
+ {
+ const auto from = From::__from_bitset(bits);
+ const auto to = resizing_simd_cast<To>(from);
+ COMPARE(to, To::__from_bitset(bits))
+ << "\nfrom: " << from << "\nbits: " << std::hex << bits << std::dec;
+ for (std::size_t i = 0; i < To::size(); ++i)
+ {
+ COMPARE(to[i], (bits >> i) & 1)
+ << "\nfrom: " << from << "\nto: " << to
+ << "\nbits: " << std::hex << bits << std::dec << "\ni: " << i;
+ }
+ }
+ }
+}
+
+template <typename T, typename V, typename = void> struct rebind_or_max_fixed
+{
+ using type = stdx::rebind_simd_t<
+ T, stdx::resize_simd_t<stdx::simd_abi::max_fixed_size<T>, V>>;
+};
+template <typename T, typename V>
+struct rebind_or_max_fixed<T, V, std::void_t<stdx::rebind_simd_t<T, V>>>
+{
+ using type = stdx::rebind_simd_t<T, V>;
+};
+
+template <typename From, typename To>
+void
+apply_abis()
+{
+ using M0 = typename rebind_or_max_fixed<To, From>::type;
+ using M1 = stdx::native_simd_mask<To>;
+ using M2 = stdx::simd_mask<To>;
+ using M3 = stdx::simd_mask<To, stdx::simd_abi::scalar>;
+
+ using std::is_same_v;
+ conversions<From, M0>();
+ if constexpr (!is_same_v<M1, M0>)
+ conversions<From, M1>();
+ if constexpr (!is_same_v<M2, M0> && !is_same_v<M2, M1>)
+ conversions<From, M2>();
+ if constexpr (!is_same_v<M3, M0> && !is_same_v<M3, M1> && !is_same_v<M3, M2>)
+ conversions<From, M3>();
+}
+
+template <typename V>
+void
+test()
+{
+ using M = typename V::mask_type;
+ apply_abis<M, ldouble>();
+ apply_abis<M, double>();
+ apply_abis<M, float>();
+ apply_abis<M, ullong>();
+ apply_abis<M, llong>();
+ apply_abis<M, ulong>();
+ apply_abis<M, long>();
+ apply_abis<M, uint>();
+ apply_abis<M, int>();
+ apply_abis<M, ushort>();
+ apply_abis<M, short>();
+ apply_abis<M, uchar>();
+ apply_abis<M, schar>();
+ apply_abis<M, char>();
+ apply_abis<M, wchar_t>();
+ apply_abis<M, char16_t>();
+ apply_abis<M, char32_t>();
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/mask_implicit_cvt.h b/libstdc++-v3/testsuite/experimental/simd/tests/mask_implicit_cvt.h
new file mode 100644
index 00000000000..56149ba343e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/mask_implicit_cvt.h
@@ -0,0 +1,84 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+template <class M, class M2>
+constexpr bool assign_should_work
+ = std::is_same<M, M2>::value
+ || (std::is_same<typename M::abi_type,
+ std::experimental::simd_abi::fixed_size<M::size()>>::value
+ && std::is_same<typename M::abi_type, typename M2::abi_type>::value);
+template <class M, class M2>
+constexpr bool assign_should_not_work = !assign_should_work<M, M2>;
+
+template <class L, class R>
+std::enable_if_t<assign_should_work<L, R>>
+implicit_conversions_test()
+{
+ L x = R(true);
+ COMPARE(x, L(true));
+ x = R(false);
+ COMPARE(x, L(false));
+ R y(false);
+ y[0] = true;
+ x = y;
+ L ref(false);
+ ref[0] = true;
+ COMPARE(x, ref);
+}
+
+template <class L, class R>
+std::enable_if_t<assign_should_not_work<L, R>>
+implicit_conversions_test()
+{
+ VERIFY((is_substitution_failure<L&, R, assignment>) );
+}
+
+template <typename V>
+void
+test()
+{
+ using M = typename V::mask_type;
+ using std::experimental::fixed_size_simd_mask;
+ using std::experimental::native_simd_mask;
+ using std::experimental::simd_mask;
+
+ implicit_conversions_test<M, simd_mask<ldouble>>();
+ implicit_conversions_test<M, simd_mask<double>>();
+ implicit_conversions_test<M, simd_mask<float>>();
+ implicit_conversions_test<M, simd_mask<ullong>>();
+ implicit_conversions_test<M, simd_mask<llong>>();
+ implicit_conversions_test<M, simd_mask<ulong>>();
+ implicit_conversions_test<M, simd_mask<long>>();
+ implicit_conversions_test<M, simd_mask<uint>>();
+ implicit_conversions_test<M, simd_mask<int>>();
+ implicit_conversions_test<M, simd_mask<ushort>>();
+ implicit_conversions_test<M, simd_mask<short>>();
+ implicit_conversions_test<M, simd_mask<uchar>>();
+ implicit_conversions_test<M, simd_mask<schar>>();
+ implicit_conversions_test<M, native_simd_mask<ldouble>>();
+ implicit_conversions_test<M, native_simd_mask<double>>();
+ implicit_conversions_test<M, native_simd_mask<float>>();
+ implicit_conversions_test<M, native_simd_mask<ullong>>();
+ implicit_conversions_test<M, native_simd_mask<llong>>();
+ implicit_conversions_test<M, native_simd_mask<ulong>>();
+ implicit_conversions_test<M, native_simd_mask<long>>();
+ implicit_conversions_test<M, native_simd_mask<uint>>();
+ implicit_conversions_test<M, native_simd_mask<int>>();
+ implicit_conversions_test<M, native_simd_mask<ushort>>();
+ implicit_conversions_test<M, native_simd_mask<short>>();
+ implicit_conversions_test<M, native_simd_mask<uchar>>();
+ implicit_conversions_test<M, native_simd_mask<schar>>();
+ implicit_conversions_test<M, fixed_size_simd_mask<ldouble, M::size()>>();
+ implicit_conversions_test<M, fixed_size_simd_mask<double, M::size()>>();
+ implicit_conversions_test<M, fixed_size_simd_mask<float, M::size()>>();
+ implicit_conversions_test<M, fixed_size_simd_mask<ullong, M::size()>>();
+ implicit_conversions_test<M, fixed_size_simd_mask<llong, M::size()>>();
+ implicit_conversions_test<M, fixed_size_simd_mask<ulong, M::size()>>();
+ implicit_conversions_test<M, fixed_size_simd_mask<long, M::size()>>();
+ implicit_conversions_test<M, fixed_size_simd_mask<uint, M::size()>>();
+ implicit_conversions_test<M, fixed_size_simd_mask<int, M::size()>>();
+ implicit_conversions_test<M, fixed_size_simd_mask<ushort, M::size()>>();
+ implicit_conversions_test<M, fixed_size_simd_mask<short, M::size()>>();
+ implicit_conversions_test<M, fixed_size_simd_mask<uchar, M::size()>>();
+ implicit_conversions_test<M, fixed_size_simd_mask<schar, M::size()>>();
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/mask_loadstore.h b/libstdc++-v3/testsuite/experimental/simd/tests/mask_loadstore.h
new file mode 100644
index 00000000000..933d30ed7a6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/mask_loadstore.h
@@ -0,0 +1,144 @@
+#include "bits/verify.h"
+
+// simd_mask generator functions {{{1
+template <class M>
+M
+make_mask(const std::initializer_list<bool>& init)
+{
+ std::size_t i = 0;
+ M r = {};
+ for (;;)
+ {
+ for (bool x : init)
+ {
+ r[i] = x;
+ if (++i == M::size())
+ {
+ return r;
+ }
+ }
+ }
+}
+
+template <class M>
+M
+make_alternating_mask()
+{
+ return make_mask<M>({false, true});
+}
+
+template <typename V>
+void
+test()
+{
+ using M = typename V::mask_type;
+ // loads {{{2
+ constexpr size_t alignment = 2 * std::experimental::memory_alignment_v<M>;
+ alignas(alignment) bool mem[3 * M::size()];
+ std::memset(mem, 0, sizeof(mem));
+ for (std::size_t i = 1; i < sizeof(mem) / sizeof(*mem); i += 2)
+ {
+ COMPARE(mem[i - 1], false);
+ mem[i] = true;
+ }
+ using std::experimental::element_aligned;
+ using std::experimental::vector_aligned;
+ constexpr size_t stride_alignment
+ = M::size() & 1
+ ? 1
+ : M::size() & 2
+ ? 2
+ : M::size() & 4
+ ? 4
+ : M::size() & 8
+ ? 8
+ : M::size() & 16
+ ? 16
+ : M::size() & 32
+ ? 32
+ : M::size() & 64
+ ? 64
+ : M::size() & 128 ? 128
+ : M::size() & 256 ? 256 : 512;
+ using stride_aligned_t = std::conditional_t<
+ M::size() == stride_alignment, decltype(vector_aligned),
+ std::experimental::overaligned_tag<stride_alignment * sizeof(bool)>>;
+ constexpr stride_aligned_t stride_aligned = {};
+ constexpr auto overaligned = std::experimental::overaligned<alignment>;
+
+ const M alternating_mask = make_alternating_mask<M>();
+
+ M x(&mem[M::size()], stride_aligned);
+ COMPARE(x, M::size() % 2 == 1 ? !alternating_mask : alternating_mask)
+ << x.__to_bitset()
+ << ", alternating_mask: " << alternating_mask.__to_bitset();
+ x = {&mem[1], element_aligned};
+ COMPARE(x, !alternating_mask);
+ x = M{mem, overaligned};
+ COMPARE(x, alternating_mask);
+
+ x.copy_from(&mem[M::size()], stride_aligned);
+ COMPARE(x, M::size() % 2 == 1 ? !alternating_mask : alternating_mask);
+ x.copy_from(&mem[1], element_aligned);
+ COMPARE(x, !alternating_mask);
+ x.copy_from(mem, vector_aligned);
+ COMPARE(x, alternating_mask);
+
+ x = !alternating_mask;
+ where(alternating_mask, x).copy_from(&mem[M::size()], stride_aligned);
+ COMPARE(x, M::size() % 2 == 1 ? !alternating_mask : M{true});
+ x = M(true); // 1111
+ where(alternating_mask, x).copy_from(&mem[1], element_aligned); // load .0.0
+ COMPARE(x, !alternating_mask); // 1010
+ where(alternating_mask, x).copy_from(mem, overaligned); // load .1.1
+ COMPARE(x, M{true}); // 1111
+
+ // stores {{{2
+ memset(mem, 0, sizeof(mem));
+ x = M(true);
+ x.copy_to(&mem[M::size()], stride_aligned);
+ std::size_t i = 0;
+ for (; i < M::size(); ++i)
+ {
+ COMPARE(mem[i], false);
+ }
+ for (; i < 2 * M::size(); ++i)
+ {
+ COMPARE(mem[i], true) << "i: " << i << ", x: " << x;
+ }
+ for (; i < 3 * M::size(); ++i)
+ {
+ COMPARE(mem[i], false);
+ }
+ memset(mem, 0, sizeof(mem));
+ x.copy_to(&mem[1], element_aligned);
+ COMPARE(mem[0], false);
+ for (i = 1; i <= M::size(); ++i)
+ {
+ COMPARE(mem[i], true);
+ }
+ for (; i < 3 * M::size(); ++i)
+ {
+ COMPARE(mem[i], false);
+ }
+ memset(mem, 0, sizeof(mem));
+ alternating_mask.copy_to(mem, overaligned);
+ for (i = 0; i < M::size(); ++i)
+ {
+ COMPARE(mem[i], (i & 1) == 1);
+ }
+ for (; i < 3 * M::size(); ++i)
+ {
+ COMPARE(mem[i], false);
+ }
+ x.copy_to(mem, vector_aligned);
+ where(alternating_mask, !x).copy_to(mem, overaligned);
+ for (i = 0; i < M::size(); ++i)
+ {
+ COMPARE(mem[i], i % 2 == 0);
+ }
+ for (; i < 3 * M::size(); ++i)
+ {
+ COMPARE(mem[i], false);
+ }
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/mask_operator_cvt.h b/libstdc++-v3/testsuite/experimental/simd/tests/mask_operator_cvt.h
new file mode 100644
index 00000000000..ebfce1ed569
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/mask_operator_cvt.h
@@ -0,0 +1,94 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+using schar = signed char;
+using uchar = unsigned char;
+using ushort = unsigned short;
+using uint = unsigned int;
+using ulong = unsigned long;
+using llong = long long;
+using ullong = unsigned long long;
+using ldouble = long double;
+using wchar = wchar_t;
+using char16 = char16_t;
+using char32 = char32_t;
+
+template <typename M0, typename M1>
+constexpr bool
+bit_and_is_illformed()
+{
+ return is_substitution_failure<M0, M1, std::bit_and<>>;
+}
+
+template <typename M0, typename M1>
+void
+test_binary_op_cvt()
+{
+ COMPARE((bit_and_is_illformed<M0, M1>()), !(std::is_same_v<M0, M1>) );
+}
+
+template <typename V>
+void
+test()
+{
+ using M = typename V::mask_type;
+ // binary ops without conversions work
+ COMPARE(typeid(M() & M()), typeid(M));
+
+ // nothing else works: no implicit conversion exists or the call is ambiguous
+ using std::experimental::fixed_size_simd_mask;
+ using std::experimental::native_simd_mask;
+ using std::experimental::simd_mask;
+ test_binary_op_cvt<M, bool>();
+
+ test_binary_op_cvt<M, simd_mask<ldouble>>();
+ test_binary_op_cvt<M, simd_mask<double>>();
+ test_binary_op_cvt<M, simd_mask<float>>();
+ test_binary_op_cvt<M, simd_mask<ullong>>();
+ test_binary_op_cvt<M, simd_mask<llong>>();
+ test_binary_op_cvt<M, simd_mask<ulong>>();
+ test_binary_op_cvt<M, simd_mask<long>>();
+ test_binary_op_cvt<M, simd_mask<uint>>();
+ test_binary_op_cvt<M, simd_mask<int>>();
+ test_binary_op_cvt<M, simd_mask<ushort>>();
+ test_binary_op_cvt<M, simd_mask<short>>();
+ test_binary_op_cvt<M, simd_mask<uchar>>();
+ test_binary_op_cvt<M, simd_mask<schar>>();
+ test_binary_op_cvt<M, simd_mask<wchar>>();
+ test_binary_op_cvt<M, simd_mask<char16>>();
+ test_binary_op_cvt<M, simd_mask<char32>>();
+
+ test_binary_op_cvt<M, native_simd_mask<ldouble>>();
+ test_binary_op_cvt<M, native_simd_mask<double>>();
+ test_binary_op_cvt<M, native_simd_mask<float>>();
+ test_binary_op_cvt<M, native_simd_mask<ullong>>();
+ test_binary_op_cvt<M, native_simd_mask<llong>>();
+ test_binary_op_cvt<M, native_simd_mask<ulong>>();
+ test_binary_op_cvt<M, native_simd_mask<long>>();
+ test_binary_op_cvt<M, native_simd_mask<uint>>();
+ test_binary_op_cvt<M, native_simd_mask<int>>();
+ test_binary_op_cvt<M, native_simd_mask<ushort>>();
+ test_binary_op_cvt<M, native_simd_mask<short>>();
+ test_binary_op_cvt<M, native_simd_mask<uchar>>();
+ test_binary_op_cvt<M, native_simd_mask<schar>>();
+ test_binary_op_cvt<M, native_simd_mask<wchar>>();
+ test_binary_op_cvt<M, native_simd_mask<char16>>();
+ test_binary_op_cvt<M, native_simd_mask<char32>>();
+
+ test_binary_op_cvt<M, fixed_size_simd_mask<ldouble, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<double, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<float, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<ullong, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<llong, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<ulong, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<long, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<uint, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<int, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<ushort, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<short, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<uchar, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<schar, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<wchar, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<char16, 2>>();
+ test_binary_op_cvt<M, fixed_size_simd_mask<char32, 2>>();
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/mask_operators.h b/libstdc++-v3/testsuite/experimental/simd/tests/mask_operators.h
new file mode 100644
index 00000000000..5156bc09021
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/mask_operators.h
@@ -0,0 +1,40 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+template <typename V>
+void
+test()
+{
+ using M = typename V::mask_type;
+ { // compares{{{2
+ M x(true), y(false);
+ VERIFY(all_of(x == x));
+ VERIFY(all_of(x != y));
+ VERIFY(all_of(y != x));
+ VERIFY(!all_of(x != x));
+ VERIFY(!all_of(x == y));
+ VERIFY(!all_of(y == x));
+ }
+ { // subscripting{{{2
+ M x(true);
+ for (std::size_t i = 0; i < M::size(); ++i)
+ {
+ COMPARE(x[i], true) << "\nx: " << x << ", i: " << i;
+ x[i] = !x[i];
+ }
+ COMPARE(x, M{false});
+ for (std::size_t i = 0; i < M::size(); ++i)
+ {
+ COMPARE(x[i], false) << "\nx: " << x << ", i: " << i;
+ x[i] = !x[i];
+ }
+ COMPARE(x, M{true});
+ }
+ { // negation{{{2
+ M x(false);
+ M y = !x;
+ COMPARE(y, M{true});
+ COMPARE(!y, x);
+ }
+}
+
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/mask_reductions.h b/libstdc++-v3/testsuite/experimental/simd/tests/mask_reductions.h
new file mode 100644
index 00000000000..33011a2bf4c
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/mask_reductions.h
@@ -0,0 +1,211 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+// simd_mask generator functions {{{1
+template <class M>
+M
+make_mask(const std::initializer_list<bool>& init)
+{
+ std::size_t i = 0;
+ M r = {};
+ for (;;)
+ {
+ for (bool x : init)
+ {
+ r[i] = x;
+ if (++i == M::size())
+ {
+ return r;
+ }
+ }
+ }
+}
+
+template <class M>
+M
+make_alternating_mask()
+{
+ return make_mask<M>({false, true});
+}
+
+template <typename V>
+void
+test()
+{
+ using M = typename V::mask_type;
+ const M alternating_mask = make_alternating_mask<M>();
+ COMPARE(alternating_mask[0], false); // the checks below assume this
+ auto&& gen = make_mask<M>;
+
+ // all_of
+ VERIFY(all_of(M{true}));
+ VERIFY(!all_of(alternating_mask));
+ VERIFY(!all_of(M{false}));
+ using std::experimental::all_of;
+ VERIFY(all_of(true));
+ VERIFY(!all_of(false));
+ VERIFY(sfinae_is_callable<bool>(
+ [](auto x) -> decltype(std::experimental::all_of(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<int>(
+ [](auto x) -> decltype(std::experimental::all_of(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<float>(
+ [](auto x) -> decltype(std::experimental::all_of(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<char>(
+ [](auto x) -> decltype(std::experimental::all_of(x)) { return {}; }));
+
+ // any_of
+ VERIFY(any_of(M{true}));
+ COMPARE(any_of(alternating_mask), M::size() > 1);
+ VERIFY(!any_of(M{false}));
+ using std::experimental::any_of;
+ VERIFY(any_of(true));
+ VERIFY(!any_of(false));
+ VERIFY(sfinae_is_callable<bool>(
+ [](auto x) -> decltype(std::experimental::any_of(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<int>(
+ [](auto x) -> decltype(std::experimental::any_of(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<float>(
+ [](auto x) -> decltype(std::experimental::any_of(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<char>(
+ [](auto x) -> decltype(std::experimental::any_of(x)) { return {}; }));
+
+ // none_of
+ VERIFY(!none_of(M{true}));
+ COMPARE(none_of(alternating_mask), M::size() == 1);
+ VERIFY(none_of(M{false}));
+ using std::experimental::none_of;
+ VERIFY(!none_of(true));
+ VERIFY(none_of(false));
+ VERIFY(sfinae_is_callable<bool>(
+ [](auto x) -> decltype(std::experimental::none_of(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<int>(
+ [](auto x) -> decltype(std::experimental::none_of(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<float>(
+ [](auto x) -> decltype(std::experimental::none_of(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<char>(
+ [](auto x) -> decltype(std::experimental::none_of(x)) { return {}; }));
+
+ // some_of
+ VERIFY(!some_of(M{true}));
+ VERIFY(!some_of(M{false}));
+ if (M::size() > 1)
+ {
+ VERIFY(some_of(gen({true, false})));
+ VERIFY(some_of(gen({false, true})));
+ if (M::size() > 3)
+ {
+ VERIFY(some_of(gen({0, 0, 0, 1})));
+ }
+ }
+ using std::experimental::some_of;
+ VERIFY(!some_of(true));
+ VERIFY(!some_of(false));
+ VERIFY(sfinae_is_callable<bool>(
+ [](auto x) -> decltype(std::experimental::some_of(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<int>(
+ [](auto x) -> decltype(std::experimental::some_of(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<float>(
+ [](auto x) -> decltype(std::experimental::some_of(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<char>(
+ [](auto x) -> decltype(std::experimental::some_of(x)) { return {}; }));
+
+ // popcount
+ COMPARE(popcount(M{true}), int(M::size()));
+ COMPARE(popcount(alternating_mask), int(M::size()) / 2);
+ COMPARE(popcount(M{false}), 0);
+ COMPARE(popcount(gen({0, 0, 1})), int(M::size()) / 3);
+ COMPARE(popcount(gen({0, 0, 0, 1})), int(M::size()) / 4);
+ COMPARE(popcount(gen({0, 0, 0, 0, 1})), int(M::size()) / 5);
+ COMPARE(std::experimental::popcount(true), 1);
+ COMPARE(std::experimental::popcount(false), 0);
+ VERIFY(sfinae_is_callable<bool>(
+ [](auto x) -> decltype(std::experimental::popcount(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<int>(
+ [](auto x) -> decltype(std::experimental::popcount(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<float>(
+ [](auto x) -> decltype(std::experimental::popcount(x)) { return {}; }));
+ VERIFY(!sfinae_is_callable<char>(
+ [](auto x) -> decltype(std::experimental::popcount(x)) { return {}; }));
+
+ // find_first_set
+ {
+ M x(false);
+ for (int i = int(M::size() / 2 - 1); i >= 0; --i)
+ {
+ x[i] = true;
+ COMPARE(find_first_set(x), i) << x;
+ }
+ x = M(false);
+ for (int i = int(M::size() - 1); i >= 0; --i)
+ {
+ x[i] = true;
+ COMPARE(find_first_set(x), i) << x;
+ }
+ }
+ COMPARE(find_first_set(M{true}), 0);
+ if (M::size() > 1)
+ {
+ COMPARE(find_first_set(gen({0, 1})), 1);
+ }
+ if (M::size() > 2)
+ {
+ COMPARE(find_first_set(gen({0, 0, 1})), 2);
+ }
+ COMPARE(std::experimental::find_first_set(true), 0);
+ VERIFY(sfinae_is_callable<bool>(
+ [](auto x) -> decltype(std::experimental::find_first_set(x)) {
+ return {};
+ }));
+ VERIFY(!sfinae_is_callable<int>(
+ [](auto x) -> decltype(std::experimental::find_first_set(x)) {
+ return {};
+ }));
+ VERIFY(!sfinae_is_callable<float>(
+ [](auto x) -> decltype(std::experimental::find_first_set(x)) {
+ return {};
+ }));
+ VERIFY(!sfinae_is_callable<char>(
+ [](auto x) -> decltype(std::experimental::find_first_set(x)) {
+ return {};
+ }));
+
+ // find_last_set
+ {
+ M x(false);
+ for (int i = 0; i < int(M::size()); ++i)
+ {
+ x[i] = true;
+ COMPARE(find_last_set(x), i) << x;
+ }
+ }
+ COMPARE(find_last_set(M{true}), int(M::size()) - 1);
+ if (M::size() > 1)
+ {
+ COMPARE(find_last_set(gen({1, 0})),
+ int(M::size()) - 2 + int(M::size() & 1));
+ }
+ if (M::size() > 3 && (M::size() & 3) == 0)
+ {
+ COMPARE(find_last_set(gen({1, 0, 0, 0})),
+ int(M::size()) - 4 - int(M::size() & 3));
+ }
+ COMPARE(std::experimental::find_last_set(true), 0);
+ VERIFY(sfinae_is_callable<bool>(
+ [](auto x) -> decltype(std::experimental::find_last_set(x)) {
+ return {};
+ }));
+ VERIFY(!sfinae_is_callable<int>(
+ [](auto x) -> decltype(std::experimental::find_last_set(x)) {
+ return {};
+ }));
+ VERIFY(!sfinae_is_callable<float>(
+ [](auto x) -> decltype(std::experimental::find_last_set(x)) {
+ return {};
+ }));
+ VERIFY(!sfinae_is_callable<char>(
+ [](auto x) -> decltype(std::experimental::find_last_set(x)) {
+ return {};
+ }));
+}
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/math_1arg.h b/libstdc++-v3/testsuite/experimental/simd/tests/math_1arg.h
new file mode 100644
index 00000000000..b0eec615a49
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/math_1arg.h
@@ -0,0 +1,57 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+ vir::test::setFuzzyness<float>(0);
+ vir::test::setFuzzyness<double>(0);
+
+ using limits = std::numeric_limits<typename V::value_type>;
+ test_values<V>({+0.,
+ 0.5,
+ -0.5,
+ 1.5,
+ -1.5,
+ 2.5,
+ -2.5,
+ 0x1.fffffffffffffp52,
+ -0x1.fffffffffffffp52,
+ 0x1.ffffffffffffep52,
+ -0x1.ffffffffffffep52,
+ 0x1.ffffffffffffdp52,
+ -0x1.ffffffffffffdp52,
+ 0x1.fffffep21,
+ -0x1.fffffep21,
+ 0x1.fffffcp21,
+ -0x1.fffffcp21,
+ 0x1.fffffep22,
+ -0x1.fffffep22,
+ 0x1.fffffcp22,
+ -0x1.fffffcp22,
+ 0x1.fffffep23,
+ -0x1.fffffep23,
+ 0x1.fffffcp23,
+ -0x1.fffffcp23,
+ 0x1.8p23,
+ -0x1.8p23,
+#ifdef __STDC_IEC_559__
+ limits::infinity(),
+ -limits::infinity(),
+ -0.,
+ limits::quiet_NaN(),
+ limits::denorm_min(),
+ limits::min() / 3,
+#endif
+ limits::min(),
+ limits::max()},
+ {10000, -limits::max() / 2, limits::max() / 2},
+ MAKE_TESTER(sqrt), MAKE_TESTER(erf), MAKE_TESTER(erfc),
+ MAKE_TESTER(tgamma), MAKE_TESTER(lgamma), MAKE_TESTER(ceil),
+ MAKE_TESTER(floor), MAKE_TESTER(trunc), MAKE_TESTER(round),
+ MAKE_TESTER(lround), MAKE_TESTER(llround),
+ MAKE_TESTER(nearbyint), MAKE_TESTER(rint), MAKE_TESTER(lrint),
+ MAKE_TESTER(llrint), MAKE_TESTER(ilogb));
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/math_2arg.h b/libstdc++-v3/testsuite/experimental/simd/tests/math_2arg.h
new file mode 100644
index 00000000000..ae7cf257ec9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/math_2arg.h
@@ -0,0 +1,53 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+ using T = typename V::value_type;
+ using limits = std::numeric_limits<T>;
+
+ vir::test::setFuzzyness<float>(1);
+ vir::test::setFuzzyness<double>(1);
+ vir::test::setFuzzyness<long double>(1);
+ test_values_2arg<V>(
+ {
+#ifdef __STDC_IEC_559__
+ limits::quiet_NaN(), limits::infinity(), -limits::infinity(), -0.,
+ limits::denorm_min(), limits::min() / 3,
+#endif
+ +0., limits::min(), limits::max()},
+ {100000, -limits::max() / 2, limits::max() / 2}, MAKE_TESTER(hypot));
+ COMPARE(hypot(V(limits::max()), V(limits::max())), V(limits::infinity()));
+ COMPARE(hypot(V(limits::min()), V(limits::min())),
+ V(limits::min() * std::sqrt(T(2))));
+ VERIFY((sfinae_is_callable<V, V>(
+ [](auto a, auto b) -> decltype(hypot(a, b)) { return {}; })));
+ VERIFY((sfinae_is_callable<typename V::value_type, V>(
+ [](auto a, auto b) -> decltype(hypot(a, b)) { return {}; })));
+ VERIFY((sfinae_is_callable<V, typename V::value_type>(
+ [](auto a, auto b) -> decltype(hypot(a, b)) { return {}; })));
+
+ vir::test::setFuzzyness<float>(0);
+ vir::test::setFuzzyness<double>(0);
+ vir::test::setFuzzyness<long double>(0);
+ test_values_2arg<V>(
+ {
+#ifdef __STDC_IEC_559__
+ limits::quiet_NaN(), limits::infinity(), -limits::infinity(),
+ limits::denorm_min(), limits::min() / 3, -0.,
+#endif
+ +0., limits::min(), limits::max()},
+ {10000, -limits::max() / 2, limits::max() / 2}, MAKE_TESTER(pow),
+ MAKE_TESTER(fmod), MAKE_TESTER(remainder), MAKE_TESTER_NOFPEXCEPT(copysign),
+ MAKE_TESTER(nextafter), // MAKE_TESTER(nexttoward),
+ MAKE_TESTER(fdim), MAKE_TESTER(fmax), MAKE_TESTER(fmin),
+ MAKE_TESTER_NOFPEXCEPT(isgreater), MAKE_TESTER_NOFPEXCEPT(isgreaterequal),
+ MAKE_TESTER_NOFPEXCEPT(isless), MAKE_TESTER_NOFPEXCEPT(islessequal),
+ MAKE_TESTER_NOFPEXCEPT(islessgreater), MAKE_TESTER_NOFPEXCEPT(isunordered));
+}
+
+// vim: ts=8 et sw=2 sts=2
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/operator_cvt.h b/libstdc++-v3/testsuite/experimental/simd/tests/operator_cvt.h
new file mode 100644
index 00000000000..82bb1ee5981
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/operator_cvt.h
@@ -0,0 +1,1064 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+
+// char-sized type whose signedness is the opposite of plain char
+using xchar = std::conditional_t<std::is_unsigned_v<char>, schar, uchar>;
+
+// vT {{{
+using vschar = std::experimental::native_simd<schar>;
+using vuchar = std::experimental::native_simd<uchar>;
+using vshort = std::experimental::native_simd<short>;
+using vushort = std::experimental::native_simd<ushort>;
+using vint = std::experimental::native_simd<int>;
+using vuint = std::experimental::native_simd<uint>;
+using vlong = std::experimental::native_simd<long>;
+using vulong = std::experimental::native_simd<ulong>;
+using vllong = std::experimental::native_simd<llong>;
+using vullong = std::experimental::native_simd<ullong>;
+using vfloat = std::experimental::native_simd<float>;
+using vdouble = std::experimental::native_simd<double>;
+using vldouble = std::experimental::native_simd<long double>;
+using vchar = std::experimental::native_simd<char>;
+using vxchar = std::experimental::native_simd<xchar>;
+// }}}
+// viN/vfN {{{
+template <typename T>
+using vi8 = std::experimental::fixed_size_simd<T, vschar::size()>;
+template <typename T>
+using vi16 = std::experimental::fixed_size_simd<T, vshort::size()>;
+template <typename T>
+using vf32 = std::experimental::fixed_size_simd<T, vfloat::size()>;
+template <typename T>
+using vi32 = std::experimental::fixed_size_simd<T, vint::size()>;
+template <typename T>
+using vf64 = std::experimental::fixed_size_simd<T, vdouble::size()>;
+template <typename T>
+using vi64 = std::experimental::fixed_size_simd<T, vllong::size()>;
+template <typename T>
+using vl = typename std::conditional<sizeof(long) == sizeof(llong), vi64<T>,
+ vi32<T>>::type;
+// }}}
+
+template <class A, class B, class Expected = A>
+void
+binary_op_return_type()
+{
+ using namespace vir::test;
+ static_assert(std::is_same<A, Expected>::value, "");
+ using AC = std::add_const_t<A>;
+ using BC = std::add_const_t<B>;
+ COMPARE(typeid(A() + B()), typeid(Expected));
+ COMPARE(typeid(B() + A()), typeid(Expected));
+ COMPARE(typeid(AC() + BC()), typeid(Expected));
+ COMPARE(typeid(BC() + AC()), typeid(Expected));
+}
+
+template <typename V>
+void
+test()
+{
+ using T = typename V::value_type;
+ namespace simd_abi = std::experimental::simd_abi;
+ binary_op_return_type<V, V, V>();
+ binary_op_return_type<V, T, V>();
+ binary_op_return_type<V, int, V>();
+
+ if constexpr (std::is_same_v<V, vfloat>)
+ { //{{{2
+ binary_op_return_type<vfloat, schar>();
+ binary_op_return_type<vfloat, uchar>();
+ binary_op_return_type<vfloat, short>();
+ binary_op_return_type<vfloat, ushort>();
+
+ binary_op_return_type<vf32<float>, schar>();
+ binary_op_return_type<vf32<float>, uchar>();
+ binary_op_return_type<vf32<float>, short>();
+ binary_op_return_type<vf32<float>, ushort>();
+ binary_op_return_type<vf32<float>, int>();
+ binary_op_return_type<vf32<float>, float>();
+
+ binary_op_return_type<vf32<float>, vf32<schar>>();
+ binary_op_return_type<vf32<float>, vf32<uchar>>();
+ binary_op_return_type<vf32<float>, vf32<short>>();
+ binary_op_return_type<vf32<float>, vf32<ushort>>();
+ binary_op_return_type<vf32<float>, vf32<float>>();
+
+ VERIFY((is_substitution_failure<vfloat, uint>) );
+ VERIFY((is_substitution_failure<vfloat, long>) );
+ VERIFY((is_substitution_failure<vfloat, ulong>) );
+ VERIFY((is_substitution_failure<vfloat, llong>) );
+ VERIFY((is_substitution_failure<vfloat, ullong>) );
+ VERIFY((is_substitution_failure<vfloat, double>) );
+ VERIFY((is_substitution_failure<vfloat, vf32<schar>>) );
+ VERIFY((is_substitution_failure<vfloat, vf32<uchar>>) );
+ VERIFY((is_substitution_failure<vfloat, vf32<short>>) );
+ VERIFY((is_substitution_failure<vfloat, vf32<ushort>>) );
+ VERIFY((is_substitution_failure<vfloat, vf32<int>>) );
+ VERIFY((is_substitution_failure<vfloat, vf32<uint>>) );
+ VERIFY((is_substitution_failure<vfloat, vf32<long>>) );
+ VERIFY((is_substitution_failure<vfloat, vf32<ulong>>) );
+ VERIFY((is_substitution_failure<vfloat, vf32<llong>>) );
+ VERIFY((is_substitution_failure<vfloat, vf32<ullong>>) );
+ VERIFY((is_substitution_failure<vfloat, vf32<float>>) );
+
+ VERIFY((is_substitution_failure<vf32<float>, vfloat>) );
+ VERIFY((is_substitution_failure<vf32<float>, uint>) );
+ VERIFY((is_substitution_failure<vf32<float>, long>) );
+ VERIFY((is_substitution_failure<vf32<float>, ulong>) );
+ VERIFY((is_substitution_failure<vf32<float>, llong>) );
+ VERIFY((is_substitution_failure<vf32<float>, ullong>) );
+ VERIFY((is_substitution_failure<vf32<float>, double>) );
+ VERIFY((is_substitution_failure<vf32<float>, vf32<int>>) );
+ VERIFY((is_substitution_failure<vf32<float>, vf32<uint>>) );
+ VERIFY((is_substitution_failure<vf32<float>, vf32<long>>) );
+ VERIFY((is_substitution_failure<vf32<float>, vf32<ulong>>) );
+ VERIFY((is_substitution_failure<vf32<float>, vf32<llong>>) );
+ VERIFY((is_substitution_failure<vf32<float>, vf32<ullong>>) );
+
+ VERIFY((is_substitution_failure<vfloat, vf32<double>>) );
+ }
+ else if constexpr (std::is_same_v<V, vdouble>)
+ { //{{{2
+ binary_op_return_type<vdouble, float, vdouble>();
+ binary_op_return_type<vdouble, schar>();
+ binary_op_return_type<vdouble, uchar>();
+ binary_op_return_type<vdouble, short>();
+ binary_op_return_type<vdouble, ushort>();
+ binary_op_return_type<vdouble, uint>();
+
+ binary_op_return_type<vf64<double>, schar>();
+ binary_op_return_type<vf64<double>, uchar>();
+ binary_op_return_type<vf64<double>, short>();
+ binary_op_return_type<vf64<double>, ushort>();
+ binary_op_return_type<vf64<double>, uint>();
+ binary_op_return_type<vf64<double>, int, vf64<double>>();
+ binary_op_return_type<vf64<double>, float, vf64<double>>();
+ binary_op_return_type<vf64<double>, double, vf64<double>>();
+ binary_op_return_type<vf64<double>, vf64<double>, vf64<double>>();
+ binary_op_return_type<vf32<double>, schar>();
+ binary_op_return_type<vf32<double>, uchar>();
+ binary_op_return_type<vf32<double>, short>();
+ binary_op_return_type<vf32<double>, ushort>();
+ binary_op_return_type<vf32<double>, uint>();
+ binary_op_return_type<vf32<double>, int, vf32<double>>();
+ binary_op_return_type<vf32<double>, float, vf32<double>>();
+ binary_op_return_type<vf32<double>, double, vf32<double>>();
+ binary_op_return_type<vf64<double>, vf64<schar>>();
+ binary_op_return_type<vf64<double>, vf64<uchar>>();
+ binary_op_return_type<vf64<double>, vf64<short>>();
+ binary_op_return_type<vf64<double>, vf64<ushort>>();
+ binary_op_return_type<vf64<double>, vf64<int>>();
+ binary_op_return_type<vf64<double>, vf64<uint>>();
+ binary_op_return_type<vf64<double>, vf64<float>>();
+
+ VERIFY((is_substitution_failure<vdouble, llong>) );
+ VERIFY((is_substitution_failure<vdouble, ullong>) );
+ VERIFY((is_substitution_failure<vdouble, vf64<schar>>) );
+ VERIFY((is_substitution_failure<vdouble, vf64<uchar>>) );
+ VERIFY((is_substitution_failure<vdouble, vf64<short>>) );
+ VERIFY((is_substitution_failure<vdouble, vf64<ushort>>) );
+ VERIFY((is_substitution_failure<vdouble, vf64<int>>) );
+ VERIFY((is_substitution_failure<vdouble, vf64<uint>>) );
+ VERIFY((is_substitution_failure<vdouble, vf64<long>>) );
+ VERIFY((is_substitution_failure<vdouble, vf64<ulong>>) );
+ VERIFY((is_substitution_failure<vdouble, vf64<llong>>) );
+ VERIFY((is_substitution_failure<vdouble, vf64<ullong>>) );
+ VERIFY((is_substitution_failure<vdouble, vf64<float>>) );
+ VERIFY((is_substitution_failure<vdouble, vf64<double>>) );
+
+ VERIFY((is_substitution_failure<vf64<double>, vdouble>) );
+ VERIFY((is_substitution_failure<vf64<double>, llong>) );
+ VERIFY((is_substitution_failure<vf64<double>, ullong>) );
+ VERIFY((is_substitution_failure<vf64<double>, vf64<llong>>) );
+ VERIFY((is_substitution_failure<vf64<double>, vf64<ullong>>) );
+
+ VERIFY((is_substitution_failure<vf32<double>, llong>) );
+ VERIFY((is_substitution_failure<vf32<double>, ullong>) );
+
+ if constexpr (sizeof(long) == sizeof(llong))
+ {
+ VERIFY((is_substitution_failure<vdouble, long>) );
+ VERIFY((is_substitution_failure<vdouble, ulong>) );
+ VERIFY((is_substitution_failure<vf64<double>, long>) );
+ VERIFY((is_substitution_failure<vf64<double>, ulong>) );
+ VERIFY((is_substitution_failure<vf64<double>, vf64<long>>) );
+ VERIFY((is_substitution_failure<vf64<double>, vf64<ulong>>) );
+ VERIFY((is_substitution_failure<vf32<double>, long>) );
+ VERIFY((is_substitution_failure<vf32<double>, ulong>) );
+ }
+ else
+ {
+ binary_op_return_type<vdouble, long>();
+ binary_op_return_type<vdouble, ulong>();
+ binary_op_return_type<vf64<double>, long>();
+ binary_op_return_type<vf64<double>, ulong>();
+ binary_op_return_type<vf64<double>, vf64<long>>();
+ binary_op_return_type<vf64<double>, vf64<ulong>>();
+ binary_op_return_type<vf32<double>, long>();
+ binary_op_return_type<vf32<double>, ulong>();
+ }
+ }
+ else if constexpr (std::is_same_v<V, vldouble>)
+ { //{{{2
+ binary_op_return_type<vldouble, schar>();
+ binary_op_return_type<vldouble, uchar>();
+ binary_op_return_type<vldouble, short>();
+ binary_op_return_type<vldouble, ushort>();
+ binary_op_return_type<vldouble, uint>();
+ binary_op_return_type<vldouble, long>();
+ binary_op_return_type<vldouble, ulong>();
+ binary_op_return_type<vldouble, float>();
+ binary_op_return_type<vldouble, double>();
+
+ binary_op_return_type<vf64<long double>, schar>();
+ binary_op_return_type<vf64<long double>, uchar>();
+ binary_op_return_type<vf64<long double>, short>();
+ binary_op_return_type<vf64<long double>, ushort>();
+ binary_op_return_type<vf64<long double>, int>();
+ binary_op_return_type<vf64<long double>, uint>();
+ binary_op_return_type<vf64<long double>, long>();
+ binary_op_return_type<vf64<long double>, ulong>();
+ binary_op_return_type<vf64<long double>, float>();
+ binary_op_return_type<vf64<long double>, double>();
+ binary_op_return_type<vf64<long double>, vf64<long double>>();
+
+ using std::experimental::simd;
+ using A = simd_abi::fixed_size<vldouble::size()>;
+ binary_op_return_type<simd<long double, A>, schar>();
+ binary_op_return_type<simd<long double, A>, uchar>();
+ binary_op_return_type<simd<long double, A>, short>();
+ binary_op_return_type<simd<long double, A>, ushort>();
+ binary_op_return_type<simd<long double, A>, int>();
+ binary_op_return_type<simd<long double, A>, uint>();
+ binary_op_return_type<simd<long double, A>, long>();
+ binary_op_return_type<simd<long double, A>, ulong>();
+ binary_op_return_type<simd<long double, A>, float>();
+ binary_op_return_type<simd<long double, A>, double>();
+
+ if constexpr (sizeof(ldouble) == sizeof(double))
+ {
+ VERIFY((is_substitution_failure<vldouble, llong>) );
+ VERIFY((is_substitution_failure<vldouble, ullong>) );
+ VERIFY((is_substitution_failure<vf64<ldouble>, llong>) );
+ VERIFY((is_substitution_failure<vf64<ldouble>, ullong>) );
+ VERIFY((is_substitution_failure<simd<ldouble, A>, llong>) );
+ VERIFY((is_substitution_failure<simd<ldouble, A>, ullong>) );
+ }
+ else
+ {
+ binary_op_return_type<vldouble, llong>();
+ binary_op_return_type<vldouble, ullong>();
+ binary_op_return_type<vf64<long double>, llong>();
+ binary_op_return_type<vf64<long double>, ullong>();
+ binary_op_return_type<simd<long double, A>, llong>();
+ binary_op_return_type<simd<long double, A>, ullong>();
+ }
+
+ VERIFY((is_substitution_failure<vf64<long double>, vldouble>) );
+ COMPARE((is_substitution_failure<simd<long double, A>, vldouble>),
+ (!std::is_same<A, vldouble::abi_type>::value));
+ }
+ else if constexpr (std::is_same_v<V, vlong>)
+ { //{{{2
+ VERIFY((is_substitution_failure<vi32<long>, double>) );
+ VERIFY((is_substitution_failure<vi32<long>, float>) );
+ VERIFY((is_substitution_failure<vi32<long>, vi32<float>>) );
+ if constexpr (sizeof(long) == sizeof(llong))
+ {
+ binary_op_return_type<vlong, uint>();
+ binary_op_return_type<vlong, llong>();
+ binary_op_return_type<vi32<long>, uint>();
+ binary_op_return_type<vi32<long>, llong>();
+ binary_op_return_type<vi64<long>, uint>();
+ binary_op_return_type<vi64<long>, llong>();
+ binary_op_return_type<vi32<long>, vi32<uint>>();
+ binary_op_return_type<vi64<long>, vi64<uint>>();
+ VERIFY((is_substitution_failure<vi32<long>, vi32<double>>) );
+ VERIFY((is_substitution_failure<vi64<long>, vi64<double>>) );
+ }
+ else
+ {
+ VERIFY((is_substitution_failure<vlong, uint>) );
+ VERIFY((is_substitution_failure<vlong, llong>) );
+ VERIFY((is_substitution_failure<vi32<long>, uint>) );
+ VERIFY((is_substitution_failure<vi32<long>, llong>) );
+ VERIFY((is_substitution_failure<vi64<long>, uint>) );
+ VERIFY((is_substitution_failure<vi64<long>, llong>) );
+ VERIFY((is_substitution_failure<vi32<long>, vi32<uint>>) );
+ VERIFY((is_substitution_failure<vi64<long>, vi64<uint>>) );
+ binary_op_return_type<vi32<double>, vi32<long>>();
+ binary_op_return_type<vi64<double>, vi64<long>>();
+ }
+
+ binary_op_return_type<vlong, schar, vlong>();
+ binary_op_return_type<vlong, uchar, vlong>();
+ binary_op_return_type<vlong, short, vlong>();
+ binary_op_return_type<vlong, ushort, vlong>();
+
+ binary_op_return_type<vi32<long>, schar, vi32<long>>();
+ binary_op_return_type<vi32<long>, uchar, vi32<long>>();
+ binary_op_return_type<vi32<long>, short, vi32<long>>();
+ binary_op_return_type<vi32<long>, ushort, vi32<long>>();
+ binary_op_return_type<vi32<long>, int, vi32<long>>();
+ binary_op_return_type<vi32<long>, long, vi32<long>>();
+ binary_op_return_type<vi32<long>, vi32<long>, vi32<long>>();
+ binary_op_return_type<vi64<long>, schar, vi64<long>>();
+ binary_op_return_type<vi64<long>, uchar, vi64<long>>();
+ binary_op_return_type<vi64<long>, short, vi64<long>>();
+ binary_op_return_type<vi64<long>, ushort, vi64<long>>();
+ binary_op_return_type<vi64<long>, int, vi64<long>>();
+ binary_op_return_type<vi64<long>, long, vi64<long>>();
+ binary_op_return_type<vi64<long>, vi64<long>, vi64<long>>();
+
+ VERIFY((is_substitution_failure<vlong, vulong>) );
+ VERIFY((is_substitution_failure<vlong, ulong>) );
+ VERIFY((is_substitution_failure<vlong, ullong>) );
+ VERIFY((is_substitution_failure<vlong, float>) );
+ VERIFY((is_substitution_failure<vlong, double>) );
+ VERIFY((is_substitution_failure<vlong, vl<schar>>) );
+ VERIFY((is_substitution_failure<vlong, vl<uchar>>) );
+ VERIFY((is_substitution_failure<vlong, vl<short>>) );
+ VERIFY((is_substitution_failure<vlong, vl<ushort>>) );
+ VERIFY((is_substitution_failure<vlong, vl<int>>) );
+ VERIFY((is_substitution_failure<vlong, vl<uint>>) );
+ VERIFY((is_substitution_failure<vlong, vl<long>>) );
+ VERIFY((is_substitution_failure<vlong, vl<ulong>>) );
+ VERIFY((is_substitution_failure<vlong, vl<llong>>) );
+ VERIFY((is_substitution_failure<vlong, vl<ullong>>) );
+ VERIFY((is_substitution_failure<vlong, vl<float>>) );
+ VERIFY((is_substitution_failure<vlong, vl<double>>) );
+ VERIFY((is_substitution_failure<vl<long>, vlong>) );
+ VERIFY((is_substitution_failure<vl<long>, vulong>) );
+ VERIFY((is_substitution_failure<vi32<long>, ulong>) );
+ VERIFY((is_substitution_failure<vi32<long>, ullong>) );
+ binary_op_return_type<vi32<long>, vi32<schar>>();
+ binary_op_return_type<vi32<long>, vi32<uchar>>();
+ binary_op_return_type<vi32<long>, vi32<short>>();
+ binary_op_return_type<vi32<long>, vi32<ushort>>();
+ binary_op_return_type<vi32<long>, vi32<int>>();
+ VERIFY((is_substitution_failure<vi32<long>, vi32<ulong>>) );
+ VERIFY((is_substitution_failure<vi32<long>, vi32<ullong>>) );
+ VERIFY((is_substitution_failure<vi64<long>, ulong>) );
+ VERIFY((is_substitution_failure<vi64<long>, ullong>) );
+ VERIFY((is_substitution_failure<vi64<long>, float>) );
+ VERIFY((is_substitution_failure<vi64<long>, double>) );
+ binary_op_return_type<vi64<long>, vi64<schar>>();
+ binary_op_return_type<vi64<long>, vi64<uchar>>();
+ binary_op_return_type<vi64<long>, vi64<short>>();
+ binary_op_return_type<vi64<long>, vi64<ushort>>();
+ binary_op_return_type<vi64<long>, vi64<int>>();
+ VERIFY((is_substitution_failure<vi64<long>, vi64<ulong>>) );
+ VERIFY((is_substitution_failure<vi64<long>, vi64<ullong>>) );
+ VERIFY((is_substitution_failure<vi64<long>, vi64<float>>) );
+
+ binary_op_return_type<vi32<llong>, vi32<long>>();
+ binary_op_return_type<vi64<llong>, vi64<long>>();
+ }
+ else if constexpr (std::is_same_v<V, vulong>)
+ { //{{{2
+ if constexpr (sizeof(long) == sizeof(llong))
+ {
+ binary_op_return_type<vulong, ullong, vulong>();
+ binary_op_return_type<vi32<ulong>, ullong, vi32<ulong>>();
+ binary_op_return_type<vi64<ulong>, ullong, vi64<ulong>>();
+ VERIFY((is_substitution_failure<vi32<ulong>, vi32<llong>>) );
+ VERIFY((is_substitution_failure<vi32<ulong>, vi32<double>>) );
+ VERIFY((is_substitution_failure<vi64<ulong>, vi64<llong>>) );
+ VERIFY((is_substitution_failure<vi64<ulong>, vi64<double>>) );
+ }
+ else
+ {
+ VERIFY((is_substitution_failure<vulong, ullong>) );
+ VERIFY((is_substitution_failure<vi32<ulong>, ullong>) );
+ VERIFY((is_substitution_failure<vi64<ulong>, ullong>) );
+ binary_op_return_type<vi32<llong>, vi32<ulong>>();
+ binary_op_return_type<vi32<double>, vi32<ulong>>();
+ binary_op_return_type<vi64<llong>, vi64<ulong>>();
+ binary_op_return_type<vi64<double>, vi64<ulong>>();
+ }
+
+ binary_op_return_type<vulong, uchar, vulong>();
+ binary_op_return_type<vulong, ushort, vulong>();
+ binary_op_return_type<vulong, uint, vulong>();
+ binary_op_return_type<vi32<ulong>, uchar, vi32<ulong>>();
+ binary_op_return_type<vi32<ulong>, ushort, vi32<ulong>>();
+ binary_op_return_type<vi32<ulong>, int, vi32<ulong>>();
+ binary_op_return_type<vi32<ulong>, uint, vi32<ulong>>();
+ binary_op_return_type<vi32<ulong>, ulong, vi32<ulong>>();
+ binary_op_return_type<vi32<ulong>, vi32<ulong>, vi32<ulong>>();
+ binary_op_return_type<vi64<ulong>, uchar, vi64<ulong>>();
+ binary_op_return_type<vi64<ulong>, ushort, vi64<ulong>>();
+ binary_op_return_type<vi64<ulong>, int, vi64<ulong>>();
+ binary_op_return_type<vi64<ulong>, uint, vi64<ulong>>();
+ binary_op_return_type<vi64<ulong>, ulong, vi64<ulong>>();
+ binary_op_return_type<vi64<ulong>, vi64<ulong>, vi64<ulong>>();
+
+ VERIFY((is_substitution_failure<vi32<ulong>, llong>) );
+ VERIFY((is_substitution_failure<vi32<ulong>, float>) );
+ VERIFY((is_substitution_failure<vi32<ulong>, double>) );
+ VERIFY((is_substitution_failure<vi32<ulong>, vi32<float>>) );
+ VERIFY((is_substitution_failure<vi64<ulong>, vi64<float>>) );
+ VERIFY((is_substitution_failure<vulong, schar>) );
+ VERIFY((is_substitution_failure<vulong, short>) );
+ VERIFY((is_substitution_failure<vulong, vlong>) );
+ VERIFY((is_substitution_failure<vulong, long>) );
+ VERIFY((is_substitution_failure<vulong, llong>) );
+ VERIFY((is_substitution_failure<vulong, float>) );
+ VERIFY((is_substitution_failure<vulong, double>) );
+ VERIFY((is_substitution_failure<vulong, vl<schar>>) );
+ VERIFY((is_substitution_failure<vulong, vl<uchar>>) );
+ VERIFY((is_substitution_failure<vulong, vl<short>>) );
+ VERIFY((is_substitution_failure<vulong, vl<ushort>>) );
+ VERIFY((is_substitution_failure<vulong, vl<int>>) );
+ VERIFY((is_substitution_failure<vulong, vl<uint>>) );
+ VERIFY((is_substitution_failure<vulong, vl<long>>) );
+ VERIFY((is_substitution_failure<vulong, vl<ulong>>) );
+ VERIFY((is_substitution_failure<vulong, vl<llong>>) );
+ VERIFY((is_substitution_failure<vulong, vl<ullong>>) );
+ VERIFY((is_substitution_failure<vulong, vl<float>>) );
+ VERIFY((is_substitution_failure<vulong, vl<double>>) );
+ VERIFY((is_substitution_failure<vl<ulong>, vlong>) );
+ VERIFY((is_substitution_failure<vl<ulong>, vulong>) );
+ VERIFY((is_substitution_failure<vi32<ulong>, schar>) );
+ VERIFY((is_substitution_failure<vi32<ulong>, short>) );
+ VERIFY((is_substitution_failure<vi32<ulong>, long>) );
+ VERIFY((is_substitution_failure<vi32<ulong>, vi32<schar>>) );
+ binary_op_return_type<vi32<ulong>, vi32<uchar>>();
+ VERIFY((is_substitution_failure<vi32<ulong>, vi32<short>>) );
+ binary_op_return_type<vi32<ulong>, vi32<ushort>>();
+ VERIFY((is_substitution_failure<vi32<ulong>, vi32<int>>) );
+ binary_op_return_type<vi32<ulong>, vi32<uint>>();
+ VERIFY((is_substitution_failure<vi32<ulong>, vi32<long>>) );
+ binary_op_return_type<vi32<ullong>, vi32<ulong>>();
+ VERIFY((is_substitution_failure<vi64<ulong>, schar>) );
+ VERIFY((is_substitution_failure<vi64<ulong>, short>) );
+ VERIFY((is_substitution_failure<vi64<ulong>, long>) );
+ VERIFY((is_substitution_failure<vi64<ulong>, llong>) );
+ VERIFY((is_substitution_failure<vi64<ulong>, float>) );
+ VERIFY((is_substitution_failure<vi64<ulong>, double>) );
+ VERIFY((is_substitution_failure<vi64<ulong>, vi64<schar>>) );
+ binary_op_return_type<vi64<ulong>, vi64<uchar>>();
+ VERIFY((is_substitution_failure<vi64<ulong>, vi64<short>>) );
+ binary_op_return_type<vi64<ulong>, vi64<ushort>>();
+ VERIFY((is_substitution_failure<vi64<ulong>, vi64<int>>) );
+ binary_op_return_type<vi64<ulong>, vi64<uint>>();
+ VERIFY((is_substitution_failure<vi64<ulong>, vi64<long>>) );
+ binary_op_return_type<vi64<ullong>, vi64<ulong>>();
+ }
+ else if constexpr (std::is_same_v<V, vllong>)
+ { //{{{2
+ binary_op_return_type<vllong, schar, vllong>();
+ binary_op_return_type<vllong, uchar, vllong>();
+ binary_op_return_type<vllong, short, vllong>();
+ binary_op_return_type<vllong, ushort, vllong>();
+ binary_op_return_type<vllong, uint, vllong>();
+ binary_op_return_type<vllong, long, vllong>();
+ binary_op_return_type<vi32<llong>, schar, vi32<llong>>();
+ binary_op_return_type<vi32<llong>, uchar, vi32<llong>>();
+ binary_op_return_type<vi32<llong>, short, vi32<llong>>();
+ binary_op_return_type<vi32<llong>, ushort, vi32<llong>>();
+ binary_op_return_type<vi32<llong>, int, vi32<llong>>();
+ binary_op_return_type<vi32<llong>, uint, vi32<llong>>();
+ binary_op_return_type<vi32<llong>, long, vi32<llong>>();
+ binary_op_return_type<vi32<llong>, llong, vi32<llong>>();
+ binary_op_return_type<vi32<llong>, vi32<llong>, vi32<llong>>();
+ binary_op_return_type<vi64<llong>, schar, vi64<llong>>();
+ binary_op_return_type<vi64<llong>, uchar, vi64<llong>>();
+ binary_op_return_type<vi64<llong>, short, vi64<llong>>();
+ binary_op_return_type<vi64<llong>, ushort, vi64<llong>>();
+ binary_op_return_type<vi64<llong>, int, vi64<llong>>();
+ binary_op_return_type<vi64<llong>, uint, vi64<llong>>();
+ binary_op_return_type<vi64<llong>, long, vi64<llong>>();
+ binary_op_return_type<vi64<llong>, llong, vi64<llong>>();
+ binary_op_return_type<vi64<llong>, vi64<llong>>();
+ binary_op_return_type<vi32<llong>, vi32<schar>>();
+ binary_op_return_type<vi32<llong>, vi32<uchar>>();
+ binary_op_return_type<vi32<llong>, vi32<short>>();
+ binary_op_return_type<vi32<llong>, vi32<ushort>>();
+ binary_op_return_type<vi32<llong>, vi32<int>>();
+ binary_op_return_type<vi32<llong>, vi32<uint>>();
+ binary_op_return_type<vi32<llong>, vi32<long>>();
+ if constexpr (sizeof(long) == sizeof(llong))
+ {
+ VERIFY((is_substitution_failure<vi32<llong>, vi32<ulong>>) );
+ VERIFY((is_substitution_failure<vi32<llong>, ulong>) );
+ VERIFY((is_substitution_failure<vi64<llong>, ulong>) );
+ VERIFY((is_substitution_failure<vllong, ulong>) );
+ }
+ else
+ {
+ binary_op_return_type<vi32<llong>, vi32<ulong>>();
+ binary_op_return_type<vi32<llong>, ulong>();
+ binary_op_return_type<vi64<llong>, ulong>();
+ binary_op_return_type<vllong, ulong>();
+ }
+
+ VERIFY((is_substitution_failure<vllong, vullong>) );
+ VERIFY((is_substitution_failure<vllong, ullong>) );
+ VERIFY((is_substitution_failure<vllong, float>) );
+ VERIFY((is_substitution_failure<vllong, double>) );
+ VERIFY((is_substitution_failure<vllong, vi64<schar>>) );
+ VERIFY((is_substitution_failure<vllong, vi64<uchar>>) );
+ VERIFY((is_substitution_failure<vllong, vi64<short>>) );
+ VERIFY((is_substitution_failure<vllong, vi64<ushort>>) );
+ VERIFY((is_substitution_failure<vllong, vi64<int>>) );
+ VERIFY((is_substitution_failure<vllong, vi64<uint>>) );
+ VERIFY((is_substitution_failure<vllong, vi64<long>>) );
+ VERIFY((is_substitution_failure<vllong, vi64<ulong>>) );
+ VERIFY((is_substitution_failure<vllong, vi64<llong>>) );
+ VERIFY((is_substitution_failure<vllong, vi64<ullong>>) );
+ VERIFY((is_substitution_failure<vllong, vi64<float>>) );
+ VERIFY((is_substitution_failure<vllong, vi64<double>>) );
+ VERIFY((is_substitution_failure<vi32<llong>, ullong>) );
+ VERIFY((is_substitution_failure<vi32<llong>, float>) );
+ VERIFY((is_substitution_failure<vi32<llong>, double>) );
+ VERIFY((is_substitution_failure<vi32<llong>, vi32<ullong>>) );
+ VERIFY((is_substitution_failure<vi32<llong>, vi32<float>>) );
+ VERIFY((is_substitution_failure<vi32<llong>, vi32<double>>) );
+ VERIFY((is_substitution_failure<vi64<llong>, vllong>) );
+ VERIFY((is_substitution_failure<vi64<llong>, vullong>) );
+ VERIFY((is_substitution_failure<vi64<llong>, ullong>) );
+ VERIFY((is_substitution_failure<vi64<llong>, float>) );
+ VERIFY((is_substitution_failure<vi64<llong>, double>) );
+ binary_op_return_type<vi64<llong>, vi64<schar>>();
+ binary_op_return_type<vi64<llong>, vi64<uchar>>();
+ binary_op_return_type<vi64<llong>, vi64<short>>();
+ binary_op_return_type<vi64<llong>, vi64<ushort>>();
+ binary_op_return_type<vi64<llong>, vi64<int>>();
+ binary_op_return_type<vi64<llong>, vi64<uint>>();
+ binary_op_return_type<vi64<llong>, vi64<long>>();
+ if constexpr (sizeof(long) == sizeof(llong))
+ {
+ VERIFY((is_substitution_failure<vi64<llong>, vi64<ulong>>) );
+ }
+ else
+ {
+ binary_op_return_type<vi64<llong>, vi64<ulong>>();
+ }
+ VERIFY((is_substitution_failure<vi64<llong>, vi64<ullong>>) );
+ VERIFY((is_substitution_failure<vi64<llong>, vi64<float>>) );
+ VERIFY((is_substitution_failure<vi64<llong>, vi64<double>>) );
+ }
+ else if constexpr (std::is_same_v<V, vullong>)
+ { //{{{2
+ binary_op_return_type<vullong, uchar, vullong>();
+ binary_op_return_type<vullong, ushort, vullong>();
+ binary_op_return_type<vullong, uint, vullong>();
+ binary_op_return_type<vullong, ulong, vullong>();
+ binary_op_return_type<vi32<ullong>, uchar, vi32<ullong>>();
+ binary_op_return_type<vi32<ullong>, ushort, vi32<ullong>>();
+ binary_op_return_type<vi32<ullong>, int, vi32<ullong>>();
+ binary_op_return_type<vi32<ullong>, uint, vi32<ullong>>();
+ binary_op_return_type<vi32<ullong>, ulong, vi32<ullong>>();
+ binary_op_return_type<vi32<ullong>, ullong, vi32<ullong>>();
+ binary_op_return_type<vi32<ullong>, vi32<ullong>, vi32<ullong>>();
+ binary_op_return_type<vi64<ullong>, uchar, vi64<ullong>>();
+ binary_op_return_type<vi64<ullong>, ushort, vi64<ullong>>();
+ binary_op_return_type<vi64<ullong>, int, vi64<ullong>>();
+ binary_op_return_type<vi64<ullong>, uint, vi64<ullong>>();
+ binary_op_return_type<vi64<ullong>, ulong, vi64<ullong>>();
+ binary_op_return_type<vi64<ullong>, ullong, vi64<ullong>>();
+ binary_op_return_type<vi64<ullong>, vi64<ullong>, vi64<ullong>>();
+
+ VERIFY((is_substitution_failure<vullong, schar>) );
+ VERIFY((is_substitution_failure<vullong, short>) );
+ VERIFY((is_substitution_failure<vullong, long>) );
+ VERIFY((is_substitution_failure<vullong, llong>) );
+ VERIFY((is_substitution_failure<vullong, vllong>) );
+ VERIFY((is_substitution_failure<vullong, float>) );
+ VERIFY((is_substitution_failure<vullong, double>) );
+ VERIFY((is_substitution_failure<vullong, vi64<schar>>) );
+ VERIFY((is_substitution_failure<vullong, vi64<uchar>>) );
+ VERIFY((is_substitution_failure<vullong, vi64<short>>) );
+ VERIFY((is_substitution_failure<vullong, vi64<ushort>>) );
+ VERIFY((is_substitution_failure<vullong, vi64<int>>) );
+ VERIFY((is_substitution_failure<vullong, vi64<uint>>) );
+ VERIFY((is_substitution_failure<vullong, vi64<long>>) );
+ VERIFY((is_substitution_failure<vullong, vi64<ulong>>) );
+ VERIFY((is_substitution_failure<vullong, vi64<llong>>) );
+ VERIFY((is_substitution_failure<vullong, vi64<ullong>>) );
+ VERIFY((is_substitution_failure<vullong, vi64<float>>) );
+ VERIFY((is_substitution_failure<vullong, vi64<double>>) );
+ VERIFY((is_substitution_failure<vi32<ullong>, schar>) );
+ VERIFY((is_substitution_failure<vi32<ullong>, short>) );
+ VERIFY((is_substitution_failure<vi32<ullong>, long>) );
+ VERIFY((is_substitution_failure<vi32<ullong>, llong>) );
+ VERIFY((is_substitution_failure<vi32<ullong>, float>) );
+ VERIFY((is_substitution_failure<vi32<ullong>, double>) );
+ VERIFY((is_substitution_failure<vi32<ullong>, vi32<schar>>) );
+ binary_op_return_type<vi32<ullong>, vi32<uchar>>();
+ VERIFY((is_substitution_failure<vi32<ullong>, vi32<short>>) );
+ binary_op_return_type<vi32<ullong>, vi32<ushort>>();
+ VERIFY((is_substitution_failure<vi32<ullong>, vi32<int>>) );
+ binary_op_return_type<vi32<ullong>, vi32<uint>>();
+ VERIFY((is_substitution_failure<vi32<ullong>, vi32<long>>) );
+ binary_op_return_type<vi32<ullong>, vi32<ulong>>();
+ VERIFY((is_substitution_failure<vi32<ullong>, vi32<llong>>) );
+ VERIFY((is_substitution_failure<vi32<ullong>, vi32<float>>) );
+ VERIFY((is_substitution_failure<vi32<ullong>, vi32<double>>) );
+ VERIFY((is_substitution_failure<vi64<ullong>, schar>) );
+ VERIFY((is_substitution_failure<vi64<ullong>, short>) );
+ VERIFY((is_substitution_failure<vi64<ullong>, long>) );
+ VERIFY((is_substitution_failure<vi64<ullong>, llong>) );
+ VERIFY((is_substitution_failure<vi64<ullong>, vllong>) );
+ VERIFY((is_substitution_failure<vi64<ullong>, vullong>) );
+ VERIFY((is_substitution_failure<vi64<ullong>, float>) );
+ VERIFY((is_substitution_failure<vi64<ullong>, double>) );
+ VERIFY((is_substitution_failure<vi64<ullong>, vi64<schar>>) );
+ binary_op_return_type<vi64<ullong>, vi64<uchar>>();
+ VERIFY((is_substitution_failure<vi64<ullong>, vi64<short>>) );
+ binary_op_return_type<vi64<ullong>, vi64<ushort>>();
+ VERIFY((is_substitution_failure<vi64<ullong>, vi64<int>>) );
+ binary_op_return_type<vi64<ullong>, vi64<uint>>();
+ VERIFY((is_substitution_failure<vi64<ullong>, vi64<long>>) );
+ binary_op_return_type<vi64<ullong>, vi64<ulong>>();
+ VERIFY((is_substitution_failure<vi64<ullong>, vi64<llong>>) );
+ VERIFY((is_substitution_failure<vi64<ullong>, vi64<float>>) );
+ VERIFY((is_substitution_failure<vi64<ullong>, vi64<double>>) );
+ }
+ else if constexpr (std::is_same_v<V, vint>)
+ { //{{{2
+ binary_op_return_type<vint, schar, vint>();
+ binary_op_return_type<vint, uchar, vint>();
+ binary_op_return_type<vint, short, vint>();
+ binary_op_return_type<vint, ushort, vint>();
+ binary_op_return_type<vi32<int>, schar, vi32<int>>();
+ binary_op_return_type<vi32<int>, uchar, vi32<int>>();
+ binary_op_return_type<vi32<int>, short, vi32<int>>();
+ binary_op_return_type<vi32<int>, ushort, vi32<int>>();
+ binary_op_return_type<vi32<int>, int, vi32<int>>();
+ binary_op_return_type<vi32<int>, vi32<int>, vi32<int>>();
+ binary_op_return_type<vi32<int>, vi32<schar>>();
+ binary_op_return_type<vi32<int>, vi32<uchar>>();
+ binary_op_return_type<vi32<int>, vi32<short>>();
+ binary_op_return_type<vi32<int>, vi32<ushort>>();
+
+ binary_op_return_type<vi32<llong>, vi32<int>>();
+ binary_op_return_type<vi32<double>, vi32<int>>();
+
+ // Order is important for MSVC. That compiler considers operators from
+ // unrelated simd template instantiations as candidates, but only after
+ // they have been tested. For example, vi32<int> + llong will produce a
+ // vi32<llong> if a vi32<llong> operator test is done before the
+ // vi32<int> + llong test.
+ VERIFY((is_substitution_failure<vi32<int>, double>) );
+ VERIFY((is_substitution_failure<vi32<int>, float>) );
+ VERIFY((is_substitution_failure<vi32<int>, llong>) );
+ VERIFY((is_substitution_failure<vi32<int>, vi32<float>>) );
+ VERIFY((is_substitution_failure<vint, vuint>) );
+ VERIFY((is_substitution_failure<vint, uint>) );
+ VERIFY((is_substitution_failure<vint, ulong>) );
+ VERIFY((is_substitution_failure<vint, llong>) );
+ VERIFY((is_substitution_failure<vint, ullong>) );
+ VERIFY((is_substitution_failure<vint, float>) );
+ VERIFY((is_substitution_failure<vint, double>) );
+ VERIFY((is_substitution_failure<vint, vi32<schar>>) );
+ VERIFY((is_substitution_failure<vint, vi32<uchar>>) );
+ VERIFY((is_substitution_failure<vint, vi32<short>>) );
+ VERIFY((is_substitution_failure<vint, vi32<ushort>>) );
+ VERIFY((is_substitution_failure<vint, vi32<int>>) );
+ VERIFY((is_substitution_failure<vint, vi32<uint>>) );
+ VERIFY((is_substitution_failure<vint, vi32<long>>) );
+ VERIFY((is_substitution_failure<vint, vi32<ulong>>) );
+ VERIFY((is_substitution_failure<vint, vi32<llong>>) );
+ VERIFY((is_substitution_failure<vint, vi32<ullong>>) );
+ VERIFY((is_substitution_failure<vint, vi32<float>>) );
+ VERIFY((is_substitution_failure<vint, vi32<double>>) );
+ VERIFY((is_substitution_failure<vi32<int>, vint>) );
+ VERIFY((is_substitution_failure<vi32<int>, vuint>) );
+ VERIFY((is_substitution_failure<vi32<int>, uint>) );
+ VERIFY((is_substitution_failure<vi32<int>, ulong>) );
+ VERIFY((is_substitution_failure<vi32<int>, ullong>) );
+ VERIFY((is_substitution_failure<vi32<int>, vi32<uint>>) );
+ VERIFY((is_substitution_failure<vi32<int>, vi32<ulong>>) );
+ VERIFY((is_substitution_failure<vi32<int>, vi32<ullong>>) );
+
+ binary_op_return_type<vi32<long>, vi32<int>>();
+ if constexpr (sizeof(long) == sizeof(llong))
+ {
+ VERIFY((is_substitution_failure<vint, long>) );
+ VERIFY((is_substitution_failure<vi32<int>, long>) );
+ }
+ else
+ {
+ binary_op_return_type<vint, long>();
+ binary_op_return_type<vi32<int>, long>();
+ }
+ }
+ else if constexpr (std::is_same_v<V, vuint>)
+ { //{{{2
+ VERIFY((is_substitution_failure<vi32<uint>, llong>) );
+ VERIFY((is_substitution_failure<vi32<uint>, ullong>) );
+ VERIFY((is_substitution_failure<vi32<uint>, float>) );
+ VERIFY((is_substitution_failure<vi32<uint>, double>) );
+ VERIFY((is_substitution_failure<vi32<uint>, vi32<float>>) );
+
+ binary_op_return_type<vuint, uchar, vuint>();
+ binary_op_return_type<vuint, ushort, vuint>();
+ binary_op_return_type<vi32<uint>, uchar, vi32<uint>>();
+ binary_op_return_type<vi32<uint>, ushort, vi32<uint>>();
+ binary_op_return_type<vi32<uint>, int, vi32<uint>>();
+ binary_op_return_type<vi32<uint>, uint, vi32<uint>>();
+ binary_op_return_type<vi32<uint>, vi32<uint>, vi32<uint>>();
+ binary_op_return_type<vi32<uint>, vi32<uchar>>();
+ binary_op_return_type<vi32<uint>, vi32<ushort>>();
+
+ binary_op_return_type<vi32<llong>, vi32<uint>>();
+ binary_op_return_type<vi32<ullong>, vi32<uint>>();
+ binary_op_return_type<vi32<double>, vi32<uint>>();
+
+ VERIFY((is_substitution_failure<vuint, schar>) );
+ VERIFY((is_substitution_failure<vuint, short>) );
+ VERIFY((is_substitution_failure<vuint, vint>) );
+ VERIFY((is_substitution_failure<vuint, long>) );
+ VERIFY((is_substitution_failure<vuint, llong>) );
+ VERIFY((is_substitution_failure<vuint, ullong>) );
+ VERIFY((is_substitution_failure<vuint, float>) );
+ VERIFY((is_substitution_failure<vuint, double>) );
+ VERIFY((is_substitution_failure<vuint, vi32<schar>>) );
+ VERIFY((is_substitution_failure<vuint, vi32<uchar>>) );
+ VERIFY((is_substitution_failure<vuint, vi32<short>>) );
+ VERIFY((is_substitution_failure<vuint, vi32<ushort>>) );
+ VERIFY((is_substitution_failure<vuint, vi32<int>>) );
+ VERIFY((is_substitution_failure<vuint, vi32<uint>>) );
+ VERIFY((is_substitution_failure<vuint, vi32<long>>) );
+ VERIFY((is_substitution_failure<vuint, vi32<ulong>>) );
+ VERIFY((is_substitution_failure<vuint, vi32<llong>>) );
+ VERIFY((is_substitution_failure<vuint, vi32<ullong>>) );
+ VERIFY((is_substitution_failure<vuint, vi32<float>>) );
+ VERIFY((is_substitution_failure<vuint, vi32<double>>) );
+ VERIFY((is_substitution_failure<vi32<uint>, schar>) );
+ VERIFY((is_substitution_failure<vi32<uint>, short>) );
+ VERIFY((is_substitution_failure<vi32<uint>, vint>) );
+ VERIFY((is_substitution_failure<vi32<uint>, vuint>) );
+ VERIFY((is_substitution_failure<vi32<uint>, long>) );
+ VERIFY((is_substitution_failure<vi32<uint>, vi32<schar>>) );
+ VERIFY((is_substitution_failure<vi32<uint>, vi32<short>>) );
+ VERIFY((is_substitution_failure<vi32<uint>, vi32<int>>) );
+
+ binary_op_return_type<vi32<ulong>, vi32<uint>>();
+ if constexpr (sizeof(long) == sizeof(llong))
+ {
+ VERIFY((is_substitution_failure<vuint, ulong>) );
+ VERIFY((is_substitution_failure<vi32<uint>, ulong>) );
+ binary_op_return_type<vi32<long>, vi32<uint>>();
+ }
+ else
+ {
+ binary_op_return_type<vuint, ulong>();
+ binary_op_return_type<vi32<uint>, ulong>();
+ VERIFY((is_substitution_failure<vi32<uint>, vi32<long>>) );
+ }
+ }
+ else if constexpr (std::is_same_v<V, vshort>)
+ { //{{{2
+ binary_op_return_type<vshort, schar, vshort>();
+ binary_op_return_type<vshort, uchar, vshort>();
+ binary_op_return_type<vi16<short>, schar, vi16<short>>();
+ binary_op_return_type<vi16<short>, uchar, vi16<short>>();
+ binary_op_return_type<vi16<short>, short, vi16<short>>();
+ binary_op_return_type<vi16<short>, int, vi16<short>>();
+ binary_op_return_type<vi16<short>, vi16<schar>>();
+ binary_op_return_type<vi16<short>, vi16<uchar>>();
+ binary_op_return_type<vi16<short>, vi16<short>>();
+
+ binary_op_return_type<vi16<int>, vi16<short>>();
+ binary_op_return_type<vi16<long>, vi16<short>>();
+ binary_op_return_type<vi16<llong>, vi16<short>>();
+ binary_op_return_type<vi16<float>, vi16<short>>();
+ binary_op_return_type<vi16<double>, vi16<short>>();
+
+ VERIFY((is_substitution_failure<vi16<short>, double>) );
+ VERIFY((is_substitution_failure<vi16<short>, llong>) );
+ VERIFY((is_substitution_failure<vshort, vushort>) );
+ VERIFY((is_substitution_failure<vshort, ushort>) );
+ VERIFY((is_substitution_failure<vshort, uint>) );
+ VERIFY((is_substitution_failure<vshort, long>) );
+ VERIFY((is_substitution_failure<vshort, ulong>) );
+ VERIFY((is_substitution_failure<vshort, llong>) );
+ VERIFY((is_substitution_failure<vshort, ullong>) );
+ VERIFY((is_substitution_failure<vshort, float>) );
+ VERIFY((is_substitution_failure<vshort, double>) );
+ VERIFY((is_substitution_failure<vshort, vi16<schar>>) );
+ VERIFY((is_substitution_failure<vshort, vi16<uchar>>) );
+ VERIFY((is_substitution_failure<vshort, vi16<short>>) );
+ VERIFY((is_substitution_failure<vshort, vi16<ushort>>) );
+ VERIFY((is_substitution_failure<vshort, vi16<int>>) );
+ VERIFY((is_substitution_failure<vshort, vi16<uint>>) );
+ VERIFY((is_substitution_failure<vshort, vi16<long>>) );
+ VERIFY((is_substitution_failure<vshort, vi16<ulong>>) );
+ VERIFY((is_substitution_failure<vshort, vi16<llong>>) );
+ VERIFY((is_substitution_failure<vshort, vi16<ullong>>) );
+ VERIFY((is_substitution_failure<vshort, vi16<float>>) );
+ VERIFY((is_substitution_failure<vshort, vi16<double>>) );
+ VERIFY((is_substitution_failure<vi16<short>, vshort>) );
+ VERIFY((is_substitution_failure<vi16<short>, vushort>) );
+ VERIFY((is_substitution_failure<vi16<short>, ushort>) );
+ VERIFY((is_substitution_failure<vi16<short>, uint>) );
+ VERIFY((is_substitution_failure<vi16<short>, long>) );
+ VERIFY((is_substitution_failure<vi16<short>, ulong>) );
+ VERIFY((is_substitution_failure<vi16<short>, ullong>) );
+ VERIFY((is_substitution_failure<vi16<short>, float>) );
+ VERIFY((is_substitution_failure<vi16<short>, vi16<ushort>>) );
+ VERIFY((is_substitution_failure<vi16<short>, vi16<uint>>) );
+ VERIFY((is_substitution_failure<vi16<short>, vi16<ulong>>) );
+ VERIFY((is_substitution_failure<vi16<short>, vi16<ullong>>) );
+ }
+ else if constexpr (std::is_same_v<V, vushort>)
+ { //{{{2
+ binary_op_return_type<vushort, uchar, vushort>();
+ binary_op_return_type<vushort, uint, vushort>();
+ binary_op_return_type<vi16<ushort>, uchar, vi16<ushort>>();
+ binary_op_return_type<vi16<ushort>, ushort, vi16<ushort>>();
+ binary_op_return_type<vi16<ushort>, int, vi16<ushort>>();
+ binary_op_return_type<vi16<ushort>, uint, vi16<ushort>>();
+ binary_op_return_type<vi16<ushort>, vi16<uchar>>();
+ binary_op_return_type<vi16<ushort>, vi16<ushort>>();
+
+ binary_op_return_type<vi16<int>, vi16<ushort>>();
+ binary_op_return_type<vi16<long>, vi16<ushort>>();
+ binary_op_return_type<vi16<llong>, vi16<ushort>>();
+ binary_op_return_type<vi16<uint>, vi16<ushort>>();
+ binary_op_return_type<vi16<ulong>, vi16<ushort>>();
+ binary_op_return_type<vi16<ullong>, vi16<ushort>>();
+ binary_op_return_type<vi16<float>, vi16<ushort>>();
+ binary_op_return_type<vi16<double>, vi16<ushort>>();
+
+ VERIFY((is_substitution_failure<vi16<ushort>, llong>) );
+ VERIFY((is_substitution_failure<vi16<ushort>, ullong>) );
+ VERIFY((is_substitution_failure<vi16<ushort>, double>) );
+ VERIFY((is_substitution_failure<vushort, schar>) );
+ VERIFY((is_substitution_failure<vushort, short>) );
+ VERIFY((is_substitution_failure<vushort, vshort>) );
+ VERIFY((is_substitution_failure<vushort, long>) );
+ VERIFY((is_substitution_failure<vushort, ulong>) );
+ VERIFY((is_substitution_failure<vushort, llong>) );
+ VERIFY((is_substitution_failure<vushort, ullong>) );
+ VERIFY((is_substitution_failure<vushort, float>) );
+ VERIFY((is_substitution_failure<vushort, double>) );
+ VERIFY((is_substitution_failure<vushort, vi16<schar>>) );
+ VERIFY((is_substitution_failure<vushort, vi16<uchar>>) );
+ VERIFY((is_substitution_failure<vushort, vi16<short>>) );
+ VERIFY((is_substitution_failure<vushort, vi16<ushort>>) );
+ VERIFY((is_substitution_failure<vushort, vi16<int>>) );
+ VERIFY((is_substitution_failure<vushort, vi16<uint>>) );
+ VERIFY((is_substitution_failure<vushort, vi16<long>>) );
+ VERIFY((is_substitution_failure<vushort, vi16<ulong>>) );
+ VERIFY((is_substitution_failure<vushort, vi16<llong>>) );
+ VERIFY((is_substitution_failure<vushort, vi16<ullong>>) );
+ VERIFY((is_substitution_failure<vushort, vi16<float>>) );
+ VERIFY((is_substitution_failure<vushort, vi16<double>>) );
+ VERIFY((is_substitution_failure<vi16<ushort>, schar>) );
+ VERIFY((is_substitution_failure<vi16<ushort>, short>) );
+ VERIFY((is_substitution_failure<vi16<ushort>, vshort>) );
+ VERIFY((is_substitution_failure<vi16<ushort>, vushort>) );
+ VERIFY((is_substitution_failure<vi16<ushort>, long>) );
+ VERIFY((is_substitution_failure<vi16<ushort>, ulong>) );
+ VERIFY((is_substitution_failure<vi16<ushort>, float>) );
+ VERIFY((is_substitution_failure<vi16<ushort>, vi16<schar>>) );
+ VERIFY((is_substitution_failure<vi16<ushort>, vi16<short>>) );
+ }
+ else if constexpr (std::is_same_v<V, vchar>)
+ { //{{{2
+ binary_op_return_type<vi8<char>, char, vi8<char>>();
+ binary_op_return_type<vi8<char>, int, vi8<char>>();
+ binary_op_return_type<vi8<char>, vi8<char>, vi8<char>>();
+
+ if constexpr (vi8<schar>::size() <= simd_abi::max_fixed_size<short>)
+ {
+ COMPARE((is_substitution_failure<vi8<char>, vi8<short>>),
+ std::is_unsigned_v<char>);
+ COMPARE((is_substitution_failure<vi8<char>, vi8<int>>),
+ std::is_unsigned_v<char>);
+ COMPARE((is_substitution_failure<vi8<char>, vi8<long>>),
+ std::is_unsigned_v<char>);
+ COMPARE((is_substitution_failure<vi8<char>, vi8<llong>>),
+ std::is_unsigned_v<char>);
+ COMPARE((is_substitution_failure<vi8<char>, vi8<ushort>>),
+ std::is_signed_v<char>);
+ COMPARE((is_substitution_failure<vi8<char>, vi8<uint>>),
+ std::is_signed_v<char>);
+ COMPARE((is_substitution_failure<vi8<char>, vi8<ulong>>),
+ std::is_signed_v<char>);
+ COMPARE((is_substitution_failure<vi8<char>, vi8<ullong>>),
+ std::is_signed_v<char>);
+ if constexpr (std::is_signed_v<char>)
+ {
+ binary_op_return_type<vi8<short>, vi8<char>>();
+ binary_op_return_type<vi8<int>, vi8<char>>();
+ binary_op_return_type<vi8<long>, vi8<char>>();
+ binary_op_return_type<vi8<llong>, vi8<char>>();
+ }
+ else
+ {
+ binary_op_return_type<vi8<ushort>, vi8<char>>();
+ binary_op_return_type<vi8<uint>, vi8<char>>();
+ binary_op_return_type<vi8<ulong>, vi8<char>>();
+ binary_op_return_type<vi8<ullong>, vi8<char>>();
+ }
+ binary_op_return_type<vi8<float>, vi8<char>>();
+ binary_op_return_type<vi8<double>, vi8<char>>();
+ }
+
+ VERIFY((is_substitution_failure<vi8<char>, llong>) );
+ VERIFY((is_substitution_failure<vi8<char>, double>) );
+ VERIFY((is_substitution_failure<vchar, vxchar>) );
+ VERIFY((is_substitution_failure<vchar, xchar>) );
+ VERIFY((is_substitution_failure<vchar, short>) );
+ VERIFY((is_substitution_failure<vchar, ushort>) );
+ COMPARE((is_substitution_failure<vchar, uint>), std::is_signed_v<char>);
+ VERIFY((is_substitution_failure<vchar, long>) );
+ VERIFY((is_substitution_failure<vchar, ulong>) );
+ VERIFY((is_substitution_failure<vchar, llong>) );
+ VERIFY((is_substitution_failure<vchar, ullong>) );
+ VERIFY((is_substitution_failure<vchar, float>) );
+ VERIFY((is_substitution_failure<vchar, double>) );
+ VERIFY((is_substitution_failure<vchar, vi8<char>>) );
+ VERIFY((is_substitution_failure<vchar, vi8<uchar>>) );
+ VERIFY((is_substitution_failure<vchar, vi8<schar>>) );
+ VERIFY((is_substitution_failure<vchar, vi8<short>>) );
+ VERIFY((is_substitution_failure<vchar, vi8<ushort>>) );
+ VERIFY((is_substitution_failure<vchar, vi8<int>>) );
+ VERIFY((is_substitution_failure<vchar, vi8<uint>>) );
+ VERIFY((is_substitution_failure<vchar, vi8<long>>) );
+ VERIFY((is_substitution_failure<vchar, vi8<ulong>>) );
+ VERIFY((is_substitution_failure<vchar, vi8<llong>>) );
+ VERIFY((is_substitution_failure<vchar, vi8<ullong>>) );
+ VERIFY((is_substitution_failure<vchar, vi8<float>>) );
+ VERIFY((is_substitution_failure<vchar, vi8<double>>) );
+ VERIFY((is_substitution_failure<vi8<char>, vchar>) );
+ VERIFY((is_substitution_failure<vi8<char>, vuchar>) );
+ VERIFY((is_substitution_failure<vi8<char>, vschar>) );
+ VERIFY((is_substitution_failure<vi8<char>, xchar>) );
+ VERIFY((is_substitution_failure<vi8<char>, short>) );
+ VERIFY((is_substitution_failure<vi8<char>, ushort>) );
+ COMPARE((is_substitution_failure<vi8<char>, uint>),
+ std::is_signed_v<char>);
+ VERIFY((is_substitution_failure<vi8<char>, long>) );
+ VERIFY((is_substitution_failure<vi8<char>, ulong>) );
+ VERIFY((is_substitution_failure<vi8<char>, ullong>) );
+ VERIFY((is_substitution_failure<vi8<char>, float>) );
+
+ // conversion between any char types must fail because the dst type's
+ // integer conversion rank isn't greater (as required by 9.6.4p4.3)
+ VERIFY((is_substitution_failure<vi8<char>, vi8<schar>>) );
+ VERIFY((is_substitution_failure<vi8<char>, vi8<uchar>>) );
+ }
+ else if constexpr (std::is_same_v<V, vschar>)
+ { //{{{2
+ binary_op_return_type<vi8<schar>, schar, vi8<schar>>();
+ binary_op_return_type<vi8<schar>, int, vi8<schar>>();
+ binary_op_return_type<vi8<schar>, vi8<schar>, vi8<schar>>();
+
+ if constexpr (vi8<schar>::size() <= simd_abi::max_fixed_size<short>)
+ {
+ binary_op_return_type<vi8<short>, vi8<schar>>();
+ binary_op_return_type<vi8<int>, vi8<schar>>();
+ binary_op_return_type<vi8<long>, vi8<schar>>();
+ binary_op_return_type<vi8<llong>, vi8<schar>>();
+ binary_op_return_type<vi8<float>, vi8<schar>>();
+ binary_op_return_type<vi8<double>, vi8<schar>>();
+ }
+
+ VERIFY((is_substitution_failure<vi8<schar>, llong>) );
+ VERIFY((is_substitution_failure<vi8<schar>, double>) );
+ VERIFY((is_substitution_failure<vschar, vuchar>) );
+ VERIFY((is_substitution_failure<vschar, uchar>) );
+ VERIFY((is_substitution_failure<vschar, short>) );
+ VERIFY((is_substitution_failure<vschar, ushort>) );
+ VERIFY((is_substitution_failure<vschar, uint>) );
+ VERIFY((is_substitution_failure<vschar, long>) );
+ VERIFY((is_substitution_failure<vschar, ulong>) );
+ VERIFY((is_substitution_failure<vschar, llong>) );
+ VERIFY((is_substitution_failure<vschar, ullong>) );
+ VERIFY((is_substitution_failure<vschar, float>) );
+ VERIFY((is_substitution_failure<vschar, double>) );
+ VERIFY((is_substitution_failure<vschar, vi8<schar>>) );
+ VERIFY((is_substitution_failure<vschar, vi8<uchar>>) );
+ VERIFY((is_substitution_failure<vschar, vi8<short>>) );
+ VERIFY((is_substitution_failure<vschar, vi8<ushort>>) );
+ VERIFY((is_substitution_failure<vschar, vi8<int>>) );
+ VERIFY((is_substitution_failure<vschar, vi8<uint>>) );
+ VERIFY((is_substitution_failure<vschar, vi8<long>>) );
+ VERIFY((is_substitution_failure<vschar, vi8<ulong>>) );
+ VERIFY((is_substitution_failure<vschar, vi8<llong>>) );
+ VERIFY((is_substitution_failure<vschar, vi8<ullong>>) );
+ VERIFY((is_substitution_failure<vschar, vi8<float>>) );
+ VERIFY((is_substitution_failure<vschar, vi8<double>>) );
+ VERIFY((is_substitution_failure<vi8<schar>, vschar>) );
+ VERIFY((is_substitution_failure<vi8<schar>, vuchar>) );
+ VERIFY((is_substitution_failure<vi8<schar>, uchar>) );
+ VERIFY((is_substitution_failure<vi8<schar>, short>) );
+ VERIFY((is_substitution_failure<vi8<schar>, ushort>) );
+ VERIFY((is_substitution_failure<vi8<schar>, uint>) );
+ VERIFY((is_substitution_failure<vi8<schar>, long>) );
+ VERIFY((is_substitution_failure<vi8<schar>, ulong>) );
+ VERIFY((is_substitution_failure<vi8<schar>, ullong>) );
+ VERIFY((is_substitution_failure<vi8<schar>, float>) );
+ VERIFY((is_substitution_failure<vi8<schar>, vi8<uchar>>) );
+ VERIFY((is_substitution_failure<vi8<schar>, vi8<ushort>>) );
+ VERIFY((is_substitution_failure<vi8<schar>, vi8<uint>>) );
+ VERIFY((is_substitution_failure<vi8<schar>, vi8<ulong>>) );
+ VERIFY((is_substitution_failure<vi8<schar>, vi8<ullong>>) );
+ }
+ else if constexpr (std::is_same_v<V, vuchar>)
+ { //{{{2
+ VERIFY((is_substitution_failure<vi8<uchar>, llong>) );
+
+ binary_op_return_type<vuchar, uint, vuchar>();
+ binary_op_return_type<vi8<uchar>, uchar, vi8<uchar>>();
+ binary_op_return_type<vi8<uchar>, int, vi8<uchar>>();
+ binary_op_return_type<vi8<uchar>, uint, vi8<uchar>>();
+ binary_op_return_type<vi8<uchar>, vi8<uchar>, vi8<uchar>>();
+
+ if constexpr (vi8<schar>::size() <= simd_abi::max_fixed_size<short>)
+ {
+ binary_op_return_type<vi8<short>, vi8<uchar>>();
+ binary_op_return_type<vi8<ushort>, vi8<uchar>>();
+ binary_op_return_type<vi8<int>, vi8<uchar>>();
+ binary_op_return_type<vi8<uint>, vi8<uchar>>();
+ binary_op_return_type<vi8<long>, vi8<uchar>>();
+ binary_op_return_type<vi8<ulong>, vi8<uchar>>();
+ binary_op_return_type<vi8<llong>, vi8<uchar>>();
+ binary_op_return_type<vi8<ullong>, vi8<uchar>>();
+ binary_op_return_type<vi8<float>, vi8<uchar>>();
+ binary_op_return_type<vi8<double>, vi8<uchar>>();
+ }
+
+ VERIFY((is_substitution_failure<vi8<uchar>, ullong>) );
+ VERIFY((is_substitution_failure<vi8<uchar>, double>) );
+ VERIFY((is_substitution_failure<vuchar, schar>) );
+ VERIFY((is_substitution_failure<vuchar, vschar>) );
+ VERIFY((is_substitution_failure<vuchar, short>) );
+ VERIFY((is_substitution_failure<vuchar, ushort>) );
+ VERIFY((is_substitution_failure<vuchar, long>) );
+ VERIFY((is_substitution_failure<vuchar, ulong>) );
+ VERIFY((is_substitution_failure<vuchar, llong>) );
+ VERIFY((is_substitution_failure<vuchar, ullong>) );
+ VERIFY((is_substitution_failure<vuchar, float>) );
+ VERIFY((is_substitution_failure<vuchar, double>) );
+ VERIFY((is_substitution_failure<vuchar, vi8<schar>>) );
+ VERIFY((is_substitution_failure<vuchar, vi8<uchar>>) );
+ VERIFY((is_substitution_failure<vuchar, vi8<short>>) );
+ VERIFY((is_substitution_failure<vuchar, vi8<ushort>>) );
+ VERIFY((is_substitution_failure<vuchar, vi8<int>>) );
+ VERIFY((is_substitution_failure<vuchar, vi8<uint>>) );
+ VERIFY((is_substitution_failure<vuchar, vi8<long>>) );
+ VERIFY((is_substitution_failure<vuchar, vi8<ulong>>) );
+ VERIFY((is_substitution_failure<vuchar, vi8<llong>>) );
+ VERIFY((is_substitution_failure<vuchar, vi8<ullong>>) );
+ VERIFY((is_substitution_failure<vuchar, vi8<float>>) );
+ VERIFY((is_substitution_failure<vuchar, vi8<double>>) );
+ VERIFY((is_substitution_failure<vi8<uchar>, schar>) );
+ VERIFY((is_substitution_failure<vi8<uchar>, vschar>) );
+ VERIFY((is_substitution_failure<vi8<uchar>, vuchar>) );
+ VERIFY((is_substitution_failure<vi8<uchar>, short>) );
+ VERIFY((is_substitution_failure<vi8<uchar>, ushort>) );
+ VERIFY((is_substitution_failure<vi8<uchar>, long>) );
+ VERIFY((is_substitution_failure<vi8<uchar>, ulong>) );
+ VERIFY((is_substitution_failure<vi8<uchar>, float>) );
+ VERIFY((is_substitution_failure<vi8<uchar>, vi8<schar>>) );
+ } //}}}2
+}
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/operators.h b/libstdc++-v3/testsuite/experimental/simd/tests/operators.h
new file mode 100644
index 00000000000..2388bcfd166
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/operators.h
@@ -0,0 +1,285 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3. If not see
+// <http://www.gnu.org/licenses/>.
+
+#include "bits/verify.h"
+#include "bits/make_vec.h"
+
+// operators helpers //{{{1
+template <class T>
+constexpr T
+genHalfBits()
+{
+ return std::numeric_limits<T>::max() >> (std::numeric_limits<T>::digits / 2);
+}
+template <>
+constexpr long double
+genHalfBits<long double>()
+{
+ return 0;
+}
+template <>
+constexpr double
+genHalfBits<double>()
+{
+ return 0;
+}
+template <>
+constexpr float
+genHalfBits<float>()
+{
+ return 0;
+}
+
+template <typename V>
+void
+test()
+{
+ using M = typename V::mask_type;
+ using T = typename V::value_type;
+ constexpr auto min = std::numeric_limits<T>::min();
+ constexpr auto max = std::numeric_limits<T>::max();
+ { // compares{{{2
+ COMPARE(V(0) == make_vec<V>({0, 1}, 0), make_mask<M>({1, 0}));
+ COMPARE(V(0) == make_vec<V>({0, 1, 2}, 0), make_mask<M>({1, 0, 0}));
+ COMPARE(V(1) == make_vec<V>({0, 1, 2}, 0), make_mask<M>({0, 1, 0}));
+ COMPARE(V(2) == make_vec<V>({0, 1, 2}, 0), make_mask<M>({0, 0, 1}));
+ COMPARE(V(0) < make_vec<V>({0, 1, 2}, 0), make_mask<M>({0, 1, 1}));
+
+ constexpr T half = genHalfBits<T>();
+ for (T lo_ : {min, T(min + 1), T(-1), T(0), T(1), T(half - 1), half,
+ T(half + 1), T(max - 1)})
+ {
+ for (T hi_ : {T(min + 1), T(-1), T(0), T(1), T(half - 1), half,
+ T(half + 1), T(max - 1), max})
+ {
+ if (hi_ <= lo_)
+ {
+ continue;
+ }
+ for (std::size_t pos = 0; pos < V::size(); ++pos)
+ {
+ V lo = lo_;
+ V hi = hi_;
+ lo[pos] = 0; // have a different value in the vector in case
+ hi[pos] = 1; // this affects neighbors
+ COMPARE(hi, hi);
+ VERIFY(all_of(hi != lo)) << "hi: " << hi << ", lo: " << lo;
+ VERIFY(all_of(lo != hi)) << "hi: " << hi << ", lo: " << lo;
+ VERIFY(none_of(hi != hi)) << "hi: " << hi << ", lo: " << lo;
+ VERIFY(none_of(hi == lo)) << "hi: " << hi << ", lo: " << lo;
+ VERIFY(none_of(lo == hi)) << "hi: " << hi << ", lo: " << lo;
+ VERIFY(all_of(lo < hi)) << "hi: " << hi << ", lo: " << lo
+ << ", lo < hi: " << (lo < hi);
+ VERIFY(none_of(hi < lo)) << "hi: " << hi << ", lo: " << lo;
+ VERIFY(none_of(hi <= lo)) << "hi: " << hi << ", lo: " << lo;
+ VERIFY(all_of(hi <= hi)) << "hi: " << hi << ", lo: " << lo;
+ VERIFY(all_of(hi > lo)) << "hi: " << hi << ", lo: " << lo;
+ VERIFY(none_of(lo > hi)) << "hi: " << hi << ", lo: " << lo;
+ VERIFY(all_of(hi >= lo)) << "hi: " << hi << ", lo: " << lo;
+ VERIFY(all_of(hi >= hi)) << "hi: " << hi << ", lo: " << lo;
+ }
+ }
+ }
+ }
+ { // subscripting{{{2
+ V x = max;
+ for (std::size_t i = 0; i < V::size(); ++i)
+ {
+ COMPARE(x[i], max);
+ x[i] = 0;
+ }
+ COMPARE(x, V{0});
+ for (std::size_t i = 0; i < V::size(); ++i)
+ {
+ COMPARE(x[i], T(0));
+ x[i] = max;
+ }
+ COMPARE(x, V{max});
+ COMPARE(typeid(x[0] * x[0]), typeid(T() * T()));
+ COMPARE(typeid(x[0] * T()), typeid(T() * T()));
+ COMPARE(typeid(T() * x[0]), typeid(T() * T()));
+ COMPARE(typeid(x * x[0]), typeid(x));
+ COMPARE(typeid(x[0] * x), typeid(x));
+
+ x = V([](auto i) -> T { return i; });
+ for (std::size_t i = 0; i < V::size(); ++i)
+ {
+ COMPARE(x[i], T(i));
+ }
+ for (std::size_t i = 0; i + 1 < V::size(); i += 2)
+ {
+ using std::swap;
+ swap(x[i], x[i + 1]);
+ }
+ for (std::size_t i = 0; i + 1 < V::size(); i += 2)
+ {
+ COMPARE(x[i], T(i + 1)) << x;
+ COMPARE(x[i + 1], T(i)) << x;
+ }
+ x = 1;
+ V y = 0;
+ COMPARE(x[0], T(1));
+ x[0] = y[0]; // make sure non-const smart_reference assignment works
+ COMPARE(x[0], T(0));
+ x = 1;
+ x[0] = x[0]; // self-assignment on smart_reference
+ COMPARE(x[0], T(1));
+
+ std::experimental::simd<typename V::value_type,
+ std::experimental::simd_abi::scalar>
+ z = 2;
+ x[0] = z[0];
+ COMPARE(x[0], T(2));
+ x = 3;
+ z[0] = x[0];
+ COMPARE(z[0], T(3));
+
+ // TODO: check that only value-preserving conversions happen on subscript
+ // assignment
+ }
+ { // not{{{2
+ V x = 0;
+ COMPARE(!x, M{true});
+ V y = 1;
+ COMPARE(!y, M{false});
+ }
+
+ { // unary minus{{{2
+ V x = 0;
+ COMPARE(-x, V(T(-T(0))));
+ V y = 1;
+ COMPARE(-y, V(T(-T(1))));
+ }
+
+ { // plus{{{2
+ V x = 0;
+ V y = 0;
+ COMPARE(x + y, x);
+ COMPARE(x = x + T(1), V(1));
+ COMPARE(x + x, V(2));
+ y = make_vec<V>({1, 2, 3, 4, 5, 6, 7});
+ COMPARE(x = x + y, make_vec<V>({2, 3, 4, 5, 6, 7, 8}));
+ COMPARE(x = x + -y, V(1));
+ COMPARE(x += y, make_vec<V>({2, 3, 4, 5, 6, 7, 8}));
+ COMPARE(x, make_vec<V>({2, 3, 4, 5, 6, 7, 8}));
+ COMPARE(x += -y, V(1));
+ COMPARE(x, V(1));
+ }
+
+ { // minus{{{2
+ V x = 1;
+ V y = 0;
+ COMPARE(x - y, x);
+ COMPARE(x - T(1), y);
+ COMPARE(y, x - T(1));
+ COMPARE(x - x, y);
+ y = make_vec<V>({1, 2, 3, 4, 5, 6, 7});
+ COMPARE(x = y - x, make_vec<V>({0, 1, 2, 3, 4, 5, 6}));
+ COMPARE(x = y - x, V(1));
+ COMPARE(y -= x, make_vec<V>({0, 1, 2, 3, 4, 5, 6}));
+ COMPARE(y, make_vec<V>({0, 1, 2, 3, 4, 5, 6}));
+ COMPARE(y -= y, V(0));
+ COMPARE(y, V(0));
+ }
+
+ { // multiplies{{{2
+ V x = 1;
+ V y = 0;
+ COMPARE(x * y, y);
+ COMPARE(x = x * T(2), V(2));
+ COMPARE(x * x, V(4));
+ y = make_vec<V>({1, 2, 3, 4, 5, 6, 7});
+ COMPARE(x = x * y, make_vec<V>({2, 4, 6, 8, 10, 12, 14}));
+ y = 2;
+ for (T n :
+ {T(std::numeric_limits<T>::max() - 1), std::numeric_limits<T>::min()})
+ {
+ x = n / 2;
+ COMPARE(x * y, V(n));
+ }
+ if (std::is_integral<T>::value && std::is_unsigned<T>::value)
+ {
+ // test modular arithmetic
+ T n = std::numeric_limits<T>::max();
+ x = n;
+ for (T m : {T(2), T(7), T(std::numeric_limits<T>::max() / 127),
+ std::numeric_limits<T>::max()})
+ {
+ y = m;
+ // if T is of lower rank than int, `n * m` will promote to int
+ // before executing the multiplication. In this case an overflow
+ // will be UB (and ubsan will warn about it). The solution is to
+ // cast to unsigned in that case.
+ using U
+ = std::conditional_t<(sizeof(T) < sizeof(int)), unsigned, T>;
+ COMPARE(x * y, V(T(U(n) * U(m))));
+ }
+ }
+ x = 2;
+ COMPARE(x *= make_vec<V>({1, 2, 3}), make_vec<V>({2, 4, 6}));
+ COMPARE(x, make_vec<V>({2, 4, 6}));
+ }
+
+ { // divides{{{2
+ V x = 2;
+ COMPARE(x / x, V(1));
+ COMPARE(T(3) / x, V(T(3) / T(2)));
+ COMPARE(x / T(3), V(T(2) / T(3)));
+ V y = make_vec<V>({1, 2, 3, 4, 5, 6, 7});
+ COMPARE(y / x,
+ make_vec<V>({T(.5), T(1), T(1.5), T(2), T(2.5), T(3), T(3.5)}));
+
+ y = make_vec<V>(
+ {std::numeric_limits<T>::max(), std::numeric_limits<T>::min()});
+ V ref = make_vec<V>({T(std::numeric_limits<T>::max() / 2),
+ T(std::numeric_limits<T>::min() / 2)});
+ COMPARE(y / x, ref);
+
+ y = make_vec<V>(
+ {std::numeric_limits<T>::min(), std::numeric_limits<T>::max()});
+ ref = make_vec<V>({T(std::numeric_limits<T>::min() / 2),
+ T(std::numeric_limits<T>::max() / 2)});
+ COMPARE(y / x, ref);
+
+ y = make_vec<V>(
+ {std::numeric_limits<T>::max(), T(std::numeric_limits<T>::min() + 1)});
+ COMPARE(y / y, V(1));
+
+ ref = make_vec<V>({T(2 / std::numeric_limits<T>::max()),
+ T(2 / (std::numeric_limits<T>::min() + 1))});
+ COMPARE(x / y, ref);
+ COMPARE(x /= y, ref);
+ COMPARE(x, ref);
+ }
+
+ { // increment & decrement {{{2
+ const V from0 = make_vec<V>({0, 1, 2, 3}, 4);
+ V x = from0;
+ COMPARE(x++, from0);
+ COMPARE(x, from0 + 1);
+ COMPARE(++x, from0 + 2);
+ COMPARE(x, from0 + 2);
+
+ COMPARE(x--, from0 + 2);
+ COMPARE(x, from0 + 1);
+ COMPARE(--x, from0);
+ COMPARE(x, from0);
+ }
+ // }}}2
+}
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/reductions.h b/libstdc++-v3/testsuite/experimental/simd/tests/reductions.h
new file mode 100644
index 00000000000..e367b692201
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/reductions.h
@@ -0,0 +1,82 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include <random>
+
+static std::mt19937 g_mt_gen{0};
+template <typename V>
+void
+test()
+{
+ using T = typename V::value_type;
+ COMPARE(reduce(V(1)), T(V::size()));
+ {
+ V x = 1;
+ COMPARE(reduce(x, std::multiplies<>()), T(1));
+ x[0] = 2;
+ COMPARE(reduce(x, std::multiplies<>()), T(2));
+ if constexpr (V::size() > 1)
+ {
+ x[V::size() - 1] = 3;
+ COMPARE(reduce(x, std::multiplies<>()), T(6));
+ }
+ }
+ COMPARE(reduce(V([](int i) { return i & 1; })), T(V::size() / 2));
+ COMPARE(reduce(V([](int i) { return i % 3; })),
+ T(3 * (V::size() / 3) // 0+1+2 for every complete 3 elements in V
+ + (V::size() % 3) / 2 // 0->0, 1->0, 2->1 adjustment
+ ));
+ if ((1 + V::size()) * V::size() / 2 <= std::numeric_limits<T>::max())
+ {
+ COMPARE(reduce(V([](int i) { return i + 1; })),
+ T((1 + V::size()) * V::size() / 2));
+ }
+
+ {
+ const V y = 2;
+ COMPARE(reduce(y), T(2 * V::size()));
+ COMPARE(reduce(where(y > 2, y)), T(0));
+ COMPARE(reduce(where(y == 2, y)), T(2 * V::size()));
+ }
+
+ {
+ const V z([](T i) { return i + 1; });
+ COMPARE(std::experimental::reduce(z,
+ [](auto a, auto b) {
+ using std::min;
+ return min(a, b);
+ }),
+ T(1))
+ << "z: " << z;
+ COMPARE(std::experimental::reduce(z,
+ [](auto a, auto b) {
+ using std::max;
+ return max(a, b);
+ }),
+ T(V::size()))
+ << "z: " << z;
+ COMPARE(std::experimental::reduce(where(z > 1, z), 117,
+ [](auto a, auto b) {
+ using std::min;
+ return min(a, b);
+ }),
+ T(V::size() == 1 ? 117 : 2))
+ << "z: " << z;
+ }
+
+ {
+ std::conditional_t<std::is_floating_point_v<T>,
+ std::uniform_real_distribution<T>,
+ std::uniform_int_distribution<T>>
+ dist(std::numeric_limits<T>::lowest(), std::numeric_limits<T>::max());
+ for (int repeat = 0; repeat < 100; ++repeat)
+ {
+ const V x([&](int) { return dist(g_mt_gen); });
+ T acc = x[0];
+ for (size_t i = 1; i < V::size(); ++i)
+ acc += x[i];
+ FUZZY_COMPARE(reduce(x), acc);
+ }
+ }
+}
+
+// vim: foldmethod=marker
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/remqo.h b/libstdc++-v3/testsuite/experimental/simd/tests/remqo.h
new file mode 100644
index 00000000000..5cac268808d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/remqo.h
@@ -0,0 +1,48 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+ vir::test::setFuzzyness<float>(0);
+ vir::test::setFuzzyness<double>(0);
+
+ using limits = std::numeric_limits<typename V::value_type>;
+ test_values_2arg<V>(
+ {
+#ifdef __STDC_IEC_559__
+ limits::quiet_NaN(), limits::infinity(), -limits::infinity(),
+ limits::denorm_min(), limits::min() / 3, -0.,
+#endif
+ +0., limits::min(), limits::max()},
+ {10000, -limits::max() / 2, limits::max() / 2}, [](const V a, const V b) {
+ using IV = std::experimental::fixed_size_simd<int, V::size()>;
+ IV quo = {}; // the type is wrong, this should fail
+ const V totest = remquo(a, b, &quo);
+ auto&& expected
+ = [&](const auto& v, const auto& w) -> std::pair<const V, const IV> {
+ std::pair<V, IV> tmp = {};
+ using std::remquo;
+ for (std::size_t i = 0; i < V::size(); ++i)
+ {
+ int tmp2;
+ tmp.first[i] = remquo(v[i], w[i], &tmp2);
+ tmp.second[i] = tmp2;
+ }
+ return tmp;
+ };
+ const auto expect1 = expected(a, b);
+ COMPARE(isnan(totest), isnan(expect1.first))
+ << "remquo(" << a << ", " << b << ", quo) = " << totest
+ << " != " << expect1.first;
+ const V clean_a = iif(isnan(totest), 0, a);
+ const V clean_b = iif(isnan(totest), 1, b);
+ const auto expect2 = expected(clean_a, clean_b);
+ COMPARE(remquo(clean_a, clean_b, &quo), expect2.first)
+ << "\nclean_a/b = " << clean_a << ", " << clean_b;
+ COMPARE(quo, expect2.second);
+ });
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/simd.h b/libstdc++-v3/testsuite/experimental/simd/tests/simd.h
new file mode 100644
index 00000000000..87b6815f832
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/simd.h
@@ -0,0 +1,22 @@
+#include "bits/verify.h"
+
+template <typename V>
+void
+test()
+{
+ using T = typename V::value_type;
+
+ // V must store V::size() values of type T, giving us the lower bound on
+ // sizeof(V)
+ VERIFY(sizeof(V) >= sizeof(T) * V::size());
+
+ // V should not pad beyond the next power of 2 of V::size() values of
+ // type T, giving us the upper bound on sizeof(V)
+ auto n = V::size();
+ n = ((n << 1) & ~n) & ~((n >> 1) | (n >> 3));
+ while (n & (n - 1))
+ {
+ n &= n - 1;
+ }
+ VERIFY(sizeof(V) <= sizeof(T) * n);
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/sincos.h b/libstdc++-v3/testsuite/experimental/simd/tests/sincos.h
new file mode 100644
index 00000000000..87b7a505d51
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/sincos.h
@@ -0,0 +1,31 @@
+// test only floattypes
+// { dg-additional-files "reference-sincos-sp.dat" }
+// { dg-additional-files "reference-sincos-ep.dat" }
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/mathreference.h"
+#include "bits/simd_view.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+ using std::cos;
+ using std::sin;
+ using T = typename V::value_type;
+
+ vir::test::setFuzzyness<float>(2);
+ vir::test::setFuzzyness<double>(1);
+
+ const auto& testdata = referenceData<function::sincos, T>();
+ std::experimental::experimental::simd_view<V>(testdata).for_each(
+ [&](const V input, const V expected_sin, const V expected_cos) {
+ FUZZY_COMPARE(sin(input), expected_sin) << " input = " << input;
+ FUZZY_COMPARE(sin(-input), -expected_sin) << " input = " << input;
+ FUZZY_COMPARE(cos(input), expected_cos) << " input = " << input;
+ FUZZY_COMPARE(cos(-input), expected_cos) << " input = " << input;
+ });
+}
+
+// vim: sw=2 sts=2 noet ts=8
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/split_concat.h b/libstdc++-v3/testsuite/experimental/simd/tests/split_concat.h
new file mode 100644
index 00000000000..ea15c7ff1f9
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/split_concat.h
@@ -0,0 +1,168 @@
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/conversions.h"
+
+using std::experimental::simd_cast;
+
+template <typename V, bool ConstProp, typename F>
+auto
+gen(const F& fun)
+{
+ if constexpr (ConstProp)
+ return V(fun);
+ else
+ return make_value_unknown(V(fun));
+}
+
+template <typename V, bool ConstProp>
+void
+split_concat()
+{
+ using T = typename V::value_type;
+ if constexpr (V::size() * 3 <= std::experimental::simd_abi::max_fixed_size<T>)
+ {
+ V a(0), b(1), c(2);
+ auto x = concat(a, b, c);
+ COMPARE(x.size(), a.size() * 3);
+ std::size_t i = 0;
+ for (; i < a.size(); ++i)
+ {
+ COMPARE(x[i], T(0));
+ }
+ for (; i < 2 * a.size(); ++i)
+ {
+ COMPARE(x[i], T(1));
+ }
+ for (; i < 3 * a.size(); ++i)
+ {
+ COMPARE(x[i], T(2));
+ }
+ }
+
+ if constexpr (V::size() >= 4)
+ {
+ const V a = gen<V, ConstProp>([](auto i) -> T { return i; });
+ constexpr auto N0 = V::size() / 4u;
+ constexpr auto N1 = V::size() - 2 * N0;
+ using V0
+ = std::experimental::simd<T,
+ std::experimental::simd_abi::deduce_t<T, N0>>;
+ using V1
+ = std::experimental::simd<T,
+ std::experimental::simd_abi::deduce_t<T, N1>>;
+ {
+ auto x = std::experimental::split<N0, N0, N1>(a);
+ COMPARE(std::tuple_size<decltype(x)>::value, 3u);
+ COMPARE(std::get<0>(x), V0([](auto i) -> T { return i; }));
+ COMPARE(std::get<1>(x), V0([](auto i) -> T { return i + N0; }));
+ COMPARE(std::get<2>(x), V1([](auto i) -> T { return i + 2 * N0; }));
+ auto b = concat(std::get<1>(x), std::get<2>(x), std::get<0>(x));
+ // a and b may have different types: if a was fixed_size<N> and
+ // another ABI tag exists with equal N, then b will have the
+ // non-fixed-size ABI tag.
+ COMPARE(a.size(), b.size());
+ COMPARE(b,
+ decltype(b)([](auto i) -> T { return (N0 + i) % V::size(); }));
+ }
+ {
+ auto x = std::experimental::split<N0, N1, N0>(a);
+ COMPARE(std::tuple_size<decltype(x)>::value, 3u);
+ COMPARE(std::get<0>(x), V0([](auto i) -> T { return i; }));
+ COMPARE(std::get<1>(x), V1([](auto i) -> T { return i + N0; }));
+ COMPARE(std::get<2>(x), V0([](auto i) -> T { return i + N0 + N1; }));
+ auto b = concat(std::get<1>(x), std::get<2>(x), std::get<0>(x));
+ // a and b may have different types: if a was fixed_size<N> and
+ // another ABI tag exists with equal N, then b will have the
+ // non-fixed-size ABI tag.
+ COMPARE(a.size(), b.size());
+ COMPARE(b,
+ decltype(b)([](auto i) -> T { return (N0 + i) % V::size(); }));
+ }
+ {
+ auto x = std::experimental::split<N1, N0, N0>(a);
+ COMPARE(std::tuple_size<decltype(x)>::value, 3u);
+ COMPARE(std::get<0>(x), V1([](auto i) -> T { return i; }));
+ COMPARE(std::get<1>(x), V0([](auto i) -> T { return i + N1; }));
+ COMPARE(std::get<2>(x), V0([](auto i) -> T { return i + N0 + N1; }));
+ auto b = concat(std::get<1>(x), std::get<2>(x), std::get<0>(x));
+ // a and b may have different types: if a was fixed_size<N> and
+ // another ABI tag exists with equal N, then b will have the
+ // non-fixed-size ABI tag.
+ COMPARE(a.size(), b.size());
+ COMPARE(b,
+ decltype(b)([](auto i) -> T { return (N1 + i) % V::size(); }));
+ }
+ }
+
+ if constexpr (V::size() % 3 == 0)
+ {
+ const V a = gen<V, ConstProp>([](auto i) -> T { return i; });
+ constexpr auto N0 = V::size() / 3;
+ using V0
+ = std::experimental::simd<T,
+ std::experimental::simd_abi::deduce_t<T, N0>>;
+ using V1 = std::experimental::simd<
+ T, std::experimental::simd_abi::deduce_t<T, 2 * N0>>;
+ {
+ auto [x, y, z] = std::experimental::split<N0, N0, N0>(a);
+ COMPARE(x, V0([](auto i) -> T { return i; }));
+ COMPARE(y, V0([](auto i) -> T { return i + N0; }));
+ COMPARE(z, V0([](auto i) -> T { return i + N0 * 2; }));
+ auto b = concat(x, y, z);
+ COMPARE(a.size(), b.size());
+ COMPARE(b, simd_cast<decltype(b)>(a));
+ COMPARE(simd_cast<V>(b), a);
+ }
+ {
+ auto [x, y] = std::experimental::split<N0, 2 * N0>(a);
+ COMPARE(x, V0([](auto i) -> T { return i; }));
+ COMPARE(y, V1([](auto i) -> T { return i + N0; }));
+ auto b = concat(x, y);
+ COMPARE(a.size(), b.size());
+ COMPARE(b, simd_cast<decltype(b)>(a));
+ COMPARE(simd_cast<V>(b), a);
+ }
+ {
+ auto [x, y] = std::experimental::split<2 * N0, N0>(a);
+ COMPARE(x, V1([](auto i) -> T { return i; }));
+ COMPARE(y, V0([](auto i) -> T { return i + 2 * N0; }));
+ auto b = concat(x, y);
+ COMPARE(a.size(), b.size());
+ COMPARE(b, simd_cast<decltype(b)>(a));
+ COMPARE(simd_cast<V>(b), a);
+ }
+ }
+
+ if constexpr ((V::size() & 1) == 0)
+ {
+ using std::experimental::simd;
+ using std::experimental::simd_abi::deduce_t;
+ using V0 = simd<T, deduce_t<T, V::size()>>;
+ using V2 = simd<T, deduce_t<T, 2>>;
+ using V3 = simd<T, deduce_t<T, V::size() / 2>>;
+
+ const V a = gen<V, ConstProp>([](auto i) -> T { return i; });
+
+ std::array<V2, V::size() / 2> v2s = std::experimental::split<V2>(a);
+ int offset = 0;
+ for (V2 test : v2s)
+ {
+ COMPARE(test, V2([&](auto i) -> T { return i + offset; }));
+ offset += 2;
+ }
+ COMPARE(concat(v2s), simd_cast<V0>(a));
+
+ std::array<V3, 2> v3s = std::experimental::split<V3>(a);
+ COMPARE(v3s[0], V3([](auto i) -> T { return i; }));
+ COMPARE(v3s[1], V3([](auto i) -> T { return i + V3::size(); }));
+ COMPARE(concat(v3s), simd_cast<V0>(a));
+ }
+}
+
+template <typename V>
+void
+test()
+{
+ split_concat<V, true>();
+ split_concat<V, false>();
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/splits.h b/libstdc++-v3/testsuite/experimental/simd/tests/splits.h
new file mode 100644
index 00000000000..2b8c03bbcdc
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/splits.h
@@ -0,0 +1,21 @@
+#include "bits/verify.h"
+
+template <typename V>
+void
+test()
+{
+ using M = typename V::mask_type;
+ using namespace std::experimental::parallelism_v2;
+ using T = typename V::value_type;
+ if constexpr (V::size() / simd_size_v<T> * simd_size_v<T> == V::size())
+ {
+ M k(true);
+ VERIFY(all_of(k)) << k;
+ const auto parts = split<simd_mask<T>>(k);
+ for (auto k2 : parts)
+ {
+ VERIFY(all_of(k2)) << k2;
+ COMPARE(typeid(k2), typeid(simd_mask<T>));
+ }
+ }
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/trigonometric.h b/libstdc++-v3/testsuite/experimental/simd/tests/trigonometric.h
new file mode 100644
index 00000000000..b137a6bba49
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/trigonometric.h
@@ -0,0 +1,25 @@
+// test only floattypes
+#include "bits/verify.h"
+#include "bits/metahelpers.h"
+#include "bits/test_values.h"
+
+template <typename V>
+void
+test()
+{
+ vir::test::setFuzzyness<float>(1);
+ vir::test::setFuzzyness<double>(1);
+
+ using limits = std::numeric_limits<typename V::value_type>;
+ test_values<V>(
+ {
+#ifdef __STDC_IEC_559__
+ limits::quiet_NaN(), limits::infinity(), -limits::infinity(), -0.,
+ limits::denorm_min(), limits::min() / 3,
+#endif
+ +0., limits::min(), limits::max()},
+ {10000, -limits::max() / 2, limits::max() / 2}, MAKE_TESTER(acos),
+ MAKE_TESTER(tan), MAKE_TESTER(acosh), MAKE_TESTER(asinh),
+ MAKE_TESTER(atanh), MAKE_TESTER(cosh), MAKE_TESTER(sinh),
+ MAKE_TESTER(tanh));
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/trunc_ceil_floor.h b/libstdc++-v3/testsuite/experimental/simd/tests/trunc_ceil_floor.h
new file mode 100644
index 00000000000..4a65606c6e5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/trunc_ceil_floor.h
@@ -0,0 +1,88 @@
+// test only floattypes
+#include "bits/test_values.h"
+#include "bits/verify.h"
+
+template <typename V>
+void
+test()
+{
+ using limits = std::numeric_limits<typename V::value_type>;
+ test_values<V>(
+ {2.1,
+ 2.0,
+ 2.9,
+ 2.5,
+ 2.499,
+ 1.5,
+ 1.499,
+ 1.99,
+ 0.99,
+ 0.5,
+ 0.499,
+ 0.,
+ -2.1,
+ -2.0,
+ -2.9,
+ -2.5,
+ -2.499,
+ -1.5,
+ -1.499,
+ -1.99,
+ -0.99,
+ -0.5,
+ -0.499,
+ 3 << 21,
+ 3 << 22,
+ 3 << 23,
+ -(3 << 21),
+ -(3 << 22),
+ -(3 << 23),
+#ifdef __STDC_IEC_559__
+ -0.,
+ limits::infinity(),
+ -limits::infinity(),
+ limits::denorm_min(),
+ limits::min() * 0.9,
+ -limits::denorm_min(),
+ -limits::min() * 0.9,
+#endif
+ limits::max(),
+ limits::min(),
+ limits::lowest(),
+ -limits::max(),
+ -limits::min(),
+ -limits::lowest()},
+ [](const V input) {
+ const V expected([&](auto i) { return std::trunc(input[i]); });
+ COMPARE(trunc(input), expected) << input;
+ },
+ [](const V input) {
+ const V expected([&](auto i) { return std::ceil(input[i]); });
+ COMPARE(ceil(input), expected) << input;
+ },
+ [](const V input) {
+ const V expected([&](auto i) { return std::floor(input[i]); });
+ COMPARE(floor(input), expected) << input;
+ });
+
+#ifdef __STDC_IEC_559__
+ test_values<V>(
+ {
+#ifdef __SUPPORT_SNAN__
+ limits::signaling_NaN(),
+#endif
+ limits::quiet_NaN()},
+ [](const V input) {
+ const V expected([&](auto i) { return std::trunc(input[i]); });
+ COMPARE(isnan(trunc(input)), isnan(expected)) << input;
+ },
+ [](const V input) {
+ const V expected([&](auto i) { return std::ceil(input[i]); });
+ COMPARE(isnan(ceil(input)), isnan(expected)) << input;
+ },
+ [](const V input) {
+ const V expected([&](auto i) { return std::floor(input[i]); });
+ COMPARE(isnan(floor(input)), isnan(expected)) << input;
+ });
+#endif
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/where.h b/libstdc++-v3/testsuite/experimental/simd/tests/where.h
new file mode 100644
index 00000000000..c502fa9a89e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/where.h
@@ -0,0 +1,108 @@
+#include "bits/verify.h"
+#include "bits/make_vec.h"
+#include "bits/metahelpers.h"
+
+template <class V> struct Convertible
+{
+ operator V() const { return V(4); }
+};
+
+template <class M, class T>
+constexpr bool
+where_is_ill_formed_impl(M, const T&, float)
+{
+ return true;
+}
+template <class M, class T>
+constexpr auto
+where_is_ill_formed_impl(M m, const T& v, int)
+ -> std::conditional_t<true, bool, decltype(std::experimental::where(m, v))>
+{
+ return false;
+}
+
+template <class M, class T>
+constexpr bool
+where_is_ill_formed(M m, const T& v)
+{
+ return where_is_ill_formed_impl(m, v, int());
+}
+
+template <typename T>
+void
+where_fundamental()
+{
+ using std::experimental::where;
+ T x = T();
+ where(true, x) = x + 1;
+ COMPARE(x, T(1));
+ where(false, x) = x - 1;
+ COMPARE(x, T(1));
+ where(true, x) += T(1);
+ COMPARE(x, T(2));
+}
+
+template <typename V>
+void
+test()
+{
+ using M = typename V::mask_type;
+ using T = typename V::value_type;
+ where_fundamental<T>();
+ VERIFY(!(sfinae_is_callable<V>(
+ [](auto x) -> decltype(where(true, x))* { return nullptr; })));
+
+ const V indexes([](int i) { return i + 1; });
+ const M alternating_mask = make_mask<M>({true, false});
+ V x = 0;
+ where(alternating_mask, x) = indexes;
+ COMPARE(alternating_mask, x == indexes);
+
+ where(!alternating_mask, x) = T(2);
+ COMPARE(!alternating_mask, x == T(2)) << x;
+
+ where(!alternating_mask, x) = Convertible<V>();
+ COMPARE(!alternating_mask, x == T(4));
+
+ x = 0;
+ COMPARE(x, T(0));
+ where(alternating_mask, x) += indexes;
+ COMPARE(alternating_mask, x == indexes);
+
+ x = 10;
+ COMPARE(x, T(10));
+ where(!alternating_mask, x) += T(1);
+ COMPARE(!alternating_mask, x == T(11));
+ where(alternating_mask, x) -= Convertible<V>();
+ COMPARE(alternating_mask, x == T(6));
+ where(alternating_mask, x) /= T(2);
+ COMPARE(alternating_mask, x == T(3)) << x;
+ where(alternating_mask, x) *= T(3);
+ COMPARE(alternating_mask, x == T(9));
+ COMPARE(!alternating_mask, x == T(11));
+
+ x = 10;
+ where(alternating_mask, x)++;
+ COMPARE(alternating_mask, x == T(11));
+ ++where(alternating_mask, x);
+ COMPARE(alternating_mask, x == T(12));
+ where(alternating_mask, x)--;
+ COMPARE(alternating_mask, x == T(11));
+ --where(alternating_mask, x);
+ --where(alternating_mask, x);
+ COMPARE(alternating_mask, x == T(9));
+ COMPARE(alternating_mask, -where(alternating_mask, x) == T(-T(9)));
+
+ const auto y = x;
+ VERIFY(where_is_ill_formed(true, y));
+ VERIFY(where_is_ill_formed(true, x));
+ VERIFY(where_is_ill_formed(true, V(x)));
+
+ M test = alternating_mask;
+ where(alternating_mask, test) = M(true);
+ COMPARE(test, alternating_mask);
+ where(alternating_mask, test) = M(false);
+ COMPARE(test, M(false));
+ where(alternating_mask, test) = M(true);
+ COMPARE(test, alternating_mask);
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-double-constexpr.cc
new file mode 100644
index 00000000000..b062117b78e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/trigonometric.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-double-fixed_size.cc
new file mode 100644
index 00000000000..3de35f26667
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/trigonometric.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-double.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-double.cc
new file mode 100644
index 00000000000..3b3b44678ee
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/trigonometric.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-float-constexpr.cc
new file mode 100644
index 00000000000..2c297aaeb48
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/trigonometric.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-float-fixed_size.cc
new file mode 100644
index 00000000000..27f56c929d7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/trigonometric.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-float.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-float.cc
new file mode 100644
index 00000000000..d9b0e07e3ca
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/trigonometric.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double-constexpr.cc
new file mode 100644
index 00000000000..b15a7a58244
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/trigonometric.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double-fixed_size.cc
new file mode 100644
index 00000000000..2f40098232f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/trigonometric.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double.cc
new file mode 100644
index 00000000000..d231dd3742f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trigonometric-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/trigonometric.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double-constexpr.cc
new file mode 100644
index 00000000000..173aca4e406
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double-fixed_size.cc
new file mode 100644
index 00000000000..9263aff80d4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double.cc
new file mode 100644
index 00000000000..4fd5be6ff3a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float-constexpr.cc
new file mode 100644
index 00000000000..4548e25a634
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float-fixed_size.cc
new file mode 100644
index 00000000000..27aa8d8263d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float.cc
new file mode 100644
index 00000000000..2fd1f5ff24d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double-constexpr.cc
new file mode 100644
index 00000000000..a1fc0f60fce
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double-fixed_size.cc
new file mode 100644
index 00000000000..8fea4cd5894
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double.cc
new file mode 100644
index 00000000000..8ce8b9cc9e6
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/trunc_ceil_floor-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/trunc_ceil_floor.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-char-constexpr.cc
new file mode 100644
index 00000000000..0af0734bbc1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-char-fixed_size.cc
new file mode 100644
index 00000000000..56c695c5957
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char.cc b/libstdc++-v3/testsuite/experimental/simd/where-char.cc
new file mode 100644
index 00000000000..a5e0e2a89a7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char16_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-char16_t-constexpr.cc
new file mode 100644
index 00000000000..02902b72841
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char16_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char16_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-char16_t-fixed_size.cc
new file mode 100644
index 00000000000..ca286bbc363
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char16_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char16_t.cc b/libstdc++-v3/testsuite/experimental/simd/where-char16_t.cc
new file mode 100644
index 00000000000..53b51f17b04
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char16_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<char16_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char32_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-char32_t-constexpr.cc
new file mode 100644
index 00000000000..4f4f6aa7d4f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char32_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char32_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-char32_t-fixed_size.cc
new file mode 100644
index 00000000000..11d12a709fb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char32_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-char32_t.cc b/libstdc++-v3/testsuite/experimental/simd/where-char32_t.cc
new file mode 100644
index 00000000000..3771d9a50fb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-char32_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<char32_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-double-constexpr.cc
new file mode 100644
index 00000000000..aeb094b5ced
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-double-fixed_size.cc
new file mode 100644
index 00000000000..34348c2144f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-double.cc b/libstdc++-v3/testsuite/experimental/simd/where-double.cc
new file mode 100644
index 00000000000..bb6a54368e2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-float-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-float-constexpr.cc
new file mode 100644
index 00000000000..e0ea665a6a4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-float-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-float-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-float-fixed_size.cc
new file mode 100644
index 00000000000..3c95886ebb4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-float-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-float.cc b/libstdc++-v3/testsuite/experimental/simd/where-float.cc
new file mode 100644
index 00000000000..ffb1ddb6005
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-float.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<float>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-int-constexpr.cc
new file mode 100644
index 00000000000..a94f8bedfe1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-int-fixed_size.cc
new file mode 100644
index 00000000000..5c3e1ccdb93
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-int.cc b/libstdc++-v3/testsuite/experimental/simd/where-int.cc
new file mode 100644
index 00000000000..79c6896f7f2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-long-constexpr.cc
new file mode 100644
index 00000000000..cf381cc7e27
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-long-fixed_size.cc
new file mode 100644
index 00000000000..7b82702c58a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long.cc b/libstdc++-v3/testsuite/experimental/simd/where-long.cc
new file mode 100644
index 00000000000..9bae1730067
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long_double-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-long_double-constexpr.cc
new file mode 100644
index 00000000000..cd22a01642e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long_double-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long_double-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-long_double-fixed_size.cc
new file mode 100644
index 00000000000..b0226f3fa6e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long_double-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long_double.cc b/libstdc++-v3/testsuite/experimental/simd/where-long_double.cc
new file mode 100644
index 00000000000..367de972d31
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long_double.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<long double>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-long_long-constexpr.cc
new file mode 100644
index 00000000000..6113181ed29
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-long_long-fixed_size.cc
new file mode 100644
index 00000000000..3ed4c6cab9f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-long_long.cc b/libstdc++-v3/testsuite/experimental/simd/where-long_long.cc
new file mode 100644
index 00000000000..1b6cd741d0f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-short-constexpr.cc
new file mode 100644
index 00000000000..51fda21394a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-short-fixed_size.cc
new file mode 100644
index 00000000000..9a437df485d
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-short.cc b/libstdc++-v3/testsuite/experimental/simd/where-short.cc
new file mode 100644
index 00000000000..54d0e5b5dde
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-signed_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-signed_char-constexpr.cc
new file mode 100644
index 00000000000..23d57151ad2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-signed_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-signed_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-signed_char-fixed_size.cc
new file mode 100644
index 00000000000..d2126eeb612
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-signed_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-signed_char.cc b/libstdc++-v3/testsuite/experimental/simd/where-signed_char.cc
new file mode 100644
index 00000000000..d671e6f2523
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-signed_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<signed char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char-constexpr.cc
new file mode 100644
index 00000000000..724751ff278
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char-fixed_size.cc
new file mode 100644
index 00000000000..468bb34a8ee
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char.cc
new file mode 100644
index 00000000000..fb063f44160
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_char.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned char>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int-constexpr.cc
new file mode 100644
index 00000000000..af40c8b99f5
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int-fixed_size.cc
new file mode 100644
index 00000000000..5588269f066
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int.cc
new file mode 100644
index 00000000000..33f4c289a70
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_int.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned int>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long-constexpr.cc
new file mode 100644
index 00000000000..5519953f1bf
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long-fixed_size.cc
new file mode 100644
index 00000000000..bd4d64738d1
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long.cc
new file mode 100644
index 00000000000..542caf46e27
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long-constexpr.cc
new file mode 100644
index 00000000000..5cfa099626f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long-fixed_size.cc
new file mode 100644
index 00000000000..95e5ae020d7
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long.cc
new file mode 100644
index 00000000000..3b7d60b4fb0
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_long_long.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned long long>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short-constexpr.cc
new file mode 100644
index 00000000000..763528d5acd
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short-fixed_size.cc
new file mode 100644
index 00000000000..2dac8828348
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short.cc b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short.cc
new file mode 100644
index 00000000000..f83c61d2091
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-unsigned_short.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<unsigned short>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-wchar_t-constexpr.cc b/libstdc++-v3/testsuite/experimental/simd/where-wchar_t-constexpr.cc
new file mode 100644
index 00000000000..485c6a7d11e
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-wchar_t-constexpr.cc
@@ -0,0 +1,10 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-wchar_t-fixed_size.cc b/libstdc++-v3/testsuite/experimental/simd/where-wchar_t-fixed_size.cc
new file mode 100644
index 00000000000..1aa7a8f3b1a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-wchar_t-fixed_size.cc
@@ -0,0 +1,11 @@
+// { dg-options "-std=gnu++17" }
+// { dg-require-effective-target run_expensive_tests }
+
+#define TESTFIXEDSIZE 1
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/experimental/simd/where-wchar_t.cc b/libstdc++-v3/testsuite/experimental/simd/where-wchar_t.cc
new file mode 100644
index 00000000000..07f879bb5ed
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/where-wchar_t.cc
@@ -0,0 +1,9 @@
+// { dg-options "-std=c++17" }
+
+#include "tests/where.h"
+
+int main()
+{
+ iterate_abis<wchar_t>();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp b/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp
index 11fdc8d340b..90264f5bfa9 100644
--- a/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp
+++ b/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp
@@ -89,12 +89,14 @@ if {[info exists tests_file] && [file exists $tests_file]} {
# 3. wchar_t tests, if not supported.
# 4. thread tests, if not supported.
# 5. *_filebuf, if file I/O is not supported.
+ # 6. simd tests.
if { [string first _xin $t] == -1
&& [string first performance $t] == -1
&& (${v3-wchar_t} || [string first wchar_t $t] == -1)
&& (${v3-threads} || [string first thread $t] == -1)
&& ([string first "_filebuf" $t] == -1
- || [check_v3_target_fileio]) } {
+ || [check_v3_target_fileio])
+ && [string first "/experimental/simd/" $t] == -1 } {
lappend tests $t
}
}
@@ -107,5 +109,19 @@ global DEFAULT_CXXFLAGS
global PCH_CXXFLAGS
dg-runtest $tests "" "$DEFAULT_CXXFLAGS $PCH_CXXFLAGS"
+# Finally run simd tests with extra SIMD-relevant flags
+global DEFAULT_VECTCFLAGS
+global EFFECTIVE_TARGETS
+set DEFAULT_VECTCFLAGS ""
+set EFFECTIVE_TARGETS ""
+
+if [check_vect_support_and_set_flags] {
+ lappend DEFAULT_VECTCFLAGS "-O2"
+ lappend DEFAULT_VECTCFLAGS "-Wno-psabi"
+ et-dg-runtest dg-runtest [lsort \
+ [glob -nocomplain $srcdir/experimental/simd/*.cc]] \
+ "$DEFAULT_VECTCFLAGS" "$DEFAULT_CXXFLAGS $PCH_CXXFLAGS"
+}
+
# All done.
dg-finish