From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from lxmtout2.gsi.de (lxmtout2.gsi.de [140.181.3.112])
    by sourceware.org (Postfix) with ESMTPS id 83498383B413;
    Fri, 11 Jun 2021 10:53:19 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 83498383B413
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=gsi.de
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gsi.de
Received: from localhost (localhost [127.0.0.1])
    by lxmtout2.gsi.de (Postfix) with ESMTP id 071C8203E807;
    Fri, 11 Jun 2021 12:53:18 +0200 (CEST)
X-Virus-Scanned: Debian amavisd-new at lxmtout2.gsi.de
Received: from lxmtout2.gsi.de ([127.0.0.1])
    by localhost (lxmtout2.gsi.de [127.0.0.1]) (amavisd-new, port 10024)
    with LMTP id ORb6R6ryOwwS; Fri, 11 Jun 2021 12:53:17 +0200 (CEST)
Received: from srvex3.campus.gsi.de (unknown [10.10.4.16])
    (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits))
    (No client certificate requested)
    by lxmtout2.gsi.de (Postfix) with ESMTPS id DF00B20AE04C;
    Fri, 11 Jun 2021 12:53:17 +0200 (CEST)
Received: from excalibur.localnet (140.181.3.12) by srvex3.campus.gsi.de
    (10.10.4.16) with Microsoft SMTP Server (version=TLS1_2,
    cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2242.10;
    Fri, 11 Jun 2021 12:53:17 +0200
From: Matthias Kretz
To: ,
Subject: Re: [PATCH 04/11 v2] libstdc++: Make use of __builtin_bit_cast
Date: Fri, 11 Jun 2021 12:53:17 +0200
Message-ID: <3315301.e9AK2G76lq@excalibur>
Organization: GSI Helmholtzzentrum für Schwerionenforschung
In-Reply-To: <3553838.ebMzRN9Arp@excalibur>
References: <270527782.u9WJ3AIrlG@excalibur> <3553838.ebMzRN9Arp@excalibur>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="nextPart2943179.xd1mhZDcFd"
Content-Transfer-Encoding: 7Bit
X-Originating-IP: [140.181.3.12]
X-ClientProxiedBy: srvex1.Campus.gsi.de (10.10.4.11) To srvex3.campus.gsi.de (10.10.4.16)
X-Spam-Status: No, score=-10.3 required=5.0 tests=BAYES_00, BODY_8BITS,
    GIT_PATCH_0, KAM_DMARC_STATUS, SPF_PASS, TXREP, T_SPF_HELO_PERMERROR,
    URIBL_SBL, URIBL_SBL_A autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org
X-BeenThere: libstdc++@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libstdc++ mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 11 Jun 2021 10:53:21 -0000

--nextPart2943179.xd1mhZDcFd
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="UTF-8"

While testing newer patches I found several missing conversions from
__bit_cast to simd_bit_cast in this patch (i.e. where bit casting to / from
fixed_size was sometimes required). Corrected patch attached.


From: Matthias Kretz <kretz@kde.org>

The __bit_cast function was a hack to achieve what __builtin_bit_cast
can do, therefore use __builtin_bit_cast if possible. However,
__builtin_bit_cast cannot be used to cast from/to fixed_size_simd, since
it isn't trivially copyable (in the language sense - in principle it
is). Therefore add __proposed::simd_bit_cast to enable the use case
required in the test framework.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

libstdc++-v3/ChangeLog:

        * include/experimental/bits/simd.h (__bit_cast): Implement via
        __builtin_bit_cast #if available.
        (__proposed::simd_bit_cast): Add overloads for simd and
        simd_mask, which use __builtin_bit_cast (or __bit_cast #if not
        available), which return an object of the requested type with
        the same bits as the argument.
        * include/experimental/bits/simd_math.h: Use simd_bit_cast
        instead of __bit_cast to allow casts to fixed_size_simd.
        (copysign): Remove branch that was only required if __bit_cast
        cannot be constexpr.
        * testsuite/experimental/simd/tests/bits/test_values.h: Switch
        from __bit_cast to __proposed::simd_bit_cast since the former
        will not cast fixed_size objects anymore.
---
 libstdc++-v3/include/experimental/bits/simd.h | 57 ++++++++++++++++++-
 .../include/experimental/bits/simd_math.h     | 36 +++++-------
 .../simd/tests/bits/test_values.h             |  8 +--
 3 files changed, 75 insertions(+), 26 deletions(-)


--
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──────────────────────────────────────────────────────────────────────────

--nextPart2943179.xd1mhZDcFd
Content-Disposition: inline; filename="0001-libstdc-Make-use-of-__builtin_bit_cast.patch"
Content-Transfer-Encoding: 7Bit
Content-Type: text/x-patch; charset="utf-8"; name="0001-libstdc-Make-use-of-__builtin_bit_cast.patch"

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 163f1b574e2..852d0b62012 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -1598,7 +1598,9 @@ template
   _GLIBCXX_SIMD_INTRINSIC constexpr _To
   __bit_cast(const _From __x)
   {
-    // TODO: implement with / replace by __builtin_bit_cast ASAP
+#if __has_builtin(__builtin_bit_cast)
+    return __builtin_bit_cast(_To, __x);
+#else
     static_assert(sizeof(_To) == sizeof(_From));
     constexpr bool __to_is_vectorizable
       = is_arithmetic_v<_To> || is_enum_v<_To>;
@@ -1629,6 +1631,7 @@ template
                          reinterpret_cast<const char*>(&__x), sizeof(_To));
         return __r;
       }
+#endif
   }
 
 // }}}
@@ -2900,6 +2903,58 @@ template
 (__x)};
   }
+
+template <typename _To, typename _Up, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  _To
+  simd_bit_cast(const simd<_Up, _Abi>& __x)
+  {
+    using _Tp = typename _To::value_type;
+    using _ToMember = typename _SimdTraits<_Tp, typename _To::abi_type>::_SimdMember;
+    using _From = simd<_Up, _Abi>;
+    using _FromMember = typename _SimdTraits<_Up, _Abi>::_SimdMember;
+    // with concepts, the following should be constraints
+    static_assert(sizeof(_To) == sizeof(_From));
+    static_assert(is_trivially_copyable_v<_Tp> && is_trivially_copyable_v<_Up>);
+    static_assert(is_trivially_copyable_v<_ToMember> && is_trivially_copyable_v<_FromMember>);
+#if __has_builtin(__builtin_bit_cast)
+    return {__private_init, __builtin_bit_cast(_ToMember, __data(__x))};
+#else
+    return {__private_init, __bit_cast<_ToMember>(__data(__x))};
+#endif
+  }
+
+template <typename _To, typename _Up, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  _To
+  simd_bit_cast(const simd_mask<_Up, _Abi>& __x)
+  {
+    using _From = simd_mask<_Up, _Abi>;
+    static_assert(sizeof(_To) == sizeof(_From));
+    static_assert(is_trivially_copyable_v<_From>);
+    // _To can be simd<T, A>, specifically simd<T, fixed_size<N>> in which case _To is not trivially
+    // copyable.
+    if constexpr (is_simd_v<_To>)
+      {
+        using _Tp = typename _To::value_type;
+        using _ToMember = typename _SimdTraits<_Tp, typename _To::abi_type>::_SimdMember;
+        static_assert(is_trivially_copyable_v<_ToMember>);
+#if __has_builtin(__builtin_bit_cast)
+        return {__private_init, __builtin_bit_cast(_ToMember, __x)};
+#else
+        return {__private_init, __bit_cast<_ToMember>(__x)};
+#endif
+      }
+    else
+      {
+        static_assert(is_trivially_copyable_v<_To>);
+#if __has_builtin(__builtin_bit_cast)
+        return __builtin_bit_cast(_To, __x);
+#else
+        return __bit_cast<_To>(__x);
+#endif
+      }
+  }
 } // namespace __proposed
 
 // simd_cast {{{2
diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h
index d954e761eee..afd8b5a028f 100644
--- a/libstdc++-v3/include/experimental/bits/simd_math.h
+++ b/libstdc++-v3/include/experimental/bits/simd_math.h
@@ -405,10 +405,11 @@ template
     using _Vp = simd<_Tp, _Abi>;
     using _Up = make_unsigned_t<__int_for_sizeof_t<_Tp>>;
     using namespace std::experimental::__float_bitwise_operators;
+    using namespace std::experimental::__proposed;
     const _Vp __exponent_mask
       = __infinity_v<_Tp>; // 0x7f800000 or 0x7ff0000000000000
     return static_simd_cast<rebind_simd_t<int, _Vp>>(
-             __bit_cast<rebind_simd_t<_Up, _Vp>>(__v & __exponent_mask)
+             simd_bit_cast<rebind_simd_t<_Up, _Vp>>(__v & __exponent_mask)
              >> (__digits_v<_Tp> - 1));
   }
 
@@ -700,11 +701,9 @@ template
       // (inf and NaN are excluded by -ffinite-math-only)
       const auto __iszero_inf_nan = __x == 0;
 #else
-      const auto __as_int
-        = __bit_cast<rebind_simd_t<__int_for_sizeof_t<_Tp>, _V>>(abs(__x));
-      const auto __inf
-        = __bit_cast<rebind_simd_t<__int_for_sizeof_t<_Tp>, _V>>(
-            _V(__infinity_v<_Tp>));
+      using _Ip = __int_for_sizeof_t<_Tp>;
+      const auto __as_int = simd_bit_cast<rebind_simd_t<_Ip, _V>>(abs(__x));
+      const auto __inf = simd_bit_cast<rebind_simd_t<_Ip, _V>>(_V(__infinity_v<_Tp>));
       const auto __iszero_inf_nan = static_simd_cast<typename _V::mask_type>(
         __as_int == 0 || __as_int >= __inf);
 #endif
@@ -722,10 +721,10 @@ template
       where(__value_isnormal.__cvt(), __e) = __exponent_bits;
       static_assert(sizeof(_IV) == sizeof(__value_isnormal));
       const _IV __offset
-        = (__bit_cast<_IV>(__value_isnormal) & _IV(__exp_adjust))
-          | (__bit_cast<_IV>(static_simd_cast<_MaskType>(__exponent_bits == 0)
-                             & static_simd_cast<_MaskType>(__x != 0))
-             & _IV(__exp_adjust + __exp_offset));
+        = (simd_bit_cast<_IV>(__value_isnormal) & _IV(__exp_adjust))
+          | (simd_bit_cast<_IV>(static_simd_cast<_MaskType>(__exponent_bits == 0)
+                                & static_simd_cast<_MaskType>(__x != 0))
+             & _IV(__exp_adjust + __exp_offset));
       *__exp = simd_cast<_Samesize>(__e - __offset);
       return __mant;
     }
@@ -796,7 +795,7 @@ template
           using namespace std::experimental::__proposed;
           using _IV = rebind_simd_t<
             conditional_t<sizeof(_Tp) == sizeof(_LLong), _LLong, int>, _V>;
-          return (__bit_cast<_IV>(__v) >> (__digits_v<_Tp> - 1))
+          return (simd_bit_cast<_IV>(__v) >> (__digits_v<_Tp> - 1))
                  - (__max_exponent_v<_Tp> - 1);
         };
         _V __r = static_simd_cast<_V>(__exponent(abs_x));
@@ -981,6 +980,7 @@ template
         // Skylake-AVX512 (not even for SSE and AVX vectors, and really bad for
         // AVX-512).
         using namespace __float_bitwise_operators;
+        using namespace __proposed;
         _V __absx = abs(__x);          // no error
         _V __absy = abs(__y);          // no error
         _V __hi = max(__absx, __absy); // no error
@@ -1028,9 +1028,9 @@ template
 #ifdef __FAST_MATH__
         using _Ip = __int_for_sizeof_t<_Tp>;
         using _IV = rebind_simd_t<_Ip, _V>;
-        const auto __as_int = __bit_cast<_IV>(__hi_exp);
+        const auto __as_int = simd_bit_cast<_IV>(__hi_exp);
         const _V __scale
-          = __bit_cast<_V>(2 * __bit_cast<_Ip>(_Tp(1)) - __as_int);
+          = simd_bit_cast<_V>(2 * simd_bit_cast<_Ip>(_Tp(1)) - __as_int);
 #else
         const _V __scale = (__hi_exp ^ __inf) * _Tp(.5);
 #endif
@@ -1197,9 +1197,9 @@ _GLIBCXX_SIMD_CVTING2(hypot)
 #ifdef __FAST_MATH__
         using _Ip = __int_for_sizeof_t<_Tp>;
         using _IV = rebind_simd_t<_Ip, _V>;
-        const auto __as_int = __bit_cast<_IV>(__hi_exp);
+        const auto __as_int = simd_bit_cast<_IV>(__hi_exp);
         const _V __scale
-          = __bit_cast<_V>(2 * __bit_cast<_Ip>(_Tp(1)) - __as_int);
+          = simd_bit_cast<_V>(2 * simd_bit_cast<_Ip>(_Tp(1)) - __as_int);
 #else
         const _V __scale = (__hi_exp ^ __inf) * _Tp(.5);
 #endif
@@ -1306,12 +1306,6 @@ template
       return std::copysign(__x[0], __y[0]);
     else if constexpr (__is_fixed_size_abi_v<_Abi>)
       return {__private_init, _Abi::_SimdImpl::_S_copysign(__data(__x), __data(__y))};
-    else if constexpr (is_same_v<_Tp, long double> && sizeof(_Tp) == 12)
-      // Remove this case once __bit_cast is implemented via __builtin_bit_cast.
-      // It is necessary, because __signmask below cannot be computed at compile
-      // time.
-      return simd<_Tp, _Abi>(
-        [&](auto __i) { return std::copysign(__x[__i], __y[__i]); });
     else
       {
         using _V = simd<_Tp, _Abi>;
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/test_values.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/test_values.h
index b69bd0b704d..67aa870659b 100644
--- a/libstdc++-v3/testsuite/experimental/simd/tests/bits/test_values.h
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/test_values.h
@@ -221,11 +221,11 @@ template
     if constexpr (sizeof(T) <= sizeof(double))
       {
         using I = rebind_simd_t<__int_for_sizeof_t<T>, V>;
-        const I abs_x = __bit_cast<I>(abs(x));
-        const I min = __bit_cast<I>(V(std::__norm_min_v<T>));
-        const I max = __bit_cast<I>(V(std::__finite_max_v<T>));
+        const I abs_x = simd_bit_cast<I>(abs(x));
+        const I min = simd_bit_cast<I>(V(std::__norm_min_v<T>));
+        const I max = simd_bit_cast<I>(V(std::__finite_max_v<T>));
         return static_simd_cast<typename V::mask_type>(
-                 __bit_cast<I>(x) == 0 || (abs_x >= min && abs_x <= max));
+                 simd_bit_cast<I>(x) == 0 || (abs_x >= min && abs_x <= max));
       }
     else
       {

--nextPart2943179.xd1mhZDcFd--
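
The #if __has_builtin(__builtin_bit_cast) dispatch that the patch adds to __bit_cast can be sketched outside of libstdc++ roughly as follows. This is only a minimal standalone illustration, not the library code: the names bit_cast_sketch, sign_bit_set and check_examples are made up for this example, the memcpy fallback is a simplified stand-in for the more involved #else branch of the real __bit_cast, and the expected bit patterns assume a 32-bit IEEE 754 float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Older compilers may not provide __has_builtin; treat builtins as missing then.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical stand-in for __bit_cast: reinterpret the bytes of `from` as To.
// Both types must have the same size and be trivially copyable, which is the
// requirement that fixed_size_simd does not meet in the language sense.
template <typename To, typename From>
#if __has_builtin(__builtin_bit_cast)
  constexpr
#else
  inline
#endif
  To
  bit_cast_sketch(const From& from) noexcept
  {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value
                    && std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
#if __has_builtin(__builtin_bit_cast)
    return __builtin_bit_cast(To, from); // usable in constant expressions
#else
    To to; // simplified fallback; assumes To is default-constructible
    std::memcpy(&to, &from, sizeof(To));
    return to;
#endif
  }

// Example use: read the sign bit of a float (assumes IEEE 754 binary32).
inline bool
sign_bit_set(float x) noexcept
{ return (bit_cast_sketch<std::uint32_t>(x) >> 31) != 0; }

// Usage example: 1.0f has the bit pattern 0x3f800000 under IEEE 754 binary32.
inline void
check_examples()
{
  assert(bit_cast_sketch<std::uint32_t>(1.0f) == 0x3f800000u);
  assert(sign_bit_set(-0.0f));
  assert(!sign_bit_set(0.0f));
}
```

The simd_bit_cast overloads in the patch follow the same pattern, except that for a simd destination they cast to the trivially copyable _SimdMember and construct the result with __private_init, because the simd type itself may be rejected by __builtin_bit_cast.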