From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from lxmtout2.gsi.de (lxmtout2.gsi.de [140.181.3.112])
    by sourceware.org (Postfix) with ESMTPS id C7DF6384C005;
    Mon, 1 Feb 2021 10:23:46 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org C7DF6384C005
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=gsi.de
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=M.Kretz@gsi.de
Received: from localhost (localhost [127.0.0.1])
    by lxmtout2.gsi.de (Postfix) with ESMTP id 72244202AD64;
    Mon, 1 Feb 2021 11:23:45 +0100 (CET)
X-Virus-Scanned: Debian amavisd-new at lxmtout2.gsi.de
Received: from lxmtout2.gsi.de ([127.0.0.1])
    by localhost (lxmtout2.gsi.de [127.0.0.1]) (amavisd-new, port 10024)
    with LMTP id 0wWuU51Yqbss; Mon, 1 Feb 2021 11:23:45 +0100 (CET)
Received: from srvex3.campus.gsi.de (srvex3.campus.gsi.de [10.10.4.16])
    (using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits))
    (No client certificate requested)
    by lxmtout2.gsi.de (Postfix) with ESMTPS id 57912202AD5D;
    Mon, 1 Feb 2021 11:23:45 +0100 (CET)
Received: from excalibur.localnet (140.181.3.12) by srvex3.campus.gsi.de
    (10.10.4.16) with Microsoft SMTP Server (version=TLS1_2,
    cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2106.2;
    Mon, 1 Feb 2021 11:23:45 +0100
From: Matthias Kretz
To: ,
Subject: Re: [PATCH 14/16] Implement hmin and hmax
Date: Mon, 1 Feb 2021 11:23:44 +0100
Message-ID: <3529880.ebMzRN9Arp@excalibur>
Organization: GSI Helmholtzzentrum =?UTF-8?B?ZsO8cg==?= Schwerionenforschung
In-Reply-To: <6999407.PJNiXcIEje@excalibur>
References: <4667217.5jz8CO7rxU@excalibur> <6999407.PJNiXcIEje@excalibur>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="nextPart4244710.CdeeP7ohKn"
Content-Transfer-Encoding: 7Bit
X-Originating-IP: [140.181.3.12]
X-ClientProxiedBy: srvex3.Campus.gsi.de (10.10.4.16) To srvex3.campus.gsi.de (10.10.4.16)
X-Spam-Status: No, score=-13.1 required=5.0 tests=BAYES_00, BODY_8BITS, GIT_PATCH_0,
    KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, SPF_PASS, TXREP, T_SPF_HELO_PERMERROR
    autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org
X-BeenThere: libstdc++@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libstdc++ mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 01 Feb 2021 10:23:49 -0000

--nextPart4244710.CdeeP7ohKn
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="UTF-8"

On Wednesday, 27 January 2021 21:42:50 CET Matthias Kretz wrote:
> --- a/libstdc++-v3/include/experimental/bits/simd.h
> +++ b/libstdc++-v3/include/experimental/bits/simd.h
> @@ -204,6 +204,27 @@ template <size_t _Np>
>  template <size_t _X>
>    using _SizeConstant = integral_constant<size_t, _X>;
>
> +namespace __detail {
> +  struct _Minimum {
> +    template <typename _Tp>
> +      _GLIBCXX_SIMD_INTRINSIC constexpr
> +      _Tp
> +      operator()(_Tp __a, _Tp __b) const {

Reviewing my own patch :) This needs line breaks before { for namespace,
struct, and operator(). And another line break before the next struct. New
patch attached.

From: Matthias Kretz <kretz@kde.org>

>From 9.7.4 in Parallelism TS 2. For some reason I overlooked these two
functions. Implement them via call to _S_reduce.

libstdc++-v3/ChangeLog:
        * include/experimental/bits/simd.h: Add __detail::_Minimum and
        __detail::_Maximum to use them as _BinaryOperation to _S_reduce.
        Add hmin and hmax overloads for simd and const_where_expression.
        * include/experimental/bits/simd_scalar.h
        (_SimdImplScalar::_S_reduce): Make unused _BinaryOperation
        parameter const-ref to allow calling _S_reduce with an rvalue.
        * testsuite/experimental/simd/tests/reductions.cc: Add tests for
        hmin and hmax. Since the compiler statically determined that all
        tests pass, repeat the test after a call to make_value_unknown.

-- 
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──────────────────────────────────────────────────────────────────────────

--nextPart4244710.CdeeP7ohKn
Content-Disposition: inline; filename="0014-Implement-hmin-and-hmax.patch"
Content-Transfer-Encoding: 7Bit
Content-Type: text/x-patch; charset="UTF-8"; name="0014-Implement-hmin-and-hmax.patch"

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 14179491f9d..a90cb3b2d98 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -204,6 +204,33 @@ template <size_t _Np>
 template <size_t _X>
   using _SizeConstant = integral_constant<size_t, _X>;
 
+namespace __detail
+{
+  struct _Minimum
+  {
+    template <typename _Tp>
+      _GLIBCXX_SIMD_INTRINSIC constexpr
+      _Tp
+      operator()(_Tp __a, _Tp __b) const
+      {
+        using std::min;
+        return min(__a, __b);
+      }
+  };
+
+  struct _Maximum
+  {
+    template <typename _Tp>
+      _GLIBCXX_SIMD_INTRINSIC constexpr
+      _Tp
+      operator()(_Tp __a, _Tp __b) const
+      {
+        using std::max;
+        return max(__a, __b);
+      }
+  };
+} // namespace __detail
+
 // unrolled/pack execution helpers
 // __execute_n_times{{{
 template
@@ -3408,7 +3435,7 @@ template
 
 // }}}1
 // reductions [simd.reductions] {{{1
-  template <typename _Tp, typename _Abi, typename _BinaryOperation = plus<>>
+template <typename _Tp, typename _Abi, typename _BinaryOperation = plus<>>
   _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
   reduce(const simd<_Tp, _Abi>& __v,
          _BinaryOperation __binary_op = _BinaryOperation())
@@ -3454,6 +3481,61 @@ template
   reduce(const const_where_expression<_M, _V>& __x, bit_xor<> __binary_op)
   { return reduce(__x, 0, __binary_op); }
 
+template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
+  hmin(const simd<_Tp, _Abi>& __v) noexcept
+  {
+    return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Minimum());
+  }
+
+template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
+  hmax(const simd<_Tp, _Abi>& __v) noexcept
+  {
+    return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Maximum());
+  }
+
+template <typename _M, typename _V>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  typename _V::value_type
+  hmin(const const_where_expression<_M, _V>& __x) noexcept
+  {
+    using _Tp = typename _V::value_type;
+    constexpr _Tp __id_elem =
+#ifdef __FINITE_MATH_ONLY__
+      __finite_max_v<_Tp>;
+#else
+      __value_or<__infinity, _Tp>(__finite_max_v<_Tp>);
+#endif
+    _V __tmp = __id_elem;
+    _V::_Impl::_S_masked_assign(__data(__get_mask(__x)), __data(__tmp),
+                                __data(__get_lvalue(__x)));
+    return _V::abi_type::_SimdImpl::_S_reduce(__tmp, __detail::_Minimum());
+  }
+
+template <typename _M, typename _V>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  typename _V::value_type
+  hmax(const const_where_expression<_M, _V>& __x) noexcept
+  {
+    using _Tp = typename _V::value_type;
+    constexpr _Tp __id_elem =
+#ifdef __FINITE_MATH_ONLY__
+      __finite_min_v<_Tp>;
+#else
+      [] {
+        if constexpr (__value_exists_v<__infinity, _Tp>)
+          return -__infinity_v<_Tp>;
+        else
+          return __finite_min_v<_Tp>;
+      }();
+#endif
+    _V __tmp = __id_elem;
+    _V::_Impl::_S_masked_assign(__data(__get_mask(__x)), __data(__tmp),
+                                __data(__get_lvalue(__x)));
+    return _V::abi_type::_SimdImpl::_S_reduce(__tmp, __detail::_Maximum());
+  }
+
 // }}}1
 // algorithms [simd.alg] {{{
 template
diff --git a/libstdc++-v3/include/experimental/bits/simd_scalar.h b/libstdc++-v3/include/experimental/bits/simd_scalar.h
index 7680bc39c30..7e480ecdb37 100644
--- a/libstdc++-v3/include/experimental/bits/simd_scalar.h
+++ b/libstdc++-v3/include/experimental/bits/simd_scalar.h
@@ -182,7 +182,7 @@ struct _SimdImplScalar
   // _S_reduce {{{2
   template <typename _Tp, typename _BinaryOperation>
     static constexpr inline _Tp
-    _S_reduce(const simd<_Tp, simd_abi::scalar>& __x, _BinaryOperation&)
+    _S_reduce(const simd<_Tp, simd_abi::scalar>& __x, const _BinaryOperation&)
     { return __x._M_data; }
 
   // _S_min, _S_max {{{2
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc b/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc
index 9d897d5ccd6..02df68fafbc 100644
--- a/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc
@@ -57,6 +57,8 @@ template
     }
 
     {
+      COMPARE(hmin(V(1)), T(1));
+      COMPARE(hmax(V(1)), T(1));
       const V z([](T i) { return i + 1; });
       COMPARE(std::experimental::reduce(z,
                                         [](auto a, auto b) {
@@ -79,6 +81,25 @@ template
                                         }),
               T(V::size() == 1 ? 117 : 2))
         << "z: " << z;
+      COMPARE(hmin(z), T(1));
+      COMPARE(hmax(z), T(V::size()));
+      if (V::size() > 1)
+        {
+          COMPARE(hmin(where(z > 1, z)), T(2));
+          COMPARE(hmax(where(z > 1, z)), T(V::size()));
+        }
+      COMPARE(hmin(where(z < 4, z)), T(1));
+      COMPARE(hmax(where(z < 4, z)), std::min(T(V::size()), T(3)));
+      const V zz = make_value_unknown(z);
+      COMPARE(hmin(zz), T(1));
+      COMPARE(hmax(zz), T(V::size()));
+      if (V::size() > 1)
+        {
+          COMPARE(hmin(where(zz > 1, zz)), T(2));
+          COMPARE(hmax(where(zz > 1, zz)), T(V::size()));
+        }
+      COMPARE(hmin(where(zz < 4, zz)), T(1));
+      COMPARE(hmax(where(zz < 4, zz)), std::min(T(V::size()), T(3)));
     }
 
     test_values({}, {1000}, [](V x) {

--nextPart4244710.CdeeP7ohKn--