From: Matthias Kretz
To: Jonathan Wakely
CC: Thomas Rodgers, libstdc++, Gcc-patches
Subject: Re: [PATCH] std::experimental::simd
Date: Sat, 14 Nov 2020 02:11:04 +0100
Message-ID: <7016866.CdeeP7ohKn@excalibur>
Organization: GSI Helmholtzzentrum für Schwerionenforschung
In-Reply-To: <20201111234331.GA503596@redhat.com>
References: <10916085.4XdQGCaa7L@depc447> <33105491.xCRyjBS7g1@excalibur> <20201111234331.GA503596@redhat.com>

On Thursday, 12 November 2020 00:43:31 CET Jonathan Wakely wrote:
> On 08/05/20 21:03 +0200, Matthias Kretz wrote:
> >Here's my last update to the std::experimental::simd patch. It's currently
> >based on the gcc-10 branch.
> >
> >
> >+
> >+// __next_power_of_2{{{
> >+/**
> >+ * \internal
>
> We use @foo for Doxygen comments rather than \foo

Done.

> >+ * Returns the next power of 2 larger than or equal to \p __x.
> >+ */
> >+constexpr std::size_t
> >+__next_power_of_2(std::size_t __x)
> >+{
> >+  return (__x & (__x - 1)) == 0 ? __x
> >+           : __next_power_of_2((__x | (__x >> 1)) + 1);
> >+}
>
> Can this be replaced with std::__bit_ceil ?
>
> std::bit_ceil is C++20, but we provide __private versions of
> everything in <bit> for C++14 and up.

Ah good. I'll delete some code.

> >+// vvv ---- type traits ---- vvv
> >+// integer type aliases{{{
> >+using _UChar = unsigned char;
> >+using _SChar = signed char;
> >+using _UShort = unsigned short;
> >+using _UInt = unsigned int;
> >+using _ULong = unsigned long;
> >+using _ULLong = unsigned long long;
> >+using _LLong = long long;
>
> I have a suspicion some of these might clash with libc macros on some
> OS somewhere, but we can cross that bridge when we come to it.

I need those to help cut the code down to 80 cols.
;-)

> >+// __make_dependent_t {{{
> >+template <typename _Tp, typename _Up> struct __make_dependent
> >+{
> >+  using type = _Up;
> >+};
> >+template <typename _Tp, typename _Up>
> >+using __make_dependent_t = typename __make_dependent<_Tp, _Up>::type;
>
> Do you need a distinct class template for this, or can
> __make_dependent_t be an alias to __type_identity::type or
> something else that already exists?

With GCC it would suffice to use __type_identity::type here. But Clang
rejects it. Clang sees that the first template argument is not used in the
definition of the alias and thus doesn't make _Up a dependent type.

> >+// __call_with_n_evaluations{{{
> >+template
> >+_GLIBCXX_SIMD_INTRINSIC constexpr auto
> >+__call_with_n_evaluations(std::index_sequence<_I...>, _F0&& __f0,
> >+                          _FArgs&& __fargs)
>
> I'm not sure if it matters here, but old versions of G++ passed empty
> types (like index_sequence) using the wrong ABI. Passing them as the
> last argument makes it a non-issue. If they're not the last argument,
> you get incompatible code when compiling with -fabi-version=7 or
> lower.

These are all [[gnu::always_inline]]. So it shouldn't matter.

> >+// __is_narrowing_conversion<_From, _To>{{{
> >+template <typename _From, typename _To,
> >+          bool = std::is_arithmetic<_From>::value,
> >+          bool = std::is_arithmetic<_To>::value>
>
> These could use is_arithmetic_v.

Right. That was me trying to work around a clang-format bug. Will fix. I'm in
the process of ditching clang-format anyway.

> >+{
> >+};
> >+
> >+template
> >+struct __is_narrowing_conversion : public true_type
>
> This looks odd, bool to arithmetic type T is narrowing?
> I assume there's a reason for it, so maybe a comment explaining it
> would help.

Odd indeed. Either I wanted to take a shortcut to implement: "From is a
vectorizable type and every possible value of From can be represented with
type value_type, or [...]". Or I wanted to swap bool and _Tp here and say that
anything other than bool converting to bool is narrowing.

I should clean this up.
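
To make the quoted wording concrete ("every possible value of From can be
represented with type value_type"), here is a minimal sketch of that rule as
a compile-time check. It is not the patch's __is_narrowing_conversion. The
name every_value_representable is hypothetical, the sketch covers arithmetic
types only, and the int/float/double expectations assume the usual 32-bit int
and IEEE 754 float/double.

```cpp
#include <limits>
#include <type_traits>

// Hypothetical helper (an assumption for illustration, not from the patch):
// true if every value of From can be represented exactly in To.
template <typename From, typename To>
constexpr bool every_value_representable()
{
  static_assert(std::is_arithmetic<From>::value
                  && std::is_arithmetic<To>::value,
                "this sketch covers arithmetic types only");
  using FL = std::numeric_limits<From>;
  using TL = std::numeric_limits<To>;
  return FL::is_integer
           ? (TL::is_integer
                // integer -> integer: enough value bits, and no signed
                // source going into an unsigned destination
                ? FL::digits <= TL::digits
                    && (!FL::is_signed || TL::is_signed)
                // integer -> floating point: the mantissa must hold
                // every value bit of the integer
                : FL::digits <= TL::digits)
           // floating point -> integer always loses values;
           // floating point -> floating point needs mantissa and exponent
           : (!TL::is_integer && FL::digits <= TL::digits
              && FL::max_exponent <= TL::max_exponent);
}

// bool has a single value bit, so it fits into every other arithmetic type.
static_assert(every_value_representable<bool, int>(), "");
static_assert(!every_value_representable<int, bool>(), "");
static_assert(!every_value_representable<unsigned, int>(), "");
static_assert(!every_value_representable<signed char, unsigned>(), "");
// Platform assumptions: 32-bit int, IEEE 754 float and double.
static_assert(every_value_representable<int, double>(), "");
static_assert(!every_value_representable<int, float>(), "");
static_assert(!every_value_representable<double, float>(), "");
```

Under this reading a conversion would count as narrowing exactly when the
check fails, so bool to T is not narrowing while T to bool (for T other than
bool) is. That corresponds to the second of the two interpretations above.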

> >+// _BitOps {{{
> >+struct _BitOps
>
> [...]
> std::__popcount in <bit>
>
> [...]
> std::__countl_zero in <bit>

Yes. I'll clean up all of _BitOps.

> >+template
>
> We generally avoid single letter names, although _V isn't in the list
> of BADNAMES in the manual, so maybe this one's OK.
>
> >+template ,
> >+          typename _R
>
> Same for _R, it's not listed as a BADNAME.

I believe I checked the list. ;-)

> >+
> >+template
> >+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
> >+__and(_Tp __a, _Tp __b) noexcept
>
> Calls to __and are done unqualified. Are they only with types that
> won't cause ADL to look outside namespace std?
>
> Even though __and is a reserved name, avoiding ADL has other benefits.

Called either with integers, [[gnu::vector_size(N)]] types, or
std::experimental::parallelism_v2::_SimdWrapper. I request a column limit
relaxation to at least 100 cols if I should qualify all of them with
std::experimental:: ;-)

> That's all for now ... not very far through the huge patch though.
> Generally this looks very good. The things mentioned above are
> stylistic or just remove some redundancy, they're not critical.

Thanks. I'll post a new patch ASAP. My tests are running.

-Matthias

-- 
──────────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──────────────────────────────────────────────────────────────────────────────
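
For reference, the __next_power_of_2 helper discussed at the top of this
message can be compared against std::bit_ceil at compile time. The sketch
below is an illustration under stated assumptions, not code from the patch.
The names next_power_of_2, reference_bit_ceil and agrees_up_to are
hypothetical. std::bit_ceil is used only when the library advertises it
through the __cpp_lib_int_pow2 feature-test macro; otherwise a simple
doubling loop stands in for it.

```cpp
#include <cstddef>
#if defined(__has_include)
#  if __has_include(<bit>)
#    include <bit>  // std::bit_ceil (C++20) and __cpp_lib_int_pow2
#  endif
#endif

// The recursive helper quoted from the patch, with unreserved names.
constexpr std::size_t next_power_of_2(std::size_t x)
{
  return (x & (x - 1)) == 0 ? x : next_power_of_2((x | (x >> 1)) + 1);
}

// Hypothetical reference: std::bit_ceil where available, else a loop.
constexpr std::size_t reference_bit_ceil(std::size_t x)
{
#if defined(__cpp_lib_int_pow2)
  return std::bit_ceil(x);
#else
  std::size_t p = 1;
  while (p < x)
    p <<= 1;
  return p;
#endif
}

// Cross-check both functions on every input in [1, limit].
constexpr bool agrees_up_to(std::size_t limit)
{
  for (std::size_t x = 1; x <= limit; ++x)
    if (next_power_of_2(x) != reference_bit_ceil(x))
      return false;
  return true;
}

static_assert(next_power_of_2(5) == 8, "");
static_assert(next_power_of_2(64) == 64, "");
static_assert(agrees_up_to(1024), "");
// The one difference: the recursive helper maps 0 to 0, while
// std::bit_ceil(0) is 1.
static_assert(next_power_of_2(0) == 0, "");
static_assert(reference_bit_ceil(0) == 1, "");
```

The two agree for all nonzero inputs checked here, so the replacement only
changes behaviour for an argument of 0.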