From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 1 Oct 2021 20:42:35 +0100
From: Jonathan Wakely
To: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: [committed] libstdc++: Optimize std::visit for the common case [PR 78113]
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="nvJ9T5DIqXMkrvrD"
Content-Disposition: inline

--nvJ9T5DIqXMkrvrD
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

GCC does not do a good job of optimizing the table of function pointers
used for variant visitation. This avoids using the table for the common
case of visiting a single variant with a small number of alternative
types. Instead we use:

  switch(v.index())
  {
  case 0: return visitor(get<0>(v));
  case 1: return visitor(get<1>(v));
  ...
  }

It's not quite that simple, because get<1>(v) is ill-formed if the
variant only has one alternative, and similarly for each get. We need
to ensure each case only applies the visitor if the index is in range
for the actual type we're dealing with, and tell the compiler that the
case is unreachable otherwise. We also need to invoke the visitor via
the __gen_vtable_impl::__visit_invoke function, to handle the raw
visitation cases used to implement std::variant assignments and
comparisons. Because that gets quite verbose and repetitive, a macro is
used to stamp out the cases.

We also need to handle the valueless_by_exception case, but only for raw
visitation, because std::visit already checks for it before calling
__do_visit.

Signed-off-by: Jonathan Wakely

libstdc++-v3/ChangeLog:

	PR libstdc++/78113
	* include/std/variant (__do_visit): Use a switch when we have a
	single variant with a small number of alternatives.

Tested powerpc64le-linux. Committed to trunk.

--nvJ9T5DIqXMkrvrD
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="patch.txt"

commit cfb582f62791dfadc243d97d37f0b83ef77cf480
Author: Jonathan Wakely
Date:   Tue May 4 23:31:48 2021

    libstdc++: Optimize std::visit for the common case [PR 78113]

    GCC does not do a good job of optimizing the table of function pointers
    used for variant visitation. This avoids using the table for the common
    case of visiting a single variant with a small number of alternative
    types.
    Instead we use:

      switch(v.index())
      {
      case 0: return visitor(get<0>(v));
      case 1: return visitor(get<1>(v));
      ...
      }

    It's not quite that simple, because get<1>(v) is ill-formed if the
    variant only has one alternative, and similarly for each get. We need
    to ensure each case only applies the visitor if the index is in range
    for the actual type we're dealing with, and tell the compiler that the
    case is unreachable otherwise. We also need to invoke the visitor via
    the __gen_vtable_impl::__visit_invoke function, to handle the raw
    visitation cases used to implement std::variant assignments and
    comparisons. Because that gets quite verbose and repetitive, a macro is
    used to stamp out the cases.

    We also need to handle the valueless_by_exception case, but only for raw
    visitation, because std::visit already checks for it before calling
    __do_visit.

    Signed-off-by: Jonathan Wakely

    libstdc++-v3/ChangeLog:

            PR libstdc++/78113
            * include/std/variant (__do_visit): Use a switch when we have a
            single variant with a small number of alternatives.

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index c651326ead9..19b2158690a 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -485,6 +485,12 @@ namespace __variant
       {
         if constexpr (__variant::__never_valueless<_Types...>())
           return true;
+        // It would be nice if we could just return true for -fno-exceptions.
+        // It's possible (but inadvisable) that a std::variant could become
+        // valueless in a translation unit compiled with -fexceptions and then
+        // be passed to functions compiled with -fno-exceptions. We would need
+        // some #ifdef _GLIBCXX_NO_EXCEPTIONS_GLOBALLY property to elide all
+        // checks for valueless_by_exception().
         return this->_M_index != static_cast<__index_type>(variant_npos);
       }
 
@@ -1754,12 +1760,89 @@ namespace __variant
     constexpr decltype(auto)
     __do_visit(_Visitor&& __visitor, _Variants&&... __variants)
     {
-      constexpr auto& __vtable = __detail::__variant::__gen_vtable<
-        _Result_type, _Visitor&&, _Variants&&...>::_S_vtable;
+      // Get the silly case of visiting no variants out of the way first.
+      if constexpr (sizeof...(_Variants) == 0)
+        return std::forward<_Visitor>(__visitor)();
+      else
+        {
+          constexpr size_t __max = 11; // "These go to eleven."
 
-      auto __func_ptr = __vtable._M_access(__variants.index()...);
-      return (*__func_ptr)(std::forward<_Visitor>(__visitor),
-                           std::forward<_Variants>(__variants)...);
+          // The type of the first variant in the pack.
+          using _V0
+            = typename __detail::__variant::_Nth_type<0, _Variants...>::type;
+          // The number of alternatives in that first variant.
+          constexpr auto __n = variant_size_v<remove_reference_t<_V0>>;
+
+          if constexpr (sizeof...(_Variants) > 1 || __n > __max)
+            {
+              // Use a jump table for the general case.
+              constexpr auto& __vtable = __detail::__variant::__gen_vtable<
+                _Result_type, _Visitor&&, _Variants&&...>::_S_vtable;
+
+              auto __func_ptr = __vtable._M_access(__variants.index()...);
+              return (*__func_ptr)(std::forward<_Visitor>(__visitor),
+                                   std::forward<_Variants>(__variants)...);
+            }
+          else // We have a single variant with a small number of alternatives.
+            {
+              // A name for the first variant in the pack.
+              _V0& __v0
+                = [](_V0& __v, ...) -> _V0& { return __v; }(__variants...);
+
+              using __detail::__variant::_Multi_array;
+              using __detail::__variant::__gen_vtable_impl;
+              using _Ma = _Multi_array<_Result_type (*)(_Visitor&&, _V0&&)>;
+
+#ifdef _GLIBCXX_DEBUG
+# define _GLIBCXX_VISIT_UNREACHABLE __builtin_trap
+#else
+# define _GLIBCXX_VISIT_UNREACHABLE __builtin_unreachable
+#endif
+
+#define _GLIBCXX_VISIT_CASE(N)                                          \
+  case N:                                                               \
+  {                                                                     \
+    if constexpr (N < __n)                                              \
+      {                                                                 \
+        return __gen_vtable_impl<_Ma, index_sequence<N>>::              \
+          __visit_invoke(std::forward<_Visitor>(__visitor),             \
+                         std::forward<_V0>(__v0));                      \
+      }                                                                 \
+    else _GLIBCXX_VISIT_UNREACHABLE();                                  \
+  }
+
+              switch (__v0.index())
+                {
+                  _GLIBCXX_VISIT_CASE(0)
+                  _GLIBCXX_VISIT_CASE(1)
+                  _GLIBCXX_VISIT_CASE(2)
+                  _GLIBCXX_VISIT_CASE(3)
+                  _GLIBCXX_VISIT_CASE(4)
+                  _GLIBCXX_VISIT_CASE(5)
+                  _GLIBCXX_VISIT_CASE(6)
+                  _GLIBCXX_VISIT_CASE(7)
+                  _GLIBCXX_VISIT_CASE(8)
+                  _GLIBCXX_VISIT_CASE(9)
+                  _GLIBCXX_VISIT_CASE(10)
+                case variant_npos:
+                  using __detail::__variant::__variant_idx_cookie;
+                  using __detail::__variant::__variant_cookie;
+                  if constexpr (is_same_v<_Result_type, __variant_idx_cookie>
+                                || is_same_v<_Result_type, __variant_cookie>)
+                    {
+                      return __gen_vtable_impl<_Ma, index_sequence<-1>>::
+                        __visit_invoke(std::forward<_Visitor>(__visitor),
+                                       std::forward<_V0>(__v0));
+                    }
+                  else
+                    _GLIBCXX_VISIT_UNREACHABLE();
+                default:
+                  _GLIBCXX_VISIT_UNREACHABLE();
+                }
+#undef _GLIBCXX_VISIT_CASE
+#undef _GLIBCXX_VISIT_UNREACHABLE
+            }
+        }
     }
 
   /// @endcond

--nvJ9T5DIqXMkrvrD--
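
For readers who want to try the idea outside the library, here is a minimal
user-level sketch of the same technique: a switch over index() in which each
case is guarded by `if constexpr`, so that get<N> is never instantiated for an
out-of-range N. It is not the libstdc++ implementation. The names visit_small,
SKETCH_VISIT_CASE, Describe and Value are hypothetical, the limit of four
alternatives is arbitrary, and __builtin_unreachable is assumed to be available
(GCC or Clang). Unlike the patch, it calls the visitor directly rather than
through __visit_invoke, so it does not cover raw visitation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical macro: stamps out one case. The call to std::get<N> is only
// instantiated when N is a valid index for this variant type.
#define SKETCH_VISIT_CASE(N)                                       \
    case N:                                                        \
        if constexpr (N < n)                                       \
            return std::forward<Visitor>(vis)(                     \
                std::get<N>(std::forward<Variant>(var)));          \
        else                                                       \
            __builtin_unreachable(); /* index cannot be >= n */

// Hypothetical helper: visit one variant with at most four alternatives
// using a switch instead of a table of function pointers.
template <typename Visitor, typename Variant>
constexpr decltype(auto) visit_small(Visitor&& vis, Variant&& var)
{
    constexpr std::size_t n =
        std::variant_size_v<std::remove_reference_t<Variant>>;
    static_assert(n <= 4, "this sketch only stamps out four cases");

    // Mirror std::visit: reject a valueless variant before dispatching.
    if (var.valueless_by_exception())
        throw std::bad_variant_access();

    switch (var.index()) {
        SKETCH_VISIT_CASE(0)
        SKETCH_VISIT_CASE(1)
        SKETCH_VISIT_CASE(2)
        SKETCH_VISIT_CASE(3)
    default:
        __builtin_unreachable();
    }
}
#undef SKETCH_VISIT_CASE

// Example visitor and variant type, for illustration only.
struct Describe {
    std::string operator()(int) const { return "int"; }
    std::string operator()(double) const { return "double"; }
    std::string operator()(const std::string&) const { return "string"; }
};

using Value = std::variant<int, double, std::string>;

// Usage check: the switch-based helper agrees with std::visit.
inline void check_matches_std_visit()
{
    for (const Value& v : {Value{42}, Value{1.5}, Value{std::string("hi")}})
        assert(visit_small(Describe{}, v) == std::visit(Describe{}, v));
}
```

The guard has to sit inside the function that contains the switch, as it does
in the patch's macro. Moving it into a separate helper template would give the
out-of-range instantiations a different deduced return type and break the
`decltype(auto)` deduction of the enclosing function.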