From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 1B91A3857827 for ; Thu, 14 Apr 2022 12:38:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1B91A3857827 Received: from mail-qt1-f197.google.com (mail-qt1-f197.google.com [209.85.160.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-235-L3muWoAlOcGDmQuBXj0L8g-1; Thu, 14 Apr 2022 08:38:02 -0400 X-MC-Unique: L3muWoAlOcGDmQuBXj0L8g-1 Received: by mail-qt1-f197.google.com with SMTP id k1-20020ac85fc1000000b002e1c5930386so3127154qta.3 for ; Thu, 14 Apr 2022 05:38:02 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:date:to:cc:subject:in-reply-to:message-id :references:mime-version; bh=Uotwoxe1dEPOq9UtEPxhEPZ83mQo+yMgkaxtB760s9A=; b=GFozZR2qrFKKStn/uJvSwQyWr0Alp78klYikIdG+251UJk1bswX1MG9yaPgPGFB6XW 3XSQhGvH41FpVGgIdTG929HgdnGzANSPbz1kLpnZi0rSqiLDxMLcgZTpFkREXy/jCFwR sZ8Jx8ukK2B3AlYSuXw1R/rpT6/nDcZJFCgU9k5XUWb9SNPcDLUoOPTjVPjAH4AkFyQm knTkgQ6jwvFIBZK+vfY2hSwF7RmtsHMFD6V6Ca8WfkTGtzoPivFedjhdW3DbG9MYUOfk 9qaxG9SO0zKyU28JmTFTZFo1RQhMCMDd+ZTK1HAaAI0D241e8ubBtZvHNNvMk0RcKkho HReQ== X-Gm-Message-State: AOAM530iUtj5LeeeyU5fRca9WzsrhJYl+n9sj/1QdGww+OVwwVMdzwOX 9zmLjPI6bIWu+N28K9VWEV1FRSzNrHEmItprk6A6xYpp+cgCinh5K+WjcPF7rlGIRnIELBXNNjj M5u4CmZ5fNIyZYOE= X-Received: by 2002:ad4:5842:0:b0:444:5137:44c6 with SMTP id de2-20020ad45842000000b00444513744c6mr3053152qvb.27.1649939881394; Thu, 14 Apr 2022 05:38:01 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy/UNcHPItVidxPCEOkE4QfP09+wYZNhUWN0OIui/kXDCDeAHwgvfew2uNt6skY/RPIRf2x6g== X-Received: by 2002:ad4:5842:0:b0:444:5137:44c6 with SMTP id de2-20020ad45842000000b00444513744c6mr3053124qvb.27.1649939880995; Thu, 14 Apr 2022 05:38:00 -0700 (PDT) Received: from [192.168.1.130] (ool-18e40894.dyn.optonline.net. [24.228.8.148]) by smtp.gmail.com with ESMTPSA id f18-20020a05622a1a1200b002eef655b3e3sm995905qtb.94.2022.04.14.05.38.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Apr 2022 05:38:00 -0700 (PDT) From: Patrick Palka X-Google-Original-From: Patrick Palka Date: Thu, 14 Apr 2022 08:37:59 -0400 (EDT) To: Patrick Palka cc: gcc-patches@gcc.gnu.org, libstdc++@gcc.gnu.org Subject: Re: [PATCH] libstdc++: Optimize integer std::from_chars In-Reply-To: <20220414123113.175965-1-ppalka@redhat.com> Message-ID: <69e19e1b-3c1b-f2a0-6965-9f83eff571d8@idea> References: <20220414123113.175965-1-ppalka@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-14.5 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libstdc++@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libstdc++ mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Apr 2022 12:38:07 -0000 On Thu, 14 Apr 2022, Patrick Palka wrote: > This applies the following optimizations to the integer std::from_chars > implementation: > > 1. Use a lookup table for converting an alphanumeric digit to its > base-36 value instead of using a range test (for 0-9) and switch > (for a-z and A-Z). The table is constructed using a C++14 > constexpr function which doesn't assume a particular character > encoding or __CHAR_BIT__ value. The new conversion function > __from_chars_alnum_to_val is templated on whether we care > only about the decimal digits, in which case we can perform the > conversion with a single subtraction since the digit characters > are guaranteed to be contiguous (unlike the letters). > 2. Generalize __from_chars_binary to handle all power-of-two bases. > This function, now named __from_chars_pow2_base, is also templated > on whether we care only about the decimal digits in order to speed > up digit conversion for base 2, 4 and 8. > 3. In __from_chars_digit, use > static_cast(__c - '0') < __base > instead of > '0' <= __c && __c <= ('0' + (__base - 1)). > as the digit recognition test (exhaustively verified that the two > tests are equivalent). > 4. In __from_chars_alnum, use a nested loop to consume the rest of the > digits in the overflow case (mirroring __from_chars_digit) so that > the main loop doesn't have to maintain the __valid overflow flag. > > At this point, __from_chars_digit is nearly identical to > __from_chars_alnum, so this patch combines the two functions, removing > the former and templatizing the latter according to whether we care only > about the decimal digits. Finally, > > 5. In __from_chars_alnum, keep track of a lower bound on the number of > unused bits in the result and use that to omit the overflow check > when it's safe to do so. > > In passing this replaces the non-portable function ascii_to_hexit > used by __floating_from_chars_hex with the new conversion function. > > Here are some runtime measurements for a simple 15-line benchmark that > roundtrips printing/parsing 200 million integers via std::to/from_chars > (average of 5 runs): > > Base Before After (seconds, lower is better) > 2 9.37 9.37 > 3 12.13 15.79 > 8 3.67 4.15 > 10 3.86 4.90 > 11 5.03 6.84 > 16 2.93 4.14 > 32 2.39 3.85 > 36 3.26 5.22 Whoops, the second and third columns should be swapped (runtime is smaller after the patch across the board). > > Testedon x86_64-pc-linux-gnu, does this look OK for trunk? Also tested > against libc++'s from_chars tests for good measure. > > libstdc++-v3/ChangeLog: > > * include/std/charconv (__from_chars_alnum_to_val_table): Define. > (__from_chars_alnum_to_val): Define. > (__from_chars_binary): Rename to ... > (__from_chars_pow2_base): ... this. Generalize to handle any > power-of-two base using __from_chars_alnum_to_val. > (__from_chars_digit): Optimize digit recognition to a single > test instead of two tests. Use [[__unlikely___]] attribute. > (__from_chars_alpha_to_num): Remove. > (__from_chars_alnum): Use __from_chars_alnum_to_val. Use a > nested loop for the overflow case. > (from_chars): Adjust appropriately. > * src/c++17/floating_from_chars.cc (ascii_to_hexit): Remove. > (__floating_from_chars_hex): Use __from_chars_alnum_to_val > to recognize a hex digit instead. > --- > libstdc++-v3/include/std/charconv | 250 ++++++++---------- > libstdc++-v3/src/c++17/floating_from_chars.cc | 18 +- > 2 files changed, 105 insertions(+), 163 deletions(-) > > diff --git a/libstdc++-v3/include/std/charconv b/libstdc++-v3/include/std/charconv > index 2ce9c7d4cb9..5e44459749a 100644 > --- a/libstdc++-v3/include/std/charconv > +++ b/libstdc++-v3/include/std/charconv > @@ -407,176 +407,127 @@ namespace __detail > return true; > } > > - /// std::from_chars implementation for integers in base 2. > - template > + // Construct and return a lookup table that maps 0-9, A-Z and a-z to the > + // corresponding corresponding base-36 value and maps all other characters > + // to 127. > + constexpr auto > + __from_chars_alnum_to_val_table() > + { > + constexpr unsigned char __lower_letters[] > + = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', > + 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', > + 'u', 'v', 'w', 'x', 'y', 'z' }; > + constexpr unsigned char __upper_letters[] > + = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', > + 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', > + 'U', 'V', 'W', 'X', 'Y', 'Z' }; > + struct { unsigned char __data[1u << __CHAR_BIT__] = {}; } __table; > + for (auto& __entry : __table.__data) > + __entry = 127; > + for (int __i = 0; __i < 10; ++__i) > + __table.__data['0' + __i] = __i; > + for (int __i = 0; __i < 26; ++__i) > + { > + __table.__data[__lower_letters[__i]] = 10 + __i; > + __table.__data[__upper_letters[__i]] = 10 + __i; > + } > + return __table; > + } > + > + /// If _DecOnly is true: if the character is a decimal digit, then > + /// return its corresponding base-10 value, otherwise return a value >= 127. > + /// If _DecOnly is false: if the character is an alphanumeric digit, then > + /// return its corresponding base-36 value, otherwise return a value >= 127. > + template > + unsigned char > + __from_chars_alnum_to_val(unsigned char __c) > + { > + if _GLIBCXX17_CONSTEXPR (_DecOnly) > + return __c - '0'; > + else > + { > + static constexpr auto __table = __from_chars_alnum_to_val_table(); > + return __table.__data[__c]; > + } > + } > + > + /// std::from_chars implementation for integers in a power-of-two base. > + /// If _DecOnly is true, then we may assume __base is at most 8. > + template > bool > - __from_chars_binary(const char*& __first, const char* __last, _Tp& __val) > + __from_chars_pow2_base(const char*& __first, const char* __last, _Tp& __val, > + int __base) > { > static_assert(is_integral<_Tp>::value, "implementation bug"); > static_assert(is_unsigned<_Tp>::value, "implementation bug"); > > + __glibcxx_assert((__base & (__base - 1)) == 0); > + __glibcxx_assert(_DecOnly ? __base <= 8 : __base <= 32); > + const int __log2_base = __countr_zero(__base); > + > const ptrdiff_t __len = __last - __first; > ptrdiff_t __i = 0; > while (__i < __len && __first[__i] == '0') > ++__i; > const ptrdiff_t __leading_zeroes = __i; > > + unsigned char __leading_c = 0; > while (__i < __len) > { > - const unsigned char __c = (unsigned)__first[__i] - '0'; > - if (__c < 2) > - __val = (__val << 1) | __c; > - else > + const unsigned char __c = __from_chars_alnum_to_val<_DecOnly>(__first[__i]); > + if (__c >= __base) > break; > + __val = (__val << __log2_base) | __c; > + > + if (__i == __leading_zeroes) > + { > + // At the first iteration, remember the leading significant digit. > + __glibcxx_assert(__leading_c == 0 && __c != 0); > + __leading_c = __c; > + } > __i++; > } > __first += __i; > - return (__i - __leading_zeroes) <= __gnu_cxx::__int_traits<_Tp>::__digits; > + auto __significant_bits = (__i - __leading_zeroes) * __log2_base; > + if (__base != 2 && __leading_c != 0) > + // Compensate for a leading significant digit that didn't use all > + // of its available bits. > + __significant_bits -= __log2_base - __bit_width(__leading_c); > + __glibcxx_assert(__significant_bits >= 0); > + return __significant_bits <= __gnu_cxx::__int_traits<_Tp>::__digits; > } > > - /// std::from_chars implementation for integers in bases 3 to 10. > - template > - bool > - __from_chars_digit(const char*& __first, const char* __last, _Tp& __val, > - int __base) > - { > - static_assert(is_integral<_Tp>::value, "implementation bug"); > - static_assert(is_unsigned<_Tp>::value, "implementation bug"); > - > - auto __matches = [__base](char __c) { > - return '0' <= __c && __c <= ('0' + (__base - 1)); > - }; > - > - while (__first != __last) > - { > - const char __c = *__first; > - if (__matches(__c)) > - { > - if (!__raise_and_add(__val, __base, __c - '0')) > - { > - while (++__first != __last && __matches(*__first)) > - ; > - return false; > - } > - __first++; > - } > - else > - return true; > - } > - return true; > - } > - > - constexpr char > - __from_chars_alpha_to_num(char __c) > - { > - switch (__c) > - { > - case 'a': > - case 'A': > - return 10; > - case 'b': > - case 'B': > - return 11; > - case 'c': > - case 'C': > - return 12; > - case 'd': > - case 'D': > - return 13; > - case 'e': > - case 'E': > - return 14; > - case 'f': > - case 'F': > - return 15; > - case 'g': > - case 'G': > - return 16; > - case 'h': > - case 'H': > - return 17; > - case 'i': > - case 'I': > - return 18; > - case 'j': > - case 'J': > - return 19; > - case 'k': > - case 'K': > - return 20; > - case 'l': > - case 'L': > - return 21; > - case 'm': > - case 'M': > - return 22; > - case 'n': > - case 'N': > - return 23; > - case 'o': > - case 'O': > - return 24; > - case 'p': > - case 'P': > - return 25; > - case 'q': > - case 'Q': > - return 26; > - case 'r': > - case 'R': > - return 27; > - case 's': > - case 'S': > - return 28; > - case 't': > - case 'T': > - return 29; > - case 'u': > - case 'U': > - return 30; > - case 'v': > - case 'V': > - return 31; > - case 'w': > - case 'W': > - return 32; > - case 'x': > - case 'X': > - return 33; > - case 'y': > - case 'Y': > - return 34; > - case 'z': > - case 'Z': > - return 35; > - } > - return 127; > - } > - > - /// std::from_chars implementation for integers in bases 11 to 36. > - template > + /// std::from_chars implementation for integers in any base. > + /// If _DecOnly is true, then we may assume __base is at most 10. > + template > bool > __from_chars_alnum(const char*& __first, const char* __last, _Tp& __val, > int __base) > { > - bool __valid = true; > - while (__first != __last) > + if _GLIBCXX17_CONSTEXPR (_DecOnly) > + __glibcxx_assert(__base <= 10); > + > + const int __bits_per_digit = __bit_width(__base); > + int __unused_bits_lower_bound = __gnu_cxx::__int_traits<_Tp>::__digits; > + for (; __first != __last; ++__first) > { > - char __c = *__first; > - if ('0' <= __c && __c <= '9') // isdigit > - __c -= '0'; > - else > + const unsigned char __c = __from_chars_alnum_to_val<_DecOnly>(*__first); > + if (__c >= __base) > + return true; > + > + __unused_bits_lower_bound -= __bits_per_digit; > + if (__unused_bits_lower_bound >= 0) [[__likely__]] > + /* We're definitely not going to overflow. */ > + __val = __val * __base + __c; > + else if (!__raise_and_add(__val, __base, __c)) [[__unlikely__]] > { > - __c = __from_chars_alpha_to_num(__c); > - if (__c >= __base) > - break; > + while (++__first != __last > + && __from_chars_alnum_to_val<_DecOnly>(*__first) < __base) > + ; > + return false; > } > - > - if (__builtin_expect(__valid, 1)) > - __valid = __raise_and_add(__val, __base, __c); > - __first++; > } > - return __valid; > + return true; > } > > template > @@ -611,12 +562,17 @@ namespace __detail > > const auto __start = __first; > bool __valid; > - if (__base == 2) > - __valid = __detail::__from_chars_binary(__first, __last, __val); > + if ((__base & (__base - 1)) == 0) > + { > + if (__base <= 8) > + __valid = __detail::__from_chars_pow2_base(__first, __last, __val, __base); > + else > + __valid = __detail::__from_chars_pow2_base(__first, __last, __val, __base); > + } > else if (__base <= 10) > - __valid = __detail::__from_chars_digit(__first, __last, __val, __base); > + __valid = __detail::__from_chars_alnum(__first, __last, __val, __base); > else > - __valid = __detail::__from_chars_alnum(__first, __last, __val, __base); > + __valid = __detail::__from_chars_alnum(__first, __last, __val, __base); > > if (__builtin_expect(__first == __start, 0)) > __res.ec = errc::invalid_argument; > diff --git a/libstdc++-v3/src/c++17/floating_from_chars.cc b/libstdc++-v3/src/c++17/floating_from_chars.cc > index 4aa2483bc28..bbe03f7f068 100644 > --- a/libstdc++-v3/src/c++17/floating_from_chars.cc > +++ b/libstdc++-v3/src/c++17/floating_from_chars.cc > @@ -451,20 +451,6 @@ namespace > #endif // USE_STRTOD_FOR_FROM_CHARS > > #if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64 > - // If the given ASCII character represents a hexit, return that hexit. > - // Otherwise return -1. > - int > - ascii_to_hexit(char ch) > - { > - if (ch >= '0' && ch <= '9') > - return ch - '0'; > - if (ch >= 'a' && ch <= 'f') > - return ch - 'a' + 10; > - if (ch >= 'A' && ch <= 'F') > - return ch - 'A' + 10; > - return -1; > - } > - > // Return true iff [FIRST,LAST) begins with PREFIX, ignoring case. > bool > starts_with_ci(const char* first, const char* last, string_view prefix) > @@ -614,8 +600,8 @@ namespace > continue; > } > > - int hexit = ascii_to_hexit(ch); > - if (hexit == -1) > + int hexit = __detail::__from_chars_alnum_to_val(ch); > + if (hexit >= 16) > break; > seen_hexit = true; > > -- > 2.36.0.rc2 > >