From: Patrick Palka <ppalka@redhat.com>
To: Patrick Palka <ppalka@redhat.com>
Cc: gcc-patches@gcc.gnu.org, libstdc++@gcc.gnu.org
Subject: Re: [PATCH] libstdc++: Optimize integer std::from_chars
Date: Thu, 14 Apr 2022 08:37:59 -0400 (EDT) [thread overview]
Message-ID: <69e19e1b-3c1b-f2a0-6965-9f83eff571d8@idea> (raw)
In-Reply-To: <20220414123113.175965-1-ppalka@redhat.com>
On Thu, 14 Apr 2022, Patrick Palka wrote:
> This applies the following optimizations to the integer std::from_chars
> implementation:
>
> 1. Use a lookup table for converting an alphanumeric digit to its
> base-36 value instead of using a range test (for 0-9) and switch
> (for a-z and A-Z). The table is constructed using a C++14
> constexpr function which doesn't assume a particular character
> encoding or __CHAR_BIT__ value. The new conversion function
> __from_chars_alnum_to_val is templated on whether we care
> only about the decimal digits, in which case we can perform the
> conversion with a single subtraction since the digit characters
> are guaranteed to be contiguous (unlike the letters).
> 2. Generalize __from_chars_binary to handle all power-of-two bases.
> This function, now named __from_chars_pow2_base, is also templated
> on whether we care only about the decimal digits in order to speed
> up digit conversion for base 2, 4 and 8.
> 3. In __from_chars_digit, use
> static_cast<unsigned char>(__c - '0') < __base
> instead of
> '0' <= __c && __c <= ('0' + (__base - 1)).
> as the digit recognition test (exhaustively verified that the two
> tests are equivalent).
> 4. In __from_chars_alnum, use a nested loop to consume the rest of the
> digits in the overflow case (mirroring __from_chars_digit) so that
> the main loop doesn't have to maintain the __valid overflow flag.
>
> At this point, __from_chars_digit is nearly identical to
> __from_chars_alnum, so this patch combines the two functions, removing
> the former and templatizing the latter according to whether we care only
> about the decimal digits. Finally,
>
> 5. In __from_chars_alnum, keep track of a lower bound on the number of
> unused bits in the result and use that to omit the overflow check
> when it's safe to do so.
>
> In passing this replaces the non-portable function ascii_to_hexit
> used by __floating_from_chars_hex with the new conversion function.
>
> Here are some runtime measurements for a simple 15-line benchmark that
> roundtrips printing/parsing 200 million integers via std::to/from_chars
> (average of 5 runs):
>
> Base Before After (seconds, lower is better)
> 2 9.37 9.37
> 3 12.13 15.79
> 8 3.67 4.15
> 10 3.86 4.90
> 11 5.03 6.84
> 16 2.93 4.14
> 32 2.39 3.85
> 36 3.26 5.22
Whoops, the second and third columns should be swapped (runtime is
smaller after the patch across the board).
>
> Testedon x86_64-pc-linux-gnu, does this look OK for trunk? Also tested
> against libc++'s from_chars tests for good measure.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/charconv (__from_chars_alnum_to_val_table): Define.
> (__from_chars_alnum_to_val): Define.
> (__from_chars_binary): Rename to ...
> (__from_chars_pow2_base): ... this. Generalize to handle any
> power-of-two base using __from_chars_alnum_to_val.
> (__from_chars_digit): Optimize digit recognition to a single
> test instead of two tests. Use [[__unlikely___]] attribute.
> (__from_chars_alpha_to_num): Remove.
> (__from_chars_alnum): Use __from_chars_alnum_to_val. Use a
> nested loop for the overflow case.
> (from_chars): Adjust appropriately.
> * src/c++17/floating_from_chars.cc (ascii_to_hexit): Remove.
> (__floating_from_chars_hex): Use __from_chars_alnum_to_val
> to recognize a hex digit instead.
> ---
> libstdc++-v3/include/std/charconv | 250 ++++++++----------
> libstdc++-v3/src/c++17/floating_from_chars.cc | 18 +-
> 2 files changed, 105 insertions(+), 163 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/charconv b/libstdc++-v3/include/std/charconv
> index 2ce9c7d4cb9..5e44459749a 100644
> --- a/libstdc++-v3/include/std/charconv
> +++ b/libstdc++-v3/include/std/charconv
> @@ -407,176 +407,127 @@ namespace __detail
> return true;
> }
>
> - /// std::from_chars implementation for integers in base 2.
> - template<typename _Tp>
> + // Construct and return a lookup table that maps 0-9, A-Z and a-z to the
> + // corresponding corresponding base-36 value and maps all other characters
> + // to 127.
> + constexpr auto
> + __from_chars_alnum_to_val_table()
> + {
> + constexpr unsigned char __lower_letters[]
> + = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
> + 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
> + 'u', 'v', 'w', 'x', 'y', 'z' };
> + constexpr unsigned char __upper_letters[]
> + = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
> + 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T',
> + 'U', 'V', 'W', 'X', 'Y', 'Z' };
> + struct { unsigned char __data[1u << __CHAR_BIT__] = {}; } __table;
> + for (auto& __entry : __table.__data)
> + __entry = 127;
> + for (int __i = 0; __i < 10; ++__i)
> + __table.__data['0' + __i] = __i;
> + for (int __i = 0; __i < 26; ++__i)
> + {
> + __table.__data[__lower_letters[__i]] = 10 + __i;
> + __table.__data[__upper_letters[__i]] = 10 + __i;
> + }
> + return __table;
> + }
> +
> + /// If _DecOnly is true: if the character is a decimal digit, then
> + /// return its corresponding base-10 value, otherwise return a value >= 127.
> + /// If _DecOnly is false: if the character is an alphanumeric digit, then
> + /// return its corresponding base-36 value, otherwise return a value >= 127.
> + template<bool _DecOnly>
> + unsigned char
> + __from_chars_alnum_to_val(unsigned char __c)
> + {
> + if _GLIBCXX17_CONSTEXPR (_DecOnly)
> + return __c - '0';
> + else
> + {
> + static constexpr auto __table = __from_chars_alnum_to_val_table();
> + return __table.__data[__c];
> + }
> + }
> +
> + /// std::from_chars implementation for integers in a power-of-two base.
> + /// If _DecOnly is true, then we may assume __base is at most 8.
> + template<bool _DecOnly, typename _Tp>
> bool
> - __from_chars_binary(const char*& __first, const char* __last, _Tp& __val)
> + __from_chars_pow2_base(const char*& __first, const char* __last, _Tp& __val,
> + int __base)
> {
> static_assert(is_integral<_Tp>::value, "implementation bug");
> static_assert(is_unsigned<_Tp>::value, "implementation bug");
>
> + __glibcxx_assert((__base & (__base - 1)) == 0);
> + __glibcxx_assert(_DecOnly ? __base <= 8 : __base <= 32);
> + const int __log2_base = __countr_zero(__base);
> +
> const ptrdiff_t __len = __last - __first;
> ptrdiff_t __i = 0;
> while (__i < __len && __first[__i] == '0')
> ++__i;
> const ptrdiff_t __leading_zeroes = __i;
>
> + unsigned char __leading_c = 0;
> while (__i < __len)
> {
> - const unsigned char __c = (unsigned)__first[__i] - '0';
> - if (__c < 2)
> - __val = (__val << 1) | __c;
> - else
> + const unsigned char __c = __from_chars_alnum_to_val<_DecOnly>(__first[__i]);
> + if (__c >= __base)
> break;
> + __val = (__val << __log2_base) | __c;
> +
> + if (__i == __leading_zeroes)
> + {
> + // At the first iteration, remember the leading significant digit.
> + __glibcxx_assert(__leading_c == 0 && __c != 0);
> + __leading_c = __c;
> + }
> __i++;
> }
> __first += __i;
> - return (__i - __leading_zeroes) <= __gnu_cxx::__int_traits<_Tp>::__digits;
> + auto __significant_bits = (__i - __leading_zeroes) * __log2_base;
> + if (__base != 2 && __leading_c != 0)
> + // Compensate for a leading significant digit that didn't use all
> + // of its available bits.
> + __significant_bits -= __log2_base - __bit_width(__leading_c);
> + __glibcxx_assert(__significant_bits >= 0);
> + return __significant_bits <= __gnu_cxx::__int_traits<_Tp>::__digits;
> }
>
> - /// std::from_chars implementation for integers in bases 3 to 10.
> - template<typename _Tp>
> - bool
> - __from_chars_digit(const char*& __first, const char* __last, _Tp& __val,
> - int __base)
> - {
> - static_assert(is_integral<_Tp>::value, "implementation bug");
> - static_assert(is_unsigned<_Tp>::value, "implementation bug");
> -
> - auto __matches = [__base](char __c) {
> - return '0' <= __c && __c <= ('0' + (__base - 1));
> - };
> -
> - while (__first != __last)
> - {
> - const char __c = *__first;
> - if (__matches(__c))
> - {
> - if (!__raise_and_add(__val, __base, __c - '0'))
> - {
> - while (++__first != __last && __matches(*__first))
> - ;
> - return false;
> - }
> - __first++;
> - }
> - else
> - return true;
> - }
> - return true;
> - }
> -
> - constexpr char
> - __from_chars_alpha_to_num(char __c)
> - {
> - switch (__c)
> - {
> - case 'a':
> - case 'A':
> - return 10;
> - case 'b':
> - case 'B':
> - return 11;
> - case 'c':
> - case 'C':
> - return 12;
> - case 'd':
> - case 'D':
> - return 13;
> - case 'e':
> - case 'E':
> - return 14;
> - case 'f':
> - case 'F':
> - return 15;
> - case 'g':
> - case 'G':
> - return 16;
> - case 'h':
> - case 'H':
> - return 17;
> - case 'i':
> - case 'I':
> - return 18;
> - case 'j':
> - case 'J':
> - return 19;
> - case 'k':
> - case 'K':
> - return 20;
> - case 'l':
> - case 'L':
> - return 21;
> - case 'm':
> - case 'M':
> - return 22;
> - case 'n':
> - case 'N':
> - return 23;
> - case 'o':
> - case 'O':
> - return 24;
> - case 'p':
> - case 'P':
> - return 25;
> - case 'q':
> - case 'Q':
> - return 26;
> - case 'r':
> - case 'R':
> - return 27;
> - case 's':
> - case 'S':
> - return 28;
> - case 't':
> - case 'T':
> - return 29;
> - case 'u':
> - case 'U':
> - return 30;
> - case 'v':
> - case 'V':
> - return 31;
> - case 'w':
> - case 'W':
> - return 32;
> - case 'x':
> - case 'X':
> - return 33;
> - case 'y':
> - case 'Y':
> - return 34;
> - case 'z':
> - case 'Z':
> - return 35;
> - }
> - return 127;
> - }
> -
> - /// std::from_chars implementation for integers in bases 11 to 36.
> - template<typename _Tp>
> + /// std::from_chars implementation for integers in any base.
> + /// If _DecOnly is true, then we may assume __base is at most 10.
> + template<bool _DecOnly, typename _Tp>
> bool
> __from_chars_alnum(const char*& __first, const char* __last, _Tp& __val,
> int __base)
> {
> - bool __valid = true;
> - while (__first != __last)
> + if _GLIBCXX17_CONSTEXPR (_DecOnly)
> + __glibcxx_assert(__base <= 10);
> +
> + const int __bits_per_digit = __bit_width(__base);
> + int __unused_bits_lower_bound = __gnu_cxx::__int_traits<_Tp>::__digits;
> + for (; __first != __last; ++__first)
> {
> - char __c = *__first;
> - if ('0' <= __c && __c <= '9') // isdigit
> - __c -= '0';
> - else
> + const unsigned char __c = __from_chars_alnum_to_val<_DecOnly>(*__first);
> + if (__c >= __base)
> + return true;
> +
> + __unused_bits_lower_bound -= __bits_per_digit;
> + if (__unused_bits_lower_bound >= 0) [[__likely__]]
> + /* We're definitely not going to overflow. */
> + __val = __val * __base + __c;
> + else if (!__raise_and_add(__val, __base, __c)) [[__unlikely__]]
> {
> - __c = __from_chars_alpha_to_num(__c);
> - if (__c >= __base)
> - break;
> + while (++__first != __last
> + && __from_chars_alnum_to_val<_DecOnly>(*__first) < __base)
> + ;
> + return false;
> }
> -
> - if (__builtin_expect(__valid, 1))
> - __valid = __raise_and_add(__val, __base, __c);
> - __first++;
> }
> - return __valid;
> + return true;
> }
>
> template<typename _Tp>
> @@ -611,12 +562,17 @@ namespace __detail
>
> const auto __start = __first;
> bool __valid;
> - if (__base == 2)
> - __valid = __detail::__from_chars_binary(__first, __last, __val);
> + if ((__base & (__base - 1)) == 0)
> + {
> + if (__base <= 8)
> + __valid = __detail::__from_chars_pow2_base<true>(__first, __last, __val, __base);
> + else
> + __valid = __detail::__from_chars_pow2_base<false>(__first, __last, __val, __base);
> + }
> else if (__base <= 10)
> - __valid = __detail::__from_chars_digit(__first, __last, __val, __base);
> + __valid = __detail::__from_chars_alnum<true>(__first, __last, __val, __base);
> else
> - __valid = __detail::__from_chars_alnum(__first, __last, __val, __base);
> + __valid = __detail::__from_chars_alnum<false>(__first, __last, __val, __base);
>
> if (__builtin_expect(__first == __start, 0))
> __res.ec = errc::invalid_argument;
> diff --git a/libstdc++-v3/src/c++17/floating_from_chars.cc b/libstdc++-v3/src/c++17/floating_from_chars.cc
> index 4aa2483bc28..bbe03f7f068 100644
> --- a/libstdc++-v3/src/c++17/floating_from_chars.cc
> +++ b/libstdc++-v3/src/c++17/floating_from_chars.cc
> @@ -451,20 +451,6 @@ namespace
> #endif // USE_STRTOD_FOR_FROM_CHARS
>
> #if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64
> - // If the given ASCII character represents a hexit, return that hexit.
> - // Otherwise return -1.
> - int
> - ascii_to_hexit(char ch)
> - {
> - if (ch >= '0' && ch <= '9')
> - return ch - '0';
> - if (ch >= 'a' && ch <= 'f')
> - return ch - 'a' + 10;
> - if (ch >= 'A' && ch <= 'F')
> - return ch - 'A' + 10;
> - return -1;
> - }
> -
> // Return true iff [FIRST,LAST) begins with PREFIX, ignoring case.
> bool
> starts_with_ci(const char* first, const char* last, string_view prefix)
> @@ -614,8 +600,8 @@ namespace
> continue;
> }
>
> - int hexit = ascii_to_hexit(ch);
> - if (hexit == -1)
> + int hexit = __detail::__from_chars_alnum_to_val<false>(ch);
> + if (hexit >= 16)
> break;
> seen_hexit = true;
>
> --
> 2.36.0.rc2
>
>
next prev parent reply other threads:[~2022-04-14 12:38 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-14 12:31 Patrick Palka
2022-04-14 12:37 ` Kyrylo Tkachov
2022-04-14 12:37 ` Patrick Palka [this message]
2022-04-14 15:12 ` Jonathan Wakely
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=69e19e1b-3c1b-f2a0-6965-9f83eff571d8@idea \
--to=ppalka@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=libstdc++@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).