From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 2BBF63857342 for ; Thu, 14 Apr 2022 15:12:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 2BBF63857342 Received: from mail-yb1-f198.google.com (mail-yb1-f198.google.com [209.85.219.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-621-IuIYxOdROVeqYhVs4Nqu_w-1; Thu, 14 Apr 2022 11:12:26 -0400 X-MC-Unique: IuIYxOdROVeqYhVs4Nqu_w-1 Received: by mail-yb1-f198.google.com with SMTP id e4-20020a056902034400b00633691534d5so4553699ybs.7 for ; Thu, 14 Apr 2022 08:12:26 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=4MJ1mqpEpbJdj75Pu4hhQrWElUQ4y2WQ3Wvb5KDzguY=; b=RjL3xcHprcRoswW0LvlJyD364EkDMTLqz+e66kLYCrB0KYRzuE/H/zZtVlT3Fz7UG2 ZSSSZVGDLdErSvvDxr18U3ctm5N2mY3awa2zP9g/2/eIj/qV1LgfE1CpmxZSOmuwuHmS F2+fjKI8rdPJfDEJ9wWrY0+VqOH9i2x9jAzSn89/zkw83T5NYfZJOXozN4HZT8Vk7JBg 9L0tesLvuGNftGI1grCShSahk3sh4Befk4mA4bfv0jOWxoYlu2Set9u70l8kOuw4Rqgl UhTWJC7bla3K0kVymilQPQprUyyFg4D9Y+2m4X942myqLX6Qo/TIJiHuoniAs1WzVhmr qEUw== X-Gm-Message-State: AOAM530WpkwUIWpd+GqpdJDGPoOmsx0Iv7pH+5e1CASma/MhXxgRS037 7MG1b/EIXmuVDQTdhPVY7UE9FKHmXyLQZ31o0YSgzc4NDIwDg7aSVLZDM9IPyK9CmvbUv/NzChO t/fe1a6S8LJU+/hh3VQs4wCy1M0qvyc0= X-Received: by 2002:a81:7c88:0:b0:2e5:8fa2:29d with SMTP id x130-20020a817c88000000b002e58fa2029dmr2424082ywc.346.1649949146112; Thu, 14 Apr 2022 08:12:26 -0700 (PDT) X-Google-Smtp-Source: ABdhPJz5541K19Xy4HGyToQSFxjoCHvtiIDIMf2Ur0ZkUYBxHEH4EzlJa9Q5BHQXBDc1FXknw2+/dY1c85R1IsqXH+Y= X-Received: by 2002:a81:7c88:0:b0:2e5:8fa2:29d with SMTP id x130-20020a817c88000000b002e58fa2029dmr2424059ywc.346.1649949145835; Thu, 14 Apr 2022 08:12:25 -0700 (PDT) MIME-Version: 1.0 References: <20220414123113.175965-1-ppalka@redhat.com> In-Reply-To: <20220414123113.175965-1-ppalka@redhat.com> From: Jonathan Wakely Date: Thu, 14 Apr 2022 16:12:14 +0100 Message-ID: Subject: Re: [PATCH] libstdc++: Optimize integer std::from_chars To: Patrick Palka Cc: gcc Patches , "libstdc++" X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libstdc++@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libstdc++ mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Apr 2022 15:12:29 -0000 On Thu, 14 Apr 2022 at 13:32, Patrick Palka via Libstdc++ wrote: > > This applies the following optimizations to the integer std::from_chars > implementation: > > 1. Use a lookup table for converting an alphanumeric digit to its > base-36 value instead of using a range test (for 0-9) and switch > (for a-z and A-Z). The table is constructed using a C++14 > constexpr function which doesn't assume a particular character > encoding or __CHAR_BIT__ value. The new conversion function > __from_chars_alnum_to_val is templated on whether we care > only about the decimal digits, in which case we can perform the > conversion with a single subtraction since the digit characters > are guaranteed to be contiguous (unlike the letters). > 2. Generalize __from_chars_binary to handle all power-of-two bases. > This function, now named __from_chars_pow2_base, is also templated > on whether we care only about the decimal digits in order to speed > up digit conversion for base 2, 4 and 8. > 3. In __from_chars_digit, use > static_cast(__c - '0') < __base > instead of > '0' <= __c && __c <= ('0' + (__base - 1)). > as the digit recognition test (exhaustively verified that the two > tests are equivalent). > 4. In __from_chars_alnum, use a nested loop to consume the rest of the > digits in the overflow case (mirroring __from_chars_digit) so that > the main loop doesn't have to maintain the __valid overflow flag. > > At this point, __from_chars_digit is nearly identical to > __from_chars_alnum, so this patch combines the two functions, removing > the former and templatizing the latter according to whether we care only > about the decimal digits. Finally, > > 5. In __from_chars_alnum, keep track of a lower bound on the number of > unused bits in the result and use that to omit the overflow check > when it's safe to do so. > > In passing this replaces the non-portable function ascii_to_hexit > used by __floating_from_chars_hex with the new conversion function. > > Here are some runtime measurements for a simple 15-line benchmark that > roundtrips printing/parsing 200 million integers via std::to/from_chars > (average of 5 runs): > > Base Before After (seconds, lower is better) > 2 9.37 9.37 > 3 12.13 15.79 > 8 3.67 4.15 > 10 3.86 4.90 > 11 5.03 6.84 > 16 2.93 4.14 > 32 2.39 3.85 > 36 3.26 5.22 > > Testedon x86_64-pc-linux-gnu, does this look OK for trunk? Also tested > against libc++'s from_chars tests for good measure. > > libstdc++-v3/ChangeLog: > > * include/std/charconv (__from_chars_alnum_to_val_table): Define. > (__from_chars_alnum_to_val): Define. > (__from_chars_binary): Rename to ... > (__from_chars_pow2_base): ... this. Generalize to handle any > power-of-two base using __from_chars_alnum_to_val. > (__from_chars_digit): Optimize digit recognition to a single > test instead of two tests. Use [[__unlikely___]] attribute. > (__from_chars_alpha_to_num): Remove. > (__from_chars_alnum): Use __from_chars_alnum_to_val. Use a > nested loop for the overflow case. > (from_chars): Adjust appropriately. > * src/c++17/floating_from_chars.cc (ascii_to_hexit): Remove. > (__floating_from_chars_hex): Use __from_chars_alnum_to_val > to recognize a hex digit instead. > --- > libstdc++-v3/include/std/charconv | 250 ++++++++---------- > libstdc++-v3/src/c++17/floating_from_chars.cc | 18 +- > 2 files changed, 105 insertions(+), 163 deletions(-) > > diff --git a/libstdc++-v3/include/std/charconv b/libstdc++-v3/include/std/charconv > index 2ce9c7d4cb9..5e44459749a 100644 > --- a/libstdc++-v3/include/std/charconv > +++ b/libstdc++-v3/include/std/charconv > @@ -407,176 +407,127 @@ namespace __detail > return true; > } > > - /// std::from_chars implementation for integers in base 2. > - template > + // Construct and return a lookup table that maps 0-9, A-Z and a-z to the > + // corresponding corresponding base-36 value and maps all other characters > + // to 127. > + constexpr auto > + __from_chars_alnum_to_val_table() > + { > + constexpr unsigned char __lower_letters[] > + = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', > + 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', > + 'u', 'v', 'w', 'x', 'y', 'z' }; > + constexpr unsigned char __upper_letters[] > + = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', > + 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', > + 'U', 'V', 'W', 'X', 'Y', 'Z' }; > + struct { unsigned char __data[1u << __CHAR_BIT__] = {}; } __table; > + for (auto& __entry : __table.__data) > + __entry = 127; > + for (int __i = 0; __i < 10; ++__i) > + __table.__data['0' + __i] = __i; > + for (int __i = 0; __i < 26; ++__i) > + { > + __table.__data[__lower_letters[__i]] = 10 + __i; > + __table.__data[__upper_letters[__i]] = 10 + __i; > + } > + return __table; > + } > + > + /// If _DecOnly is true: if the character is a decimal digit, then > + /// return its corresponding base-10 value, otherwise return a value >= 127. > + /// If _DecOnly is false: if the character is an alphanumeric digit, then > + /// return its corresponding base-36 value, otherwise return a value >= 127. > + template > + unsigned char > + __from_chars_alnum_to_val(unsigned char __c) > + { > + if _GLIBCXX17_CONSTEXPR (_DecOnly) > + return __c - '0'; > + else > + { > + static constexpr auto __table = __from_chars_alnum_to_val_table(); > + return __table.__data[__c]; > + } > + } > + > + /// std::from_chars implementation for integers in a power-of-two base. > + /// If _DecOnly is true, then we may assume __base is at most 8. > + template > bool > - __from_chars_binary(const char*& __first, const char* __last, _Tp& __val) > + __from_chars_pow2_base(const char*& __first, const char* __last, _Tp& __val, > + int __base) > { > static_assert(is_integral<_Tp>::value, "implementation bug"); > static_assert(is_unsigned<_Tp>::value, "implementation bug"); > > + __glibcxx_assert((__base & (__base - 1)) == 0); > + __glibcxx_assert(_DecOnly ? __base <= 8 : __base <= 32); I think we should remove these assertions. They can only fail if we messed up in libstdc++, so users should not have to pay to assert that the implementation is correct. We should save assertions for things the user might get wrong. If you want, leave them there but commented out. That serves as a reminder of the preconditions, and we can uncomment them while refactoring the code, to ensure we don't break something later. Maybe we should have a separate kind of assertion, something like __glibcxx_impl_assert, which would be enabled by _GLIBCXX_IMPL_ASSERTIONS that we can enable for the testsuite. But that would be something to add in stage 1. Apart from that, this looks great, please push to trunk, thanks!