From: Jonathan Wakely
Date: Mon, 7 Nov 2022 13:37:25 +0000
Subject: Re: [PATCH] libstdc++: Update from latest fast_float [PR107468]
To: Jakub Jelinek
Cc: Patrick Palka, gcc-patches@gcc.gnu.org, libstdc++@gcc.gnu.org

On Mon, 7 Nov 2022 at 08:19, Jakub Jelinek wrote:
>
> Hi!
>
> The following patch updates from fast_float trunk.  That way
> it grabs two of the 4 LOCAL_PATCHES, some smaller tweaks, to_extended
> cleanups and most importantly fix for the incorrect rounding case,
> PR107468 aka https://github.com/fastfloat/fast_float/issues/149
> Using std::fegetround showed in benchmarks too slow, so instead of
> doing that the patch limits the fast path where it uses floating
> point multiplication rather than integral to cases where we can
> prove there will be no rounding (the multiplication will be exact, not
> just that the two multiplication or division operation arguments are
> exactly representable).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK, thanks.

>
> 2022-11-07  Jakub Jelinek
>
>         PR libstdc++/107468
>         * src/c++17/fast_float/MERGE: Adjust for merge from upstream.
>         * src/c++17/fast_float/LOCAL_PATCHES: Remove commits that were
>         upstreamed.
>         * src/c++17/fast_float/README.md: Merge from fast_float
>         662497742fea7055f0e0ee27e5a7ddc382c2c38e commit.
>         * src/c++17/fast_float/fast_float.h: Likewise.
>         * testsuite/20_util/from_chars/pr107468.cc: New test.
>
> --- libstdc++-v3/src/c++17/fast_float/MERGE.jj 2022-01-18 11:59:00.306971713 +0100
> +++ libstdc++-v3/src/c++17/fast_float/MERGE 2022-11-05 18:42:50.815892080 +0100
> @@ -1,4 +1,4 @@
> -d35368cae610b4edeec61cd41e4d2367a4d33f58
> +662497742fea7055f0e0ee27e5a7ddc382c2c38e
>
>  The first line of this file holds the git revision number of the
>  last merge done from the master library sources.
> --- libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES.jj 2022-02-04 14:36:56.965577924 +0100
> +++ libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES 2022-11-05 19:02:57.360336939 +0100
> @@ -1,4 +1,2 @@
>  r12-6647
>  r12-6648
> -r12-6664
> -r12-6665
> --- libstdc++-v3/src/c++17/fast_float/README.md.jj 2022-01-18 11:59:00.306971713 +0100
> +++ libstdc++-v3/src/c++17/fast_float/README.md 2022-11-05 18:32:34.668345927 +0100
> @@ -1,12 +1,5 @@
>  ## fast_float number parsing library: 4x faster than strtod
>
> -![Ubuntu 20.04 CI (GCC 9)](https://github.com/lemire/fast_float/workflows/Ubuntu%2020.04%20CI%20(GCC%209)/badge.svg)
> -![Ubuntu 18.04 CI (GCC 7)](https://github.com/lemire/fast_float/workflows/Ubuntu%2018.04%20CI%20(GCC%207)/badge.svg)
> -![Alpine Linux](https://github.com/lemire/fast_float/workflows/Alpine%20Linux/badge.svg)
> -![MSYS2-CI](https://github.com/lemire/fast_float/workflows/MSYS2-CI/badge.svg)
> -![VS16-CLANG-CI](https://github.com/lemire/fast_float/workflows/VS16-CLANG-CI/badge.svg)
> -[![VS16-CI](https://github.com/fastfloat/fast_float/actions/workflows/vs16-ci.yml/badge.svg)](https://github.com/fastfloat/fast_float/actions/workflows/vs16-ci.yml)
> -
>  The fast_float library provides fast header-only implementations for the C++ from_chars
>  functions for `float` and `double` types. These functions convert ASCII strings representing
>  decimal values (e.g., `1.3e10`) into binary types. We provide exact rounding (including
> @@ -28,8 +21,8 @@ struct from_chars_result {
>  ```
>
>  It parses the character sequence [first,last) for a number. It parses floating-point numbers expecting
> -a locale-independent format equivalent to the C++17 from_chars function.
> -The resulting floating-point value is the closest floating-point values (using either float or double),
> +a locale-independent format equivalent to the C++17 from_chars function.
> +The resulting floating-point value is the closest floating-point values (using either float or double),
>  using the "round to even" convention for values that would otherwise fall right in-between two values.
>  That is, we provide exact parsing according to the IEEE standard.
>
> @@ -47,7 +40,7 @@ Example:
>  ``` C++
>  #include "fast_float/fast_float.h"
>  #include
> -
> +
>  int main() {
>      const std::string input = "3.1416 xyz ";
>      double result;
> @@ -60,15 +53,15 @@ int main() {
>
>
>  Like the C++17 standard, the `fast_float::from_chars` functions take an optional last argument of
> -the type `fast_float::chars_format`. It is a bitset value: we check whether
> +the type `fast_float::chars_format`. It is a bitset value: we check whether
>  `fmt & fast_float::chars_format::fixed` and `fmt & fast_float::chars_format::scientific` are set
>  to determine whether we allow the fixed point and scientific notation respectively.
>  The default is `fast_float::chars_format::general` which allows both `fixed` and `scientific`.
>
> -The library seeks to follow the C++17 (see [20.19.3](http://eel.is/c++draft/charconv.from.chars).(7.1)) specification.
> +The library seeks to follow the C++17 (see [20.19.3](http://eel.is/c++draft/charconv.from.chars).(7.1)) specification.
>  * The `from_chars` function does not skip leading white-space characters.
>  * [A leading `+` sign](https://en.cppreference.com/w/cpp/utility/from_chars) is forbidden.
> -* It is generally impossible to represent a decimal value exactly as binary floating-point number (`float` and `double` types). We seek the nearest value. We round to an even mantissa when we are in-between two binary floating-point numbers.
> +* It is generally impossible to represent a decimal value exactly as binary floating-point number (`float` and `double` types). We seek the nearest value. We round to an even mantissa when we are in-between two binary floating-point numbers.
>
>  Furthermore, we have the following restrictions:
>  * We only support `float` and `double` types at this time.
> @@ -77,22 +70,22 @@ Furthermore, we have the following restr
>
>  We support Visual Studio, macOS, Linux, freeBSD. We support big and little endian. We support 32-bit and 64-bit systems.
>
> -
> +We assume that the rounding mode is set to nearest (`std::fegetround() == FE_TONEAREST`).
>
>  ## Using commas as decimal separator
>
>
>  The C++ standard stipulate that `from_chars` has to be locale-independent. In
> -particular, the decimal separator has to be the period (`.`). However,
> -some users still want to use the `fast_float` library with in a locale-dependent
> +particular, the decimal separator has to be the period (`.`). However,
> +some users still want to use the `fast_float` library with in a locale-dependent
>  manner. Using a separate function called `from_chars_advanced`, we allow the users
> -to pass a `parse_options` instance which contains a custom decimal separator (e.g.,
> +to pass a `parse_options` instance which contains a custom decimal separator (e.g.,
>  the comma). You may use it as follows.
>
>  ```C++
>  #include "fast_float/fast_float.h"
>  #include
> -
> +
>  int main() {
>      const std::string input = "3,1416 xyz ";
>      double result;
> @@ -104,25 +97,55 @@ int main() {
>  }
>  ```
>
> +You can parse delimited numbers:
> +```C++
> +  const std::string input = "234532.3426362,7869234.9823,324562.645";
> +  double result;
> +  auto answer = fast_float::from_chars(input.data(), input.data()+input.size(), result);
> +  if(answer.ec != std::errc()) {
> +    // check error
> +  }
> +  // we have result == 234532.3426362.
> +  if(answer.ptr[0] != ',') {
> +    // unexpected delimiter
> +  }
> +  answer = fast_float::from_chars(answer.ptr + 1, input.data()+input.size(), result);
> +  if(answer.ec != std::errc()) {
> +    // check error
> +  }
> +  // we have result == 7869234.9823.
> +  if(answer.ptr[0] != ',') {
> +    // unexpected delimiter
> +  }
> +  answer = fast_float::from_chars(answer.ptr + 1, input.data()+input.size(), result);
> +  if(answer.ec != std::errc()) {
> +    // check error
> +  }
> +  // we have result == 324562.645.
> +```
>
>  ## Reference
>
> -- Daniel Lemire, [Number Parsing at a Gigabyte per Second](https://arxiv.org/abs/2101.11408), Software: Pratice and Experience 51 (8), 2021.
> +- Daniel Lemire, [Number Parsing at a Gigabyte per Second](https://arxiv.org/abs/2101.11408), Software: Practice and Experience 51 (8), 2021.
>
>  ## Other programming languages
>
>  - [There is an R binding](https://github.com/eddelbuettel/rcppfastfloat) called `rcppfastfloat`.
>  - [There is a Rust port of the fast_float library](https://github.com/aldanor/fast-float-rust/) called `fast-float-rust`.
> -- [There is a Java port of the fast_float library](https://github.com/wrandelshofer/FastDoubleParser) called `FastDoubleParser`.
> +- [There is a Java port of the fast_float library](https://github.com/wrandelshofer/FastDoubleParser) called `FastDoubleParser`. It used for important systems such as [Jackson](https://github.com/FasterXML/jackson-core).
>  - [There is a C# port of the fast_float library](https://github.com/CarlVerret/csFastFloat) called `csFastFloat`.
>
>
>  ## Relation With Other Work
>
> -The fastfloat algorithm is part of the [LLVM standard libraries](https://github.com/llvm/llvm-project/commit/87c016078ad72c46505461e4ff8bfa04819fe7ba).
> +The fast_float library is part of GCC (as of version 12): the `from_chars` function in GCC relies on fast_float.
> +
> +The fastfloat algorithm is part of the [LLVM standard libraries](https://github.com/llvm/llvm-project/commit/87c016078ad72c46505461e4ff8bfa04819fe7ba).
>
>  The fast_float library provides a performance similar to that of the [fast_double_parser](https://github.com/lemire/fast_double_parser) library but using an updated algorithm reworked from the ground up, and while offering an API more in line with the expectations of C++ programmers. The fast_double_parser library is part of the [Microsoft LightGBM machine-learning framework](https://github.com/microsoft/LightGBM).
>
> +There is a [derived implementation part of AdaCore](https://github.com/AdaCore/VSS).
> +
>  ## Users
>
>  The fast_float library is used by [Apache Arrow](https://github.com/apache/arrow/pull/8494) where it multiplied the number parsing speed by two or three times. It is also used by [Yandex ClickHouse](https://github.com/ClickHouse/ClickHouse) and by [Google Jsonnet](https://github.com/google/jsonnet).
> @@ -135,14 +158,14 @@ It can parse random floating-point numbe
>
>
>  ```
> -$ ./build/benchmarks/benchmark
> +$ ./build/benchmarks/benchmark
>  # parsing random integers in the range [0,1)
> -volume = 2.09808 MB
> -netlib            :   271.18 MB/s (+/- 1.2 %)    12.93 Mfloat/s
> -doubleconversion  :   225.35 MB/s (+/- 1.2 %)    10.74 Mfloat/s
> -strtod            :   190.94 MB/s (+/- 1.6 %)     9.10 Mfloat/s
> -abseil            :   430.45 MB/s (+/- 2.2 %)    20.52 Mfloat/s
> -fastfloat         :  1042.38 MB/s (+/- 9.9 %)    49.68 Mfloat/s
> +volume = 2.09808 MB
> +netlib            :   271.18 MB/s (+/- 1.2 %)    12.93 Mfloat/s
> +doubleconversion  :   225.35 MB/s (+/- 1.2 %)    10.74 Mfloat/s
> +strtod            :   190.94 MB/s (+/- 1.6 %)     9.10 Mfloat/s
> +abseil            :   430.45 MB/s (+/- 2.2 %)    20.52 Mfloat/s
> +fastfloat         :  1042.38 MB/s (+/- 9.9 %)    49.68 Mfloat/s
>  ```
>
>  See https://github.com/lemire/simple_fastfloat_benchmark for our benchmarking code.
> @@ -183,23 +206,23 @@ You should change the `GIT_TAG` line so
>
>  ## Using as single header
>
> -The script `script/amalgamate.py` may be used to generate a single header
> +The script `script/amalgamate.py` may be used to generate a single header
>  version of the library if so desired.
> -Just run the script from the root directory of this repository.
> +Just run the script from the root directory of this repository.
>  You can customize the license type and output file if desired as described in
>  the command line help.
>
>  You may directly download automatically generated single-header files:
>
> -https://github.com/fastfloat/fast_float/releases/download/v1.1.2/fast_float.h
> +https://github.com/fastfloat/fast_float/releases/download/v3.4.0/fast_float.h
>
>  ## Credit
>
> -Though this work is inspired by many different people, this work benefited especially from exchanges with
> -Michael Eisel, who motivated the original research with his key insights, and with Nigel Tao who provided
> +Though this work is inspired by many different people, this work benefited especially from exchanges with
> +Michael Eisel, who motivated the original research with his key insights, and with Nigel Tao who provided
>  invaluable feedback. Rémy Oudompheng first implemented a fast path we use in the case of long digits.
>
> -The library includes code adapted from Google Wuffs (written by Nigel Tao) which was originally published
> +The library includes code adapted from Google Wuffs (written by Nigel Tao) which was originally published
>  under the Apache 2.0 license.
>
>  ## License
> --- libstdc++-v3/src/c++17/fast_float/fast_float.h.jj 2022-02-04 14:36:56.966577910 +0100
> +++ libstdc++-v3/src/c++17/fast_float/fast_float.h 2022-11-05 18:54:48.096049177 +0100
> @@ -74,7 +74,7 @@ struct parse_options {
>   * Like the C++17 standard, the `fast_float::from_chars` functions take an optional last argument of
>   * the type `fast_float::chars_format`. It is a bitset value: we check whether
>   * `fmt & fast_float::chars_format::fixed` and `fmt & fast_float::chars_format::scientific` are set
> - * to determine whether we allowe the fixed point and scientific notation respectively.
> + * to determine whether we allow the fixed point and scientific notation respectively.
>   * The default is `fast_float::chars_format::general` which allows both `fixed` and `scientific`.
>   */
>  template
> @@ -98,12 +98,11 @@ from_chars_result from_chars_advanced(co
>         || defined(__amd64) || defined(__aarch64__) || defined(_M_ARM64) \
>         || defined(__MINGW64__) \
>         || defined(__s390x__) \
> -       || (defined(__ppc64__) || defined(__PPC64__) || defined(__ppc64le__) || defined(__PPC64LE__)) \
> -       || defined(__EMSCRIPTEN__))
> +       || (defined(__ppc64__) || defined(__PPC64__) || defined(__ppc64le__) || defined(__PPC64LE__)) )
>  #define FASTFLOAT_64BIT
>  #elif (defined(__i386) || defined(__i386__) || defined(_M_IX86) \
>       || defined(__arm__) || defined(_M_ARM) \
> -     || defined(__MINGW32__))
> +     || defined(__MINGW32__) || defined(__EMSCRIPTEN__))
>  #define FASTFLOAT_32BIT
>  #else
>    // Need to check incrementally, since SIZE_MAX is a size_t, avoid overflow.
> @@ -128,7 +127,7 @@ from_chars_result from_chars_advanced(co
>  #define FASTFLOAT_VISUAL_STUDIO 1
>  #endif
>
> -#ifdef __BYTE_ORDER__
> +#if defined __BYTE_ORDER__ && defined __ORDER_BIG_ENDIAN__
>  #define FASTFLOAT_IS_BIG_ENDIAN (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
>  #elif defined _WIN32
>  #define FASTFLOAT_IS_BIG_ENDIAN 0
> @@ -271,8 +270,9 @@ fastfloat_really_inline uint64_t _umul12
>  fastfloat_really_inline value128 full_multiplication(uint64_t a,
>                                                       uint64_t b) {
>    value128 answer;
> -#ifdef _M_ARM64
> +#if defined(_M_ARM64) && !defined(__MINGW32__)
>    // ARM64 has native support for 64-bit multiplications, no need to emulate
> +  // But MinGW on ARM64 doesn't have native support for 64-bit multiplications
>    answer.high = __umulh(a, b);
>    answer.low = a * b;
>  #elif defined(FASTFLOAT_32BIT) || (defined(_WIN64) && !defined(__clang__))
> @@ -307,21 +307,69 @@ constexpr static double powers_of_ten_do
>      1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22};
>  constexpr static float powers_of_ten_float[] = {1e0, 1e1, 1e2, 1e3, 1e4, 1e5,
>                                                  1e6, 1e7, 1e8, 1e9, 1e10};
> +// used for max_mantissa_double and max_mantissa_float
> +constexpr uint64_t constant_55555 = 5 * 5 * 5 * 5 * 5;
> +// Largest integer value v so that (5**index * v) <= 1<<53.
> +// 0x10000000000000 == 1 << 53
> +constexpr static uint64_t max_mantissa_double[] = {
> +      0x10000000000000,
> +      0x10000000000000 / 5,
> +      0x10000000000000 / (5 * 5),
> +      0x10000000000000 / (5 * 5 * 5),
> +      0x10000000000000 / (5 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555),
> +      0x10000000000000 / (constant_55555 * 5),
> +      0x10000000000000 / (constant_55555 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * 5 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555),
> +      0x10000000000000 / (constant_55555 * constant_55555 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5 * 5)};
> +  // Largest integer value v so that (5**index * v) <= 1<<24.
> +  // 0x1000000 == 1<<24
> +  constexpr static uint64_t max_mantissa_float[] = {
> +        0x1000000,
> +        0x1000000 / 5,
> +        0x1000000 / (5 * 5),
> +        0x1000000 / (5 * 5 * 5),
> +        0x1000000 / (5 * 5 * 5 * 5),
> +        0x1000000 / (constant_55555),
> +        0x1000000 / (constant_55555 * 5),
> +        0x1000000 / (constant_55555 * 5 * 5),
> +        0x1000000 / (constant_55555 * 5 * 5 * 5),
> +        0x1000000 / (constant_55555 * 5 * 5 * 5 * 5),
> +        0x1000000 / (constant_55555 * constant_55555),
> +        0x1000000 / (constant_55555 * constant_55555 * 5)};
>
>  template struct binary_format {
> +  using equiv_uint = typename std::conditional::type;
> +
>    static inline constexpr int mantissa_explicit_bits();
>    static inline constexpr int minimum_exponent();
>    static inline constexpr int infinite_power();
>    static inline constexpr int sign_index();
> -  static inline constexpr int min_exponent_fast_path();
>    static inline constexpr int max_exponent_fast_path();
>    static inline constexpr int max_exponent_round_to_even();
>    static inline constexpr int min_exponent_round_to_even();
> -  static inline constexpr uint64_t max_mantissa_fast_path();
> +  static inline constexpr uint64_t max_mantissa_fast_path(int64_t power);
>    static inline constexpr int largest_power_of_ten();
>    static inline constexpr int smallest_power_of_ten();
>    static inline constexpr T exact_power_of_ten(int64_t power);
>    static inline constexpr size_t max_digits();
> +  static inline constexpr equiv_uint exponent_mask();
> +  static inline constexpr equiv_uint mantissa_mask();
> +  static inline constexpr equiv_uint hidden_bit_mask();
>  };
>
>  template <> inline constexpr int binary_format::mantissa_explicit_bits() {
> @@ -364,21 +412,6 @@ template <> inline constexpr int binary_
>  template <> inline constexpr int binary_format::sign_index() { return 63; }
>  template <> inline constexpr int binary_format::sign_index() { return 31; }
>
> -template <> inline constexpr int binary_format::min_exponent_fast_path() {
> -#if (FLT_EVAL_METHOD != 1) && (FLT_EVAL_METHOD != 0)
> -  return 0;
> -#else
> -  return -22;
> -#endif
> -}
> -template <> inline constexpr int binary_format::min_exponent_fast_path() {
> -#if (FLT_EVAL_METHOD != 1) && (FLT_EVAL_METHOD != 0)
> -  return 0;
> -#else
> -  return -10;
> -#endif
> -}
> -
>  template <> inline constexpr int binary_format::max_exponent_fast_path() {
>    return 22;
>  }
> @@ -386,11 +419,17 @@ template <> inline constexpr int binary_
>    return 10;
>  }
>
> -template <> inline constexpr uint64_t binary_format::max_mantissa_fast_path() {
> -  return uint64_t(2) << mantissa_explicit_bits();
> +template <> inline constexpr uint64_t binary_format::max_mantissa_fast_path(int64_t power) {
> +  // caller is responsible to ensure that
> +  // power >= 0 && power <= 22
> +  //
> +  return max_mantissa_double[power];
>  }
> -template <> inline constexpr uint64_t binary_format::max_mantissa_fast_path() {
> -  return uint64_t(2) << mantissa_explicit_bits();
> +template <> inline constexpr uint64_t binary_format::max_mantissa_fast_path(int64_t power) {
> +  // caller is responsible to ensure that
> +  // power >= 0 && power <= 10
> +  //
> +  return max_mantissa_float[power];
>  }
>
>  template <>
> @@ -429,6 +468,33 @@ template <> inline constexpr size_t bina
>    return 114;
>  }
>
> +template <> inline constexpr binary_format::equiv_uint
> +    binary_format::exponent_mask() {
> +  return 0x7F800000;
> +}
> +template <> inline constexpr binary_format::equiv_uint
> +    binary_format::exponent_mask() {
> +  return 0x7FF0000000000000;
> +}
> +
> +template <> inline constexpr binary_format::equiv_uint
> +    binary_format::mantissa_mask() {
> +  return 0x007FFFFF;
> +}
> +template <> inline constexpr binary_format::equiv_uint
> +    binary_format::mantissa_mask() {
> +  return 0x000FFFFFFFFFFFFF;
> +}
> +
> +template <> inline constexpr binary_format::equiv_uint
> +    binary_format::hidden_bit_mask() {
> +  return 0x00800000;
> +}
> +template <> inline constexpr binary_format::equiv_uint
> +    binary_format::hidden_bit_mask() {
> +  return 0x0010000000000000;
> +}
> +
>  template
>  fastfloat_really_inline void to_float(bool negative, adjusted_mantissa am, T &value) {
>    uint64_t word = am.mantissa;
> @@ -2410,40 +2476,24 @@ fastfloat_really_inline int32_t scientif
>  // this converts a native floating-point number to an extended-precision float.
>  template
>  fastfloat_really_inline adjusted_mantissa to_extended(T value) noexcept {
> +  using equiv_uint = typename binary_format::equiv_uint;
> +  constexpr equiv_uint exponent_mask = binary_format::exponent_mask();
> +  constexpr equiv_uint mantissa_mask = binary_format::mantissa_mask();
> +  constexpr equiv_uint hidden_bit_mask = binary_format::hidden_bit_mask();
> +
>    adjusted_mantissa am;
>    int32_t bias = binary_format::mantissa_explicit_bits() - binary_format::minimum_exponent();
> -  if (std::is_same::value) {
> -    constexpr uint32_t exponent_mask = 0x7F800000;
> -    constexpr uint32_t mantissa_mask = 0x007FFFFF;
> -    constexpr uint64_t hidden_bit_mask = 0x00800000;
> -    uint32_t bits;
> -    ::memcpy(&bits, &value, sizeof(T));
> -    if ((bits & exponent_mask) == 0) {
> -      // denormal
> -      am.power2 = 1 - bias;
> -      am.mantissa = bits & mantissa_mask;
> -    } else {
> -      // normal
> -      am.power2 = int32_t((bits & exponent_mask) >> binary_format::mantissa_explicit_bits());
> -      am.power2 -= bias;
> -      am.mantissa = (bits & mantissa_mask) | hidden_bit_mask;
> -    }
> +  equiv_uint bits;
> +  ::memcpy(&bits, &value, sizeof(T));
> +  if ((bits & exponent_mask) == 0) {
> +    // denormal
> +    am.power2 = 1 - bias;
> +    am.mantissa = bits & mantissa_mask;
>    } else {
> -    constexpr uint64_t exponent_mask = 0x7FF0000000000000;
> -    constexpr uint64_t mantissa_mask = 0x000FFFFFFFFFFFFF;
> -    constexpr uint64_t hidden_bit_mask = 0x0010000000000000;
> -    uint64_t bits;
> -    ::memcpy(&bits, &value, sizeof(T));
> -    if ((bits & exponent_mask) == 0) {
> -      // denormal
> -      am.power2 = 1 - bias;
> -      am.mantissa = bits & mantissa_mask;
> -    } else {
> -      // normal
> -      am.power2 = int32_t((bits & exponent_mask) >> binary_format::mantissa_explicit_bits());
> -      am.power2 -= bias;
> -      am.mantissa = (bits & mantissa_mask) | hidden_bit_mask;
> -    }
> +    // normal
> +    am.power2 = int32_t((bits & exponent_mask) >> binary_format::mantissa_explicit_bits());
> +    am.power2 -= bias;
> +    am.mantissa = (bits & mantissa_mask) | hidden_bit_mask;
>    }
>
>    return am;
> @@ -2869,11 +2919,10 @@ from_chars_result from_chars_advanced(co
>    }
>    answer.ec = std::errc(); // be optimistic
>    answer.ptr = pns.lastmatch;
> -  // Next is Clinger's fast path.
> -  if (binary_format::min_exponent_fast_path() <= pns.exponent && pns.exponent <= binary_format::max_exponent_fast_path() && pns.mantissa <=binary_format::max_mantissa_fast_path() && !pns.too_many_digits) {
> +  // Next is a modified Clinger's fast path, inspired by Jakub Jelínek's proposal
> +  if (pns.exponent >= 0 && pns.exponent <= binary_format::max_exponent_fast_path() && pns.mantissa <=binary_format::max_mantissa_fast_path(pns.exponent) && !pns.too_many_digits) {
>      value = T(pns.mantissa);
> -    if (pns.exponent < 0) { value = value / binary_format::exact_power_of_ten(-pns.exponent); }
> -    else { value = value * binary_format::exact_power_of_ten(pns.exponent); }
> +    value = value * binary_format::exact_power_of_ten(pns.exponent);
>      if (pns.negative) { value = -value; }
>      return answer;
>    }
> --- libstdc++-v3/testsuite/20_util/from_chars/pr107468.cc.jj 2022-11-05 19:20:28.944898668 +0100
> +++ libstdc++-v3/testsuite/20_util/from_chars/pr107468.cc 2022-11-05 19:26:44.318740322 +0100
> @@ -0,0 +1,42 @@
> +// Copyright (C) 2022 Free Software Foundation, Inc.
> +//
> +// This file is part of the GNU ISO C++ Library.  This library is free
> +// software; you can redistribute it and/or modify it under the
> +// terms of the GNU General Public License as published by the
> +// Free Software Foundation; either version 3, or (at your option)
> +// any later version.
> +
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +// GNU General Public License for more details.
> +
> +// You should have received a copy of the GNU General Public License along
> +// with this library; see the file COPYING3.  If not see
> +// .
> +
> +// { dg-do run { target c++17 } }
> +// { dg-add-options ieee }
> +
> +#include
> +#include
> +#include
> +#include
> +
> +int
> +main()
> +{
> +  // FP from_char not available otherwise.
> +#if __cpp_lib_to_chars >= 201611L \
> +    && _GLIBCXX_USE_C99_FENV_TR1 \
> +    && defined(FE_DOWNWARD) \
> +    && defined(_GLIBCXX_FLOAT_IS_IEEE_BINARY32)
> +  // PR libstdc++/107468
> +  float f;
> +  char buf[] = "3.355447e+07";
> +  std::fesetround(FE_DOWNWARD);
> +  auto [ptr, ec] = std::from_chars(buf, buf + sizeof(buf) - 1, f, std::chars_format::scientific);
> +  VERIFY( ec == std::errc() && ptr == buf + sizeof(buf) - 1 );
> +  VERIFY( f == 33554472.0f );
> +#endif
> +}
>
>         Jakub
>
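
For readers of the archive, the restricted fast path described in the patch summary can be sketched roughly as below. This is only an illustration, not code from the patch: the helper names `max_exact_mantissa_float`, `old_fast_path_float` and `new_fast_path_float` are made up here, the old check is simplified to the `FLT_EVAL_METHOD` 0/1 case, and the limit is computed by repeated division instead of being read from the patch's `max_mantissa_float` table. The underlying idea is that `w * 10^q` needs no rounding in binary32 when `w * 5^q <= 2^24`, because the remaining factor `2^q` only changes the exponent.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: largest w with w * 5^q <= 2^24, i.e. floor(2^24 / 5^q).
// Repeated floor division by 5 gives the same result as one division by 5^q.
constexpr std::uint64_t max_exact_mantissa_float(int q) {
  std::uint64_t limit = std::uint64_t(1) << 24;
  for (int i = 0; i < q; ++i) limit /= 5;
  return limit;
}

// Simplified form of the old check: both operands are exactly representable,
// but their product or quotient may still have to be rounded.
constexpr bool old_fast_path_float(std::uint64_t w, int q) {
  return -10 <= q && q <= 10 && w <= (std::uint64_t(2) << 23);
}

// Restricted check: the product w * 10^q is itself exactly representable,
// so the current rounding mode cannot affect the result.
constexpr bool new_fast_path_float(std::uint64_t w, int q) {
  return 0 <= q && q <= 10 && w <= max_exact_mantissa_float(q);
}

// "3.355447e+07" (the input in the new test) parses to w = 3355447, q = 1.
static_assert(max_exact_mantissa_float(1) == 3355443, "floor(2^24 / 5)");
static_assert(old_fast_path_float(3355447, 1), "old check takes the fast path");
static_assert(!new_fast_path_float(3355447, 1), "restricted check rejects it");
```

With the old check this input is multiplied in floating point, and the exact product 33554470 lies halfway between the neighbouring floats 33554468 and 33554472, so the result depends on the rounding mode. The restricted check sends it to the slower path, which is expected to produce 33554472 as the new test verifies.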