[PATCH] libstdc++: Update from latest fast_float [PR107468]

public inbox for gcc-patches@gcc.gnu.org

* [PATCH] libstdc++: Update from latest fast_float [PR107468]
@ 2022-11-07  8:19 Jakub Jelinek
  2022-11-07 13:37 ` Jonathan Wakely
  0 siblings, 1 reply; 2+ messages in thread
From: Jakub Jelinek @ 2022-11-07  8:19 UTC
  To: Jonathan Wakely, Patrick Palka; +Cc: gcc-patches, libstdc++

Hi!

The following patch updates from fast_float trunk.  That way
it picks up two of the 4 LOCAL_PATCHES, some smaller tweaks, to_extended
cleanups and, most importantly, the fix for the incorrect rounding case,
PR107468 aka https://github.com/fastfloat/fast_float/issues/149
Using std::fegetround turned out to be too slow in benchmarks, so instead
of doing that the patch limits the fast path, which uses floating
point multiplication rather than integral arithmetic, to cases where we
can prove there will be no rounding (the multiplication itself will be
exact, not just that the two multiplication or division operation
arguments are exactly representable).
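
As a rough illustration of that exactness condition (a sketch only, not
the code in the patch): for float, w * 10^p equals (w * 5^p) * 2^p, so
if w * 5^p fits in 2^24 the product is exactly representable and no
rounding mode can change it.  The helper names max_exact_mantissa and
float_fast_path_ok below are hypothetical; the bounds 2^24 and 10 are
taken from the float tables and max_exponent_fast_path in the patch.

```cpp
#include <cstdint>

// Hypothetical helper, not part of fast_float: largest mantissa w such
// that w * 5^power <= limit (limit is 2^24 for float in the patch).
constexpr std::uint64_t max_exact_mantissa(std::uint64_t limit, int power) {
  std::uint64_t five_pow = 1;
  for (int i = 0; i < power; ++i) five_pow *= 5;
  return limit / five_pow;
}

// Hypothetical predicate with the same shape as the new fast-path test
// for float: non-negative exponent, at most 10, and a mantissa small
// enough that mantissa * 10^exponent needs no rounding.
constexpr bool float_fast_path_ok(std::uint64_t mantissa, std::int64_t exponent) {
  return exponent >= 0 && exponent <= 10 &&
         mantissa <= max_exact_mantissa(std::uint64_t(1) << 24, int(exponent));
}

// "3.355447e+07" parses as mantissa 3355447 with decimal exponent 1.
// 2^24 / 5 == 3355443, so this input no longer takes the fast path.
static_assert(max_exact_mantissa(std::uint64_t(1) << 24, 1) == 3355443, "");
static_assert(!float_fast_path_ok(3355447, 1), "");
static_assert(float_fast_path_ok(3355443, 1), "");
```

The input from the new testcase is rejected by this check because
3355447 * 10 = 33554470 is not representable as a float (the spacing
near 2^25 is 4), which is exactly the case where the result of the
multiplication depended on the current rounding mode.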

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-11-07  Jakub Jelinek  <jakub@redhat.com>

	PR libstdc++/107468
	* src/c++17/fast_float/MERGE: Adjust for merge from upstream.
	* src/c++17/fast_float/LOCAL_PATCHES: Remove commits that were
	upstreamed.
	* src/c++17/fast_float/README.md: Merge from fast_float
	662497742fea7055f0e0ee27e5a7ddc382c2c38e commit.
	* src/c++17/fast_float/fast_float.h: Likewise.
	* testsuite/20_util/from_chars/pr107468.cc: New test.

--- libstdc++-v3/src/c++17/fast_float/MERGE.jj	2022-01-18 11:59:00.306971713 +0100
+++ libstdc++-v3/src/c++17/fast_float/MERGE	2022-11-05 18:42:50.815892080 +0100
@@ -1,4 +1,4 @@
-d35368cae610b4edeec61cd41e4d2367a4d33f58
+662497742fea7055f0e0ee27e5a7ddc382c2c38e
 
 The first line of this file holds the git revision number of the
 last merge done from the master library sources.
--- libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES.jj	2022-02-04 14:36:56.965577924 +0100
+++ libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES	2022-11-05 19:02:57.360336939 +0100
@@ -1,4 +1,2 @@
 r12-6647
 r12-6648
-r12-6664
-r12-6665
--- libstdc++-v3/src/c++17/fast_float/README.md.jj	2022-01-18 11:59:00.306971713 +0100
+++ libstdc++-v3/src/c++17/fast_float/README.md	2022-11-05 18:32:34.668345927 +0100
@@ -1,12 +1,5 @@
 ## fast_float number parsing library: 4x faster than strtod
 
-![Ubuntu 20.04 CI (GCC 9)](https://github.com/lemire/fast_float/workflows/Ubuntu%2020.04%20CI%20(GCC%209)/badge.svg)
-![Ubuntu 18.04 CI (GCC 7)](https://github.com/lemire/fast_float/workflows/Ubuntu%2018.04%20CI%20(GCC%207)/badge.svg)
-![Alpine Linux](https://github.com/lemire/fast_float/workflows/Alpine%20Linux/badge.svg)
-![MSYS2-CI](https://github.com/lemire/fast_float/workflows/MSYS2-CI/badge.svg)
-![VS16-CLANG-CI](https://github.com/lemire/fast_float/workflows/VS16-CLANG-CI/badge.svg)
-[![VS16-CI](https://github.com/fastfloat/fast_float/actions/workflows/vs16-ci.yml/badge.svg)](https://github.com/fastfloat/fast_float/actions/workflows/vs16-ci.yml)
-
 The fast_float library provides fast header-only implementations for the C++ from_chars
 functions for `float` and `double` types.  These functions convert ASCII strings representing
 decimal values (e.g., `1.3e10`) into binary types. We provide exact rounding (including
@@ -28,8 +21,8 @@ struct from_chars_result {
 ```
 
 It parses the character sequence [first,last) for a number. It parses floating-point numbers expecting
-a locale-independent format equivalent to the C++17 from_chars function. 
-The resulting floating-point value is the closest floating-point values (using either float or double), 
+a locale-independent format equivalent to the C++17 from_chars function.
+The resulting floating-point value is the closest floating-point values (using either float or double),
 using the "round to even" convention for values that would otherwise fall right in-between two values.
 That is, we provide exact parsing according to the IEEE standard.
 
@@ -47,7 +40,7 @@ Example:
 ``` C++
 #include "fast_float/fast_float.h"
 #include <iostream>
- 
+
 int main() {
     const std::string input =  "3.1416 xyz ";
     double result;
@@ -60,15 +53,15 @@ int main() {
 
 
 Like the C++17 standard, the `fast_float::from_chars` functions take an optional last argument of
-the type `fast_float::chars_format`. It is a bitset value: we check whether 
+the type `fast_float::chars_format`. It is a bitset value: we check whether
 `fmt & fast_float::chars_format::fixed` and `fmt & fast_float::chars_format::scientific` are set
 to determine whether we allow the fixed point and scientific notation respectively.
 The default is  `fast_float::chars_format::general` which allows both `fixed` and `scientific`.
 
-The library seeks to follow the C++17 (see [20.19.3](http://eel.is/c++draft/charconv.from.chars).(7.1))  specification. 
+The library seeks to follow the C++17 (see [20.19.3](http://eel.is/c++draft/charconv.from.chars).(7.1))  specification.
 * The `from_chars` function does not skip leading white-space characters.
 * [A leading `+` sign](https://en.cppreference.com/w/cpp/utility/from_chars) is forbidden.
-* It is generally impossible to represent a decimal value exactly as binary floating-point number (`float` and `double` types). We seek the nearest value. We round to an even mantissa when we are in-between two binary floating-point numbers. 
+* It is generally impossible to represent a decimal value exactly as binary floating-point number (`float` and `double` types). We seek the nearest value. We round to an even mantissa when we are in-between two binary floating-point numbers.
 
 Furthermore, we have the following restrictions:
 * We only support `float` and `double` types at this time.
@@ -77,22 +70,22 @@ Furthermore, we have the following restr
 
 We support Visual Studio, macOS, Linux, freeBSD. We support big and little endian. We support 32-bit and 64-bit systems.
 
-
+We assume that the rounding mode is set to nearest (`std::fegetround() == FE_TONEAREST`).
 
 ## Using commas as decimal separator
 
 
 The C++ standard stipulate that `from_chars` has to be locale-independent. In
-particular, the decimal separator has to be the period (`.`). However, 
-some users still want to use the `fast_float` library with in a locale-dependent 
+particular, the decimal separator has to be the period (`.`). However,
+some users still want to use the `fast_float` library with in a locale-dependent
 manner. Using a separate function called `from_chars_advanced`, we allow the users
-to pass a `parse_options` instance which contains a custom decimal separator (e.g., 
+to pass a `parse_options` instance which contains a custom decimal separator (e.g.,
 the comma). You may use it as follows.
 
 ```C++
 #include "fast_float/fast_float.h"
 #include <iostream>
- 
+
 int main() {
     const std::string input =  "3,1416 xyz ";
     double result;
@@ -104,25 +97,55 @@ int main() {
 }
 ```
 
+You can parse delimited numbers:
+```C++
+  const std::string input =   "234532.3426362,7869234.9823,324562.645";
+  double result;
+  auto answer = fast_float::from_chars(input.data(), input.data()+input.size(), result);
+  if(answer.ec != std::errc()) {
+    // check error
+  }
+  // we have result == 234532.3426362.
+  if(answer.ptr[0] != ',') {
+    // unexpected delimiter
+  }
+  answer = fast_float::from_chars(answer.ptr + 1, input.data()+input.size(), result);
+  if(answer.ec != std::errc()) {
+    // check error
+  }
+  // we have result == 7869234.9823.
+  if(answer.ptr[0] != ',') {
+    // unexpected delimiter
+  }
+  answer = fast_float::from_chars(answer.ptr + 1, input.data()+input.size(), result);
+  if(answer.ec != std::errc()) {
+    // check error
+  }
+  // we have result == 324562.645.
+```
 
 ## Reference
 
-- Daniel Lemire, [Number Parsing at a Gigabyte per Second](https://arxiv.org/abs/2101.11408), Software: Pratice and Experience 51 (8), 2021.
+- Daniel Lemire, [Number Parsing at a Gigabyte per Second](https://arxiv.org/abs/2101.11408), Software: Practice and Experience 51 (8), 2021.
 
 ## Other programming languages
 
 - [There is an R binding](https://github.com/eddelbuettel/rcppfastfloat) called `rcppfastfloat`.
 - [There is a Rust port of the fast_float library](https://github.com/aldanor/fast-float-rust/) called `fast-float-rust`.
-- [There is a Java port of the fast_float library](https://github.com/wrandelshofer/FastDoubleParser) called `FastDoubleParser`.
+- [There is a Java port of the fast_float library](https://github.com/wrandelshofer/FastDoubleParser) called `FastDoubleParser`. It used for important systems such as [Jackson](https://github.com/FasterXML/jackson-core).
 - [There is a C# port of the fast_float library](https://github.com/CarlVerret/csFastFloat) called `csFastFloat`.
 
 
 ## Relation With Other Work
 
-The fastfloat algorithm is part of the [LLVM standard libraries](https://github.com/llvm/llvm-project/commit/87c016078ad72c46505461e4ff8bfa04819fe7ba). 
+The fast_float library is part of GCC (as of version 12): the `from_chars` function in GCC relies on fast_float.
+
+The fastfloat algorithm is part of the [LLVM standard libraries](https://github.com/llvm/llvm-project/commit/87c016078ad72c46505461e4ff8bfa04819fe7ba).
 
 The fast_float library provides a performance similar to that of the [fast_double_parser](https://github.com/lemire/fast_double_parser) library but using an updated algorithm reworked from the ground up, and while offering an API more in line with the expectations of C++ programmers. The fast_double_parser library is part of the [Microsoft LightGBM machine-learning framework](https://github.com/microsoft/LightGBM).
 
+There is a [derived implementation part of AdaCore](https://github.com/AdaCore/VSS).
+
 ## Users
 
 The fast_float library is used by [Apache Arrow](https://github.com/apache/arrow/pull/8494) where it multiplied the number parsing speed by two or three times. It is also used by [Yandex ClickHouse](https://github.com/ClickHouse/ClickHouse) and by [Google Jsonnet](https://github.com/google/jsonnet).
@@ -135,14 +158,14 @@ It can parse random floating-point numbe
 <img src="http://lemire.me/blog/wp-content/uploads/2020/11/fastfloat_speed.png" width="400">
 
 ```
-$ ./build/benchmarks/benchmark 
+$ ./build/benchmarks/benchmark
 # parsing random integers in the range [0,1)
-volume = 2.09808 MB 
-netlib                                  :   271.18 MB/s (+/- 1.2 %)    12.93 Mfloat/s  
-doubleconversion                        :   225.35 MB/s (+/- 1.2 %)    10.74 Mfloat/s  
-strtod                                  :   190.94 MB/s (+/- 1.6 %)     9.10 Mfloat/s  
-abseil                                  :   430.45 MB/s (+/- 2.2 %)    20.52 Mfloat/s  
-fastfloat                               :  1042.38 MB/s (+/- 9.9 %)    49.68 Mfloat/s  
+volume = 2.09808 MB
+netlib                                  :   271.18 MB/s (+/- 1.2 %)    12.93 Mfloat/s 
+doubleconversion                        :   225.35 MB/s (+/- 1.2 %)    10.74 Mfloat/s 
+strtod                                  :   190.94 MB/s (+/- 1.6 %)     9.10 Mfloat/s 
+abseil                                  :   430.45 MB/s (+/- 2.2 %)    20.52 Mfloat/s 
+fastfloat                               :  1042.38 MB/s (+/- 9.9 %)    49.68 Mfloat/s 
 ```
 
 See https://github.com/lemire/simple_fastfloat_benchmark for our benchmarking code.
@@ -183,23 +206,23 @@ You should change the `GIT_TAG` line so
 
 ## Using as single header
 
-The script `script/amalgamate.py` may be used to generate a single header 
+The script `script/amalgamate.py` may be used to generate a single header
 version of the library if so desired.
-Just run the script from the root directory of this repository. 
+Just run the script from the root directory of this repository.
 You can customize the license type and output file if desired as described in
 the command line help.
 
 You may directly download automatically generated single-header files:
 
-https://github.com/fastfloat/fast_float/releases/download/v1.1.2/fast_float.h
+https://github.com/fastfloat/fast_float/releases/download/v3.4.0/fast_float.h
 
 ## Credit
 
-Though this work is inspired by many different people, this work benefited especially from exchanges with 
-Michael Eisel, who motivated the original research with his key insights, and with Nigel Tao who provided 
+Though this work is inspired by many different people, this work benefited especially from exchanges with
+Michael Eisel, who motivated the original research with his key insights, and with Nigel Tao who provided
 invaluable feedback. Rémy Oudompheng first implemented a fast path we use in the case of long digits.
 
-The library includes code adapted from Google Wuffs (written by Nigel Tao) which was originally published 
+The library includes code adapted from Google Wuffs (written by Nigel Tao) which was originally published
 under the Apache 2.0 license.
 
 ## License
--- libstdc++-v3/src/c++17/fast_float/fast_float.h.jj	2022-02-04 14:36:56.966577910 +0100
+++ libstdc++-v3/src/c++17/fast_float/fast_float.h	2022-11-05 18:54:48.096049177 +0100
@@ -74,7 +74,7 @@ struct parse_options {
  * Like the C++17 standard, the `fast_float::from_chars` functions take an optional last argument of
  * the type `fast_float::chars_format`. It is a bitset value: we check whether
  * `fmt & fast_float::chars_format::fixed` and `fmt & fast_float::chars_format::scientific` are set
- * to determine whether we allowe the fixed point and scientific notation respectively.
+ * to determine whether we allow the fixed point and scientific notation respectively.
  * The default is  `fast_float::chars_format::general` which allows both `fixed` and `scientific`.
  */
 template<typename T>
@@ -98,12 +98,11 @@ from_chars_result from_chars_advanced(co
        || defined(__amd64) || defined(__aarch64__) || defined(_M_ARM64) \
        || defined(__MINGW64__)                                          \
        || defined(__s390x__)                                            \
-       || (defined(__ppc64__) || defined(__PPC64__) || defined(__ppc64le__) || defined(__PPC64LE__)) \
-       || defined(__EMSCRIPTEN__))
+       || (defined(__ppc64__) || defined(__PPC64__) || defined(__ppc64le__) || defined(__PPC64LE__)) )
 #define FASTFLOAT_64BIT
 #elif (defined(__i386) || defined(__i386__) || defined(_M_IX86)   \
      || defined(__arm__) || defined(_M_ARM)                   \
-     || defined(__MINGW32__))
+     || defined(__MINGW32__) || defined(__EMSCRIPTEN__))
 #define FASTFLOAT_32BIT
 #else
   // Need to check incrementally, since SIZE_MAX is a size_t, avoid overflow.
@@ -128,7 +127,7 @@ from_chars_result from_chars_advanced(co
 #define FASTFLOAT_VISUAL_STUDIO 1
 #endif
 
-#ifdef __BYTE_ORDER__
+#if defined __BYTE_ORDER__ && defined __ORDER_BIG_ENDIAN__
 #define FASTFLOAT_IS_BIG_ENDIAN (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
 #elif defined _WIN32
 #define FASTFLOAT_IS_BIG_ENDIAN 0
@@ -271,8 +270,9 @@ fastfloat_really_inline uint64_t _umul12
 fastfloat_really_inline value128 full_multiplication(uint64_t a,
                                                      uint64_t b) {
   value128 answer;
-#ifdef _M_ARM64
+#if defined(_M_ARM64) && !defined(__MINGW32__)
   // ARM64 has native support for 64-bit multiplications, no need to emulate
+  // But MinGW on ARM64 doesn't have native support for 64-bit multiplications
   answer.high = __umulh(a, b);
   answer.low = a * b;
 #elif defined(FASTFLOAT_32BIT) || (defined(_WIN64) && !defined(__clang__))
@@ -307,21 +307,69 @@ constexpr static double powers_of_ten_do
     1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22};
 constexpr static float powers_of_ten_float[] = {1e0, 1e1, 1e2, 1e3, 1e4, 1e5,
                                                 1e6, 1e7, 1e8, 1e9, 1e10};
+// used for max_mantissa_double and max_mantissa_float
+constexpr uint64_t constant_55555 = 5 * 5 * 5 * 5 * 5;
+// Largest integer value v so that (5**index * v) <= 1<<53.
+// 0x10000000000000 == 1 << 53
+constexpr static uint64_t max_mantissa_double[] = {
+      0x10000000000000,
+      0x10000000000000 / 5,
+      0x10000000000000 / (5 * 5),
+      0x10000000000000 / (5 * 5 * 5),
+      0x10000000000000 / (5 * 5 * 5 * 5),
+      0x10000000000000 / (constant_55555),
+      0x10000000000000 / (constant_55555 * 5),
+      0x10000000000000 / (constant_55555 * 5 * 5),
+      0x10000000000000 / (constant_55555 * 5 * 5 * 5),
+      0x10000000000000 / (constant_55555 * 5 * 5 * 5 * 5),
+      0x10000000000000 / (constant_55555 * constant_55555),
+      0x10000000000000 / (constant_55555 * constant_55555 * 5),
+      0x10000000000000 / (constant_55555 * constant_55555 * 5 * 5),
+      0x10000000000000 / (constant_55555 * constant_55555 * 5 * 5 * 5),
+      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555),
+      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5),
+      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5 * 5),
+      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5),
+      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5 * 5),
+      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555),
+      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5),
+      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5 * 5),
+      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5),
+      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5 * 5)};
+  // Largest integer value v so that (5**index * v) <= 1<<24.
+  // 0x1000000 == 1<<24
+  constexpr static uint64_t max_mantissa_float[] = {
+      0x1000000,
+      0x1000000 / 5,
+      0x1000000 / (5 * 5),
+      0x1000000 / (5 * 5 * 5),
+      0x1000000 / (5 * 5 * 5 * 5),
+      0x1000000 / (constant_55555),
+      0x1000000 / (constant_55555 * 5),
+      0x1000000 / (constant_55555 * 5 * 5),
+      0x1000000 / (constant_55555 * 5 * 5 * 5),
+      0x1000000 / (constant_55555 * 5 * 5 * 5 * 5),
+      0x1000000 / (constant_55555 * constant_55555),
+      0x1000000 / (constant_55555 * constant_55555 * 5)};
 
 template <typename T> struct binary_format {
+  using equiv_uint = typename std::conditional<sizeof(T) == 4, uint32_t, uint64_t>::type;
+
   static inline constexpr int mantissa_explicit_bits();
   static inline constexpr int minimum_exponent();
   static inline constexpr int infinite_power();
   static inline constexpr int sign_index();
-  static inline constexpr int min_exponent_fast_path();
   static inline constexpr int max_exponent_fast_path();
   static inline constexpr int max_exponent_round_to_even();
   static inline constexpr int min_exponent_round_to_even();
-  static inline constexpr uint64_t max_mantissa_fast_path();
+  static inline constexpr uint64_t max_mantissa_fast_path(int64_t power);
   static inline constexpr int largest_power_of_ten();
   static inline constexpr int smallest_power_of_ten();
   static inline constexpr T exact_power_of_ten(int64_t power);
   static inline constexpr size_t max_digits();
+  static inline constexpr equiv_uint exponent_mask();
+  static inline constexpr equiv_uint mantissa_mask();
+  static inline constexpr equiv_uint hidden_bit_mask();
 };
 
 template <> inline constexpr int binary_format<double>::mantissa_explicit_bits() {
@@ -364,21 +412,6 @@ template <> inline constexpr int binary_
 template <> inline constexpr int binary_format<double>::sign_index() { return 63; }
 template <> inline constexpr int binary_format<float>::sign_index() { return 31; }
 
-template <> inline constexpr int binary_format<double>::min_exponent_fast_path() {
-#if (FLT_EVAL_METHOD != 1) && (FLT_EVAL_METHOD != 0)
-  return 0;
-#else
-  return -22;
-#endif
-}
-template <> inline constexpr int binary_format<float>::min_exponent_fast_path() {
-#if (FLT_EVAL_METHOD != 1) && (FLT_EVAL_METHOD != 0)
-  return 0;
-#else
-  return -10;
-#endif
-}
-
 template <> inline constexpr int binary_format<double>::max_exponent_fast_path() {
   return 22;
 }
@@ -386,11 +419,17 @@ template <> inline constexpr int binary_
   return 10;
 }
 
-template <> inline constexpr uint64_t binary_format<double>::max_mantissa_fast_path() {
-  return uint64_t(2) << mantissa_explicit_bits();
+template <> inline constexpr uint64_t binary_format<double>::max_mantissa_fast_path(int64_t power) {
+  // caller is responsible to ensure that
+  // power >= 0 && power <= 22
+  //
+  return max_mantissa_double[power];
 }
-template <> inline constexpr uint64_t binary_format<float>::max_mantissa_fast_path() {
-  return uint64_t(2) << mantissa_explicit_bits();
+template <> inline constexpr uint64_t binary_format<float>::max_mantissa_fast_path(int64_t power) {
+  // caller is responsible to ensure that
+  // power >= 0 && power <= 10
+  //
+  return max_mantissa_float[power];
 }
 
 template <>
@@ -429,6 +468,33 @@ template <> inline constexpr size_t bina
   return 114;
 }
 
+template <> inline constexpr binary_format<float>::equiv_uint
+    binary_format<float>::exponent_mask() {
+  return 0x7F800000;
+}
+template <> inline constexpr binary_format<double>::equiv_uint
+    binary_format<double>::exponent_mask() {
+  return 0x7FF0000000000000;
+}
+
+template <> inline constexpr binary_format<float>::equiv_uint
+    binary_format<float>::mantissa_mask() {
+  return 0x007FFFFF;
+}
+template <> inline constexpr binary_format<double>::equiv_uint
+    binary_format<double>::mantissa_mask() {
+  return 0x000FFFFFFFFFFFFF;
+}
+
+template <> inline constexpr binary_format<float>::equiv_uint
+    binary_format<float>::hidden_bit_mask() {
+  return 0x00800000;
+}
+template <> inline constexpr binary_format<double>::equiv_uint
+    binary_format<double>::hidden_bit_mask() {
+  return 0x0010000000000000;
+}
+
 template<typename T>
 fastfloat_really_inline void to_float(bool negative, adjusted_mantissa am, T &value) {
   uint64_t word = am.mantissa;
@@ -2410,40 +2476,24 @@ fastfloat_really_inline int32_t scientif
 // this converts a native floating-point number to an extended-precision float.
 template <typename T>
 fastfloat_really_inline adjusted_mantissa to_extended(T value) noexcept {
+  using equiv_uint = typename binary_format<T>::equiv_uint;
+  constexpr equiv_uint exponent_mask = binary_format<T>::exponent_mask();
+  constexpr equiv_uint mantissa_mask = binary_format<T>::mantissa_mask();
+  constexpr equiv_uint hidden_bit_mask = binary_format<T>::hidden_bit_mask();
+
   adjusted_mantissa am;
   int32_t bias = binary_format<T>::mantissa_explicit_bits() - binary_format<T>::minimum_exponent();
-  if (std::is_same<T, float>::value) {
-    constexpr uint32_t exponent_mask = 0x7F800000;
-    constexpr uint32_t mantissa_mask = 0x007FFFFF;
-    constexpr uint64_t hidden_bit_mask = 0x00800000;
-    uint32_t bits;
-    ::memcpy(&bits, &value, sizeof(T));
-    if ((bits & exponent_mask) == 0) {
-      // denormal
-      am.power2 = 1 - bias;
-      am.mantissa = bits & mantissa_mask;
-    } else {
-      // normal
-      am.power2 = int32_t((bits & exponent_mask) >> binary_format<T>::mantissa_explicit_bits());
-      am.power2 -= bias;
-      am.mantissa = (bits & mantissa_mask) | hidden_bit_mask;
-    }
+  equiv_uint bits;
+  ::memcpy(&bits, &value, sizeof(T));
+  if ((bits & exponent_mask) == 0) {
+    // denormal
+    am.power2 = 1 - bias;
+    am.mantissa = bits & mantissa_mask;
   } else {
-    constexpr uint64_t exponent_mask = 0x7FF0000000000000;
-    constexpr uint64_t mantissa_mask = 0x000FFFFFFFFFFFFF;
-    constexpr uint64_t hidden_bit_mask = 0x0010000000000000;
-    uint64_t bits;
-    ::memcpy(&bits, &value, sizeof(T));
-    if ((bits & exponent_mask) == 0) {
-      // denormal
-      am.power2 = 1 - bias;
-      am.mantissa = bits & mantissa_mask;
-    } else {
-      // normal
-      am.power2 = int32_t((bits & exponent_mask) >> binary_format<T>::mantissa_explicit_bits());
-      am.power2 -= bias;
-      am.mantissa = (bits & mantissa_mask) | hidden_bit_mask;
-    }
+    // normal
+    am.power2 = int32_t((bits & exponent_mask) >> binary_format<T>::mantissa_explicit_bits());
+    am.power2 -= bias;
+    am.mantissa = (bits & mantissa_mask) | hidden_bit_mask;
   }
 
   return am;
@@ -2869,11 +2919,10 @@ from_chars_result from_chars_advanced(co
   }
   answer.ec = std::errc(); // be optimistic
   answer.ptr = pns.lastmatch;
-  // Next is Clinger's fast path.
-  if (binary_format<T>::min_exponent_fast_path() <= pns.exponent && pns.exponent <= binary_format<T>::max_exponent_fast_path() && pns.mantissa <=binary_format<T>::max_mantissa_fast_path() && !pns.too_many_digits) {
+  // Next is a modified Clinger's fast path, inspired by Jakub Jelínek's proposal
+  if (pns.exponent >= 0 && pns.exponent <= binary_format<T>::max_exponent_fast_path() && pns.mantissa <=binary_format<T>::max_mantissa_fast_path(pns.exponent) && !pns.too_many_digits) {
     value = T(pns.mantissa);
-    if (pns.exponent < 0) { value = value / binary_format<T>::exact_power_of_ten(-pns.exponent); }
-    else { value = value * binary_format<T>::exact_power_of_ten(pns.exponent); }
+    value = value * binary_format<T>::exact_power_of_ten(pns.exponent);
     if (pns.negative) { value = -value; }
     return answer;
   }
--- libstdc++-v3/testsuite/20_util/from_chars/pr107468.cc.jj	2022-11-05 19:20:28.944898668 +0100
+++ libstdc++-v3/testsuite/20_util/from_chars/pr107468.cc	2022-11-05 19:26:44.318740322 +0100
@@ -0,0 +1,42 @@
+// Copyright (C) 2022 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run { target c++17 } }
+// { dg-add-options ieee }
+
+#include <charconv>
+#include <string>
+#include <cfenv>
+#include <testsuite_hooks.h>
+
+int
+main()
+{
+  // FP from_char not available otherwise.
+#if __cpp_lib_to_chars >= 201611L \
+    && _GLIBCXX_USE_C99_FENV_TR1 \
+    && defined(FE_DOWNWARD) \
+    && defined(_GLIBCXX_FLOAT_IS_IEEE_BINARY32)
+  // PR libstdc++/107468
+  float f;
+  char buf[] = "3.355447e+07";
+  std::fesetround(FE_DOWNWARD);
+  auto [ptr, ec] = std::from_chars(buf, buf + sizeof(buf) - 1, f, std::chars_format::scientific);
+  VERIFY( ec == std::errc() && ptr == buf + sizeof(buf) - 1 );
+  VERIFY( f == 33554472.0f );
+#endif
+}

	Jakub



* Re: [PATCH] libstdc++: Update from latest fast_float [PR107468]
  2022-11-07  8:19 [PATCH] libstdc++: Update from latest fast_float [PR107468] Jakub Jelinek
@ 2022-11-07 13:37 ` Jonathan Wakely
  0 siblings, 0 replies; 2+ messages in thread
From: Jonathan Wakely @ 2022-11-07 13:37 UTC
  To: Jakub Jelinek; +Cc: Patrick Palka, gcc-patches, libstdc++

On Mon, 7 Nov 2022 at 08:19, Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> The following patch updates from fast_float trunk.  That way
> it grabs two of the 4 LOCAL_PATCHES, some smaller tweaks, to_extended
> cleanups and most importantly fix for the incorrect rounding case,
> PR107468 aka https://github.com/fastfloat/fast_float/issues/149
> Using std::fegetround showed in benchmarks too slow, so instead of
> doing that the patch limits the fast path where it uses floating
> point multiplication rather than integral to cases where we can
> prove there will be no rounding (the multiplication will be exact, not
> just that the two multiplication or division operation arguments are
> exactly representable).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK, thanks.


> -to pass a `parse_options` instance which contains a custom decimal separator (e.g.,
> +to pass a `parse_options` instance which contains a custom decimal separator (e.g.,
>  the comma). You may use it as follows.
>
>  ```C++
>  #include "fast_float/fast_float.h"
>  #include <iostream>
> -
> +
>  int main() {
>      const std::string input =  "3,1416 xyz ";
>      double result;
> @@ -104,25 +97,55 @@ int main() {
>  }
>  ```
>
> +You can parse delimited numbers:
> +```C++
> +  const std::string input =   "234532.3426362,7869234.9823,324562.645";
> +  double result;
> +  auto answer = fast_float::from_chars(input.data(), input.data()+input.size(), result);
> +  if(answer.ec != std::errc()) {
> +    // check error
> +  }
> +  // we have result == 234532.3426362.
> +  if(answer.ptr[0] != ',') {
> +    // unexpected delimiter
> +  }
> +  answer = fast_float::from_chars(answer.ptr + 1, input.data()+input.size(), result);
> +  if(answer.ec != std::errc()) {
> +    // check error
> +  }
> +  // we have result == 7869234.9823.
> +  if(answer.ptr[0] != ',') {
> +    // unexpected delimiter
> +  }
> +  answer = fast_float::from_chars(answer.ptr + 1, input.data()+input.size(), result);
> +  if(answer.ec != std::errc()) {
> +    // check error
> +  }
> +  // we have result == 324562.645.
> +```
>
>  ## Reference
>
> -- Daniel Lemire, [Number Parsing at a Gigabyte per Second](https://arxiv.org/abs/2101.11408), Software: Pratice and Experience 51 (8), 2021.
> +- Daniel Lemire, [Number Parsing at a Gigabyte per Second](https://arxiv.org/abs/2101.11408), Software: Practice and Experience 51 (8), 2021.
>
>  ## Other programming languages
>
>  - [There is an R binding](https://github.com/eddelbuettel/rcppfastfloat) called `rcppfastfloat`.
>  - [There is a Rust port of the fast_float library](https://github.com/aldanor/fast-float-rust/) called `fast-float-rust`.
> -- [There is a Java port of the fast_float library](https://github.com/wrandelshofer/FastDoubleParser) called `FastDoubleParser`.
> +- [There is a Java port of the fast_float library](https://github.com/wrandelshofer/FastDoubleParser) called `FastDoubleParser`. It used for important systems such as [Jackson](https://github.com/FasterXML/jackson-core).
>  - [There is a C# port of the fast_float library](https://github.com/CarlVerret/csFastFloat) called `csFastFloat`.
>
>
>  ## Relation With Other Work
>
> -The fastfloat algorithm is part of the [LLVM standard libraries](https://github.com/llvm/llvm-project/commit/87c016078ad72c46505461e4ff8bfa04819fe7ba).
> +The fast_float library is part of GCC (as of version 12): the `from_chars` function in GCC relies on fast_float.
> +
> +The fastfloat algorithm is part of the [LLVM standard libraries](https://github.com/llvm/llvm-project/commit/87c016078ad72c46505461e4ff8bfa04819fe7ba).
>
>  The fast_float library provides a performance similar to that of the [fast_double_parser](https://github.com/lemire/fast_double_parser) library but using an updated algorithm reworked from the ground up, and while offering an API more in line with the expectations of C++ programmers. The fast_double_parser library is part of the [Microsoft LightGBM machine-learning framework](https://github.com/microsoft/LightGBM).
>
> +There is a [derived implementation part of AdaCore](https://github.com/AdaCore/VSS).
> +
>  ## Users
>
>  The fast_float library is used by [Apache Arrow](https://github.com/apache/arrow/pull/8494) where it multiplied the number parsing speed by two or three times. It is also used by [Yandex ClickHouse](https://github.com/ClickHouse/ClickHouse) and by [Google Jsonnet](https://github.com/google/jsonnet).
> @@ -135,14 +158,14 @@ It can parse random floating-point numbe
>  <img src="http://lemire.me/blog/wp-content/uploads/2020/11/fastfloat_speed.png" width="400">
>
>  ```
> -$ ./build/benchmarks/benchmark
> +$ ./build/benchmarks/benchmark
>  # parsing random integers in the range [0,1)
> -volume = 2.09808 MB
> -netlib                                  :   271.18 MB/s (+/- 1.2 %)    12.93 Mfloat/s
> -doubleconversion                        :   225.35 MB/s (+/- 1.2 %)    10.74 Mfloat/s
> -strtod                                  :   190.94 MB/s (+/- 1.6 %)     9.10 Mfloat/s
> -abseil                                  :   430.45 MB/s (+/- 2.2 %)    20.52 Mfloat/s
> -fastfloat                               :  1042.38 MB/s (+/- 9.9 %)    49.68 Mfloat/s
> +volume = 2.09808 MB
> +netlib                                  :   271.18 MB/s (+/- 1.2 %)    12.93 Mfloat/s
> +doubleconversion                        :   225.35 MB/s (+/- 1.2 %)    10.74 Mfloat/s
> +strtod                                  :   190.94 MB/s (+/- 1.6 %)     9.10 Mfloat/s
> +abseil                                  :   430.45 MB/s (+/- 2.2 %)    20.52 Mfloat/s
> +fastfloat                               :  1042.38 MB/s (+/- 9.9 %)    49.68 Mfloat/s
>  ```
>
>  See https://github.com/lemire/simple_fastfloat_benchmark for our benchmarking code.
> @@ -183,23 +206,23 @@ You should change the `GIT_TAG` line so
>
>  ## Using as single header
>
> -The script `script/amalgamate.py` may be used to generate a single header
> +The script `script/amalgamate.py` may be used to generate a single header
>  version of the library if so desired.
> -Just run the script from the root directory of this repository.
> +Just run the script from the root directory of this repository.
>  You can customize the license type and output file if desired as described in
>  the command line help.
>
>  You may directly download automatically generated single-header files:
>
> -https://github.com/fastfloat/fast_float/releases/download/v1.1.2/fast_float.h
> +https://github.com/fastfloat/fast_float/releases/download/v3.4.0/fast_float.h
>
>  ## Credit
>
> -Though this work is inspired by many different people, this work benefited especially from exchanges with
> -Michael Eisel, who motivated the original research with his key insights, and with Nigel Tao who provided
> +Though this work is inspired by many different people, this work benefited especially from exchanges with
> +Michael Eisel, who motivated the original research with his key insights, and with Nigel Tao who provided
>  invaluable feedback. Rémy Oudompheng first implemented a fast path we use in the case of long digits.
>
> -The library includes code adapted from Google Wuffs (written by Nigel Tao) which was originally published
> +The library includes code adapted from Google Wuffs (written by Nigel Tao) which was originally published
>  under the Apache 2.0 license.
>
>  ## License
> --- libstdc++-v3/src/c++17/fast_float/fast_float.h.jj   2022-02-04 14:36:56.966577910 +0100
> +++ libstdc++-v3/src/c++17/fast_float/fast_float.h      2022-11-05 18:54:48.096049177 +0100
> @@ -74,7 +74,7 @@ struct parse_options {
>   * Like the C++17 standard, the `fast_float::from_chars` functions take an optional last argument of
>   * the type `fast_float::chars_format`. It is a bitset value: we check whether
>   * `fmt & fast_float::chars_format::fixed` and `fmt & fast_float::chars_format::scientific` are set
> - * to determine whether we allowe the fixed point and scientific notation respectively.
> + * to determine whether we allow the fixed point and scientific notation respectively.
>   * The default is  `fast_float::chars_format::general` which allows both `fixed` and `scientific`.
>   */
>  template<typename T>
> @@ -98,12 +98,11 @@ from_chars_result from_chars_advanced(co
>         || defined(__amd64) || defined(__aarch64__) || defined(_M_ARM64) \
>         || defined(__MINGW64__)                                          \
>         || defined(__s390x__)                                            \
> -       || (defined(__ppc64__) || defined(__PPC64__) || defined(__ppc64le__) || defined(__PPC64LE__)) \
> -       || defined(__EMSCRIPTEN__))
> +       || (defined(__ppc64__) || defined(__PPC64__) || defined(__ppc64le__) || defined(__PPC64LE__)) )
>  #define FASTFLOAT_64BIT
>  #elif (defined(__i386) || defined(__i386__) || defined(_M_IX86)   \
>       || defined(__arm__) || defined(_M_ARM)                   \
> -     || defined(__MINGW32__))
> +     || defined(__MINGW32__) || defined(__EMSCRIPTEN__))
>  #define FASTFLOAT_32BIT
>  #else
>    // Need to check incrementally, since SIZE_MAX is a size_t, avoid overflow.
> @@ -128,7 +127,7 @@ from_chars_result from_chars_advanced(co
>  #define FASTFLOAT_VISUAL_STUDIO 1
>  #endif
>
> -#ifdef __BYTE_ORDER__
> +#if defined __BYTE_ORDER__ && defined __ORDER_BIG_ENDIAN__
>  #define FASTFLOAT_IS_BIG_ENDIAN (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
>  #elif defined _WIN32
>  #define FASTFLOAT_IS_BIG_ENDIAN 0
> @@ -271,8 +270,9 @@ fastfloat_really_inline uint64_t _umul12
>  fastfloat_really_inline value128 full_multiplication(uint64_t a,
>                                                       uint64_t b) {
>    value128 answer;
> -#ifdef _M_ARM64
> +#if defined(_M_ARM64) && !defined(__MINGW32__)
>    // ARM64 has native support for 64-bit multiplications, no need to emulate
> +  // But MinGW on ARM64 doesn't have native support for 64-bit multiplications
>    answer.high = __umulh(a, b);
>    answer.low = a * b;
>  #elif defined(FASTFLOAT_32BIT) || (defined(_WIN64) && !defined(__clang__))
> @@ -307,21 +307,69 @@ constexpr static double powers_of_ten_do
>      1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22};
>  constexpr static float powers_of_ten_float[] = {1e0, 1e1, 1e2, 1e3, 1e4, 1e5,
>                                                  1e6, 1e7, 1e8, 1e9, 1e10};
> +// used for max_mantissa_double and max_mantissa_float
> +constexpr uint64_t constant_55555 = 5 * 5 * 5 * 5 * 5;
> +// Largest integer value v so that (5**index * v) <= 1<<53.
> +// 0x10000000000000 == 1 << 53
> +constexpr static uint64_t max_mantissa_double[] = {
> +      0x10000000000000,
> +      0x10000000000000 / 5,
> +      0x10000000000000 / (5 * 5),
> +      0x10000000000000 / (5 * 5 * 5),
> +      0x10000000000000 / (5 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555),
> +      0x10000000000000 / (constant_55555 * 5),
> +      0x10000000000000 / (constant_55555 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * 5 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555),
> +      0x10000000000000 / (constant_55555 * constant_55555 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5),
> +      0x10000000000000 / (constant_55555 * constant_55555 * constant_55555 * constant_55555 * 5 * 5 * 5 * 5)};
> +  // Largest integer value v so that (5**index * v) <= 1<<24.
> +  // 0x1000000 == 1<<24
> +  constexpr static uint64_t max_mantissa_float[] = {
> +      0x1000000,
> +      0x1000000 / 5,
> +      0x1000000 / (5 * 5),
> +      0x1000000 / (5 * 5 * 5),
> +      0x1000000 / (5 * 5 * 5 * 5),
> +      0x1000000 / (constant_55555),
> +      0x1000000 / (constant_55555 * 5),
> +      0x1000000 / (constant_55555 * 5 * 5),
> +      0x1000000 / (constant_55555 * 5 * 5 * 5),
> +      0x1000000 / (constant_55555 * 5 * 5 * 5 * 5),
> +      0x1000000 / (constant_55555 * constant_55555),
> +      0x1000000 / (constant_55555 * constant_55555 * 5)};
>
>  template <typename T> struct binary_format {
> +  using equiv_uint = typename std::conditional<sizeof(T) == 4, uint32_t, uint64_t>::type;
> +
>    static inline constexpr int mantissa_explicit_bits();
>    static inline constexpr int minimum_exponent();
>    static inline constexpr int infinite_power();
>    static inline constexpr int sign_index();
> -  static inline constexpr int min_exponent_fast_path();
>    static inline constexpr int max_exponent_fast_path();
>    static inline constexpr int max_exponent_round_to_even();
>    static inline constexpr int min_exponent_round_to_even();
> -  static inline constexpr uint64_t max_mantissa_fast_path();
> +  static inline constexpr uint64_t max_mantissa_fast_path(int64_t power);
>    static inline constexpr int largest_power_of_ten();
>    static inline constexpr int smallest_power_of_ten();
>    static inline constexpr T exact_power_of_ten(int64_t power);
>    static inline constexpr size_t max_digits();
> +  static inline constexpr equiv_uint exponent_mask();
> +  static inline constexpr equiv_uint mantissa_mask();
> +  static inline constexpr equiv_uint hidden_bit_mask();
>  };
>
>  template <> inline constexpr int binary_format<double>::mantissa_explicit_bits() {
> @@ -364,21 +412,6 @@ template <> inline constexpr int binary_
>  template <> inline constexpr int binary_format<double>::sign_index() { return 63; }
>  template <> inline constexpr int binary_format<float>::sign_index() { return 31; }
>
> -template <> inline constexpr int binary_format<double>::min_exponent_fast_path() {
> -#if (FLT_EVAL_METHOD != 1) && (FLT_EVAL_METHOD != 0)
> -  return 0;
> -#else
> -  return -22;
> -#endif
> -}
> -template <> inline constexpr int binary_format<float>::min_exponent_fast_path() {
> -#if (FLT_EVAL_METHOD != 1) && (FLT_EVAL_METHOD != 0)
> -  return 0;
> -#else
> -  return -10;
> -#endif
> -}
> -
>  template <> inline constexpr int binary_format<double>::max_exponent_fast_path() {
>    return 22;
>  }
> @@ -386,11 +419,17 @@ template <> inline constexpr int binary_
>    return 10;
>  }
>
> -template <> inline constexpr uint64_t binary_format<double>::max_mantissa_fast_path() {
> -  return uint64_t(2) << mantissa_explicit_bits();
> +template <> inline constexpr uint64_t binary_format<double>::max_mantissa_fast_path(int64_t power) {
> +  // caller is responsible to ensure that
> +  // power >= 0 && power <= 22
> +  //
> +  return max_mantissa_double[power];
>  }
> -template <> inline constexpr uint64_t binary_format<float>::max_mantissa_fast_path() {
> -  return uint64_t(2) << mantissa_explicit_bits();
> +template <> inline constexpr uint64_t binary_format<float>::max_mantissa_fast_path(int64_t power) {
> +  // caller is responsible to ensure that
> +  // power >= 0 && power <= 10
> +  //
> +  return max_mantissa_float[power];
>  }
>
>  template <>
> @@ -429,6 +468,33 @@ template <> inline constexpr size_t bina
>    return 114;
>  }
>
> +template <> inline constexpr binary_format<float>::equiv_uint
> +    binary_format<float>::exponent_mask() {
> +  return 0x7F800000;
> +}
> +template <> inline constexpr binary_format<double>::equiv_uint
> +    binary_format<double>::exponent_mask() {
> +  return 0x7FF0000000000000;
> +}
> +
> +template <> inline constexpr binary_format<float>::equiv_uint
> +    binary_format<float>::mantissa_mask() {
> +  return 0x007FFFFF;
> +}
> +template <> inline constexpr binary_format<double>::equiv_uint
> +    binary_format<double>::mantissa_mask() {
> +  return 0x000FFFFFFFFFFFFF;
> +}
> +
> +template <> inline constexpr binary_format<float>::equiv_uint
> +    binary_format<float>::hidden_bit_mask() {
> +  return 0x00800000;
> +}
> +template <> inline constexpr binary_format<double>::equiv_uint
> +    binary_format<double>::hidden_bit_mask() {
> +  return 0x0010000000000000;
> +}
> +
>  template<typename T>
>  fastfloat_really_inline void to_float(bool negative, adjusted_mantissa am, T &value) {
>    uint64_t word = am.mantissa;
> @@ -2410,40 +2476,24 @@ fastfloat_really_inline int32_t scientif
>  // this converts a native floating-point number to an extended-precision float.
>  template <typename T>
>  fastfloat_really_inline adjusted_mantissa to_extended(T value) noexcept {
> +  using equiv_uint = typename binary_format<T>::equiv_uint;
> +  constexpr equiv_uint exponent_mask = binary_format<T>::exponent_mask();
> +  constexpr equiv_uint mantissa_mask = binary_format<T>::mantissa_mask();
> +  constexpr equiv_uint hidden_bit_mask = binary_format<T>::hidden_bit_mask();
> +
>    adjusted_mantissa am;
>    int32_t bias = binary_format<T>::mantissa_explicit_bits() - binary_format<T>::minimum_exponent();
> -  if (std::is_same<T, float>::value) {
> -    constexpr uint32_t exponent_mask = 0x7F800000;
> -    constexpr uint32_t mantissa_mask = 0x007FFFFF;
> -    constexpr uint64_t hidden_bit_mask = 0x00800000;
> -    uint32_t bits;
> -    ::memcpy(&bits, &value, sizeof(T));
> -    if ((bits & exponent_mask) == 0) {
> -      // denormal
> -      am.power2 = 1 - bias;
> -      am.mantissa = bits & mantissa_mask;
> -    } else {
> -      // normal
> -      am.power2 = int32_t((bits & exponent_mask) >> binary_format<T>::mantissa_explicit_bits());
> -      am.power2 -= bias;
> -      am.mantissa = (bits & mantissa_mask) | hidden_bit_mask;
> -    }
> +  equiv_uint bits;
> +  ::memcpy(&bits, &value, sizeof(T));
> +  if ((bits & exponent_mask) == 0) {
> +    // denormal
> +    am.power2 = 1 - bias;
> +    am.mantissa = bits & mantissa_mask;
>    } else {
> -    constexpr uint64_t exponent_mask = 0x7FF0000000000000;
> -    constexpr uint64_t mantissa_mask = 0x000FFFFFFFFFFFFF;
> -    constexpr uint64_t hidden_bit_mask = 0x0010000000000000;
> -    uint64_t bits;
> -    ::memcpy(&bits, &value, sizeof(T));
> -    if ((bits & exponent_mask) == 0) {
> -      // denormal
> -      am.power2 = 1 - bias;
> -      am.mantissa = bits & mantissa_mask;
> -    } else {
> -      // normal
> -      am.power2 = int32_t((bits & exponent_mask) >> binary_format<T>::mantissa_explicit_bits());
> -      am.power2 -= bias;
> -      am.mantissa = (bits & mantissa_mask) | hidden_bit_mask;
> -    }
> +    // normal
> +    am.power2 = int32_t((bits & exponent_mask) >> binary_format<T>::mantissa_explicit_bits());
> +    am.power2 -= bias;
> +    am.mantissa = (bits & mantissa_mask) | hidden_bit_mask;
>    }
>
>    return am;
> @@ -2869,11 +2919,10 @@ from_chars_result from_chars_advanced(co
>    }
>    answer.ec = std::errc(); // be optimistic
>    answer.ptr = pns.lastmatch;
> -  // Next is Clinger's fast path.
> -  if (binary_format<T>::min_exponent_fast_path() <= pns.exponent && pns.exponent <= binary_format<T>::max_exponent_fast_path() && pns.mantissa <=binary_format<T>::max_mantissa_fast_path() && !pns.too_many_digits) {
> +  // Next is a modified Clinger's fast path, inspired by Jakub Jelínek's proposal
> +  if (pns.exponent >= 0 && pns.exponent <= binary_format<T>::max_exponent_fast_path() && pns.mantissa <=binary_format<T>::max_mantissa_fast_path(pns.exponent) && !pns.too_many_digits) {
>      value = T(pns.mantissa);
> -    if (pns.exponent < 0) { value = value / binary_format<T>::exact_power_of_ten(-pns.exponent); }
> -    else { value = value * binary_format<T>::exact_power_of_ten(pns.exponent); }
> +    value = value * binary_format<T>::exact_power_of_ten(pns.exponent);
>      if (pns.negative) { value = -value; }
>      return answer;
>    }
> --- libstdc++-v3/testsuite/20_util/from_chars/pr107468.cc.jj    2022-11-05 19:20:28.944898668 +0100
> +++ libstdc++-v3/testsuite/20_util/from_chars/pr107468.cc       2022-11-05 19:26:44.318740322 +0100
> @@ -0,0 +1,42 @@
> +// Copyright (C) 2022 Free Software Foundation, Inc.
> +//
> +// This file is part of the GNU ISO C++ Library.  This library is free
> +// software; you can redistribute it and/or modify it under the
> +// terms of the GNU General Public License as published by the
> +// Free Software Foundation; either version 3, or (at your option)
> +// any later version.
> +
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +// GNU General Public License for more details.
> +
> +// You should have received a copy of the GNU General Public License along
> +// with this library; see the file COPYING3.  If not see
> +// <http://www.gnu.org/licenses/>.
> +
> +// { dg-do run { target c++17 } }
> +// { dg-add-options ieee }
> +
> +#include <charconv>
> +#include <string>
> +#include <cfenv>
> +#include <testsuite_hooks.h>
> +
> +int
> +main()
> +{
> +  // FP from_char not available otherwise.
> +#if __cpp_lib_to_chars >= 201611L \
> +    && _GLIBCXX_USE_C99_FENV_TR1 \
> +    && defined(FE_DOWNWARD) \
> +    && defined(_GLIBCXX_FLOAT_IS_IEEE_BINARY32)
> +  // PR libstdc++/107468
> +  float f;
> +  char buf[] = "3.355447e+07";
> +  std::fesetround(FE_DOWNWARD);
> +  auto [ptr, ec] = std::from_chars(buf, buf + sizeof(buf) - 1, f, std::chars_format::scientific);
> +  VERIFY( ec == std::errc() && ptr == buf + sizeof(buf) - 1 );
> +  VERIFY( f == 33554472.0f );
> +#endif
> +}
>
>         Jakub
>
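
For anyone following the fast-path change quoted above, the new per-exponent bound can be checked with a few lines of integer arithmetic. Since 10**q == 5**q * 2**q, the product w * 10**q is exactly representable whenever w * 5**q fits in the significand (at most 2**24 for float, 2**53 for double), so the multiplication cannot round in any rounding mode. The following is only a sketch under that reading of the patch; `max_mantissa` and `fast_path_is_exact` are hypothetical names used for illustration, not fast_float functions.

```cpp
#include <cstdint>

// Hypothetical helper (not part of fast_float): largest v such that
// 5**power * v <= 2**bits, obtained by repeated floor division by 5.
constexpr std::uint64_t max_mantissa(int bits, int power) {
  std::uint64_t limit = std::uint64_t(1) << bits;
  for (int i = 0; i < power; ++i)
    limit /= 5;
  return limit;
}

// Hypothetical helper: true when w * 10**q can be computed with a single
// floating-point multiplication that is exact in every rounding mode.
// bits is 24 for float and 53 for double; max_q is 10 and 22 respectively,
// matching max_exponent_fast_path() in the patch.
constexpr bool fast_path_is_exact(std::uint64_t w, int q, int bits, int max_q) {
  return q >= 0 && q <= max_q && w <= max_mantissa(bits, q);
}

// Second entry of max_mantissa_float: 0x1000000 / 5.
static_assert(max_mantissa(24, 1) == 3355443, "2**24 / 5");

// The PR107468 input "3.355447e+07" parses as w = 3355447, q = 1.
// It passed the old bound (w <= 2**24) but is rejected by the new one.
static_assert(3355447 <= (std::uint64_t(1) << 24), "old bound accepted it");
static_assert(!fast_path_is_exact(3355447, 1, 24, 10), "new bound rejects it");
```

The rejected product, 33554470, lies above 2**25, where consecutive float values are 4 apart, so it falls exactly halfway between 33554468 and 33554472. That is why the result of the old fast path depended on the current rounding mode, and why the new test expects 33554472 even under FE_DOWNWARD.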

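The to_extended cleanup quoted above replaces the duplicated float/double branches with masks taken from binary_format<T>. As a rough illustration of what the shared body computes, here is a sketch for binary32 only. It works on the raw bit pattern rather than calling memcpy so that it can be checked at compile time. `decoded` and `decode_binary32` are hypothetical names, and the bias of 150 assumes mantissa_explicit_bits() == 23 and minimum_exponent() == -127 for float, values that are not shown in the quoted hunks.

```cpp
#include <cstdint>

// Hypothetical stand-in for fast_float's adjusted_mantissa.
struct decoded {
  std::uint64_t mantissa;
  std::int32_t power2;
};

// Decode a binary32 bit pattern so that the magnitude of the value equals
// mantissa * 2**power2. The sign bit is ignored, as in the quoted code.
constexpr decoded decode_binary32(std::uint32_t bits) {
  constexpr std::uint32_t exponent_mask = 0x7F800000;
  constexpr std::uint32_t mantissa_mask = 0x007FFFFF;
  constexpr std::uint32_t hidden_bit_mask = 0x00800000;
  // Assumed: mantissa_explicit_bits() - minimum_exponent() == 23 - (-127).
  constexpr std::int32_t bias = 23 - (-127);
  return (bits & exponent_mask) == 0
             // denormal: no hidden bit, fixed exponent
             ? decoded{bits & mantissa_mask, 1 - bias}
             // normal: restore the hidden bit and unbias the exponent
             : decoded{(bits & mantissa_mask) | hidden_bit_mask,
                       std::int32_t((bits & exponent_mask) >> 23) - bias};
}

// 1.0f has the bit pattern 0x3F800000: 2**23 * 2**-23 == 1.
static_assert(decode_binary32(0x3F800000).mantissa == 0x800000, "hidden bit set");
static_assert(decode_binary32(0x3F800000).power2 == -23, "unbiased exponent");
```

The double case has the same shape with the 64-bit masks from the patch and 52 explicit mantissa bits.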


2022-11-07  8:19 [PATCH] libstdc++: Update from latest fast_float [PR107468] Jakub Jelinek
2022-11-07 13:37 ` Jonathan Wakely
