public inbox for libstdc++@gcc.gnu.org
* [PATCH] libstdc++: std::to_chars std::{,b}float16_t support
@ 2022-10-27  7:59 Jakub Jelinek
  2022-10-28 16:52 ` Patrick Palka
  2022-11-01 12:22 ` [PATCH] libstdc++: std::to_chars std::{,b}float16_t support Jonathan Wakely
  0 siblings, 2 replies; 7+ messages in thread
From: Jakub Jelinek @ 2022-10-27  7:59 UTC
  To: Jonathan Wakely, Patrick Palka; +Cc: gcc-patches, libstdc++

[-- Attachment #1: Type: text/plain, Size: 15891 bytes --]

Hi!

The following patch on top of
https://gcc.gnu.org/pipermail/libstdc++/2022-October/054849.html
adds std::{,b}float16_t support for std::to_chars.
When precision is specified (or, for std::bfloat16_t, in hex mode even if
it is not), I believe we can just use the std::to_chars float overloads
(when float is mode compatible with std::float32_t), because both formats
are proper subsets of std::float32_t.
Unfortunately, when precision is not specified and we are supposed to emit
the shortest string, the std::{,b}float16_t strings are usually much shorter.
E.g. for 1.e7p-14f16 the shortest fixed representation is
0.0001161 and the shortest scientific representation is
1.161e-04, while for 1.e7p-14f32 (the same number promoted to
std::float32_t) they are
0.00011610985 and
1.1610985e-04.
Similarly, for 1.38p-112bf16 they are
0.000000000000000000000000000000000235 and
2.35e-34, vs. for 1.38p-112f32
0.00000000000000000000000000000000023472271 and
2.3472271e-34.
For std::float16_t there are differences even in the shortest hex, say:
0.01p-14 vs. 1p-22
but only for denormal std::float16_t values (where all std::float16_t
denormals converted to std::float32_t are normal), __FLT16_MIN__ and
everything larger in absolute value than that is the same.  Unless
that is a bug and we should try to discover shorter representations
even for denormals...
std::bfloat16_t has the same exponent range as std::float32_t, so all
std::bfloat16_t denormals are also std::float32_t denormals and thus
the shortest hex representations are the same.
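
To make the float32 side of the comparison above concrete, here is a
minimal sketch that only uses the existing std::to_chars float overloads
(so it needs a C++17 library with floating-point to_chars, nothing from
this patch).  The helper name shortest is made up for illustration; the
expected strings are the float32 ones quoted above.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical helper (not part of the patch): returns the shortest
// round-tripping string std::to_chars produces for a float in FMT.
std::string shortest(float value, std::chars_format fmt)
{
  char buf[128];  // ample for any float in fixed, scientific or hex form
  auto [ptr, ec] = std::to_chars(buf, buf + sizeof buf, value, fmt);
  assert(ec == std::errc{});
  return std::string(buf, ptr);
}
```

For instance shortest(0x1.e7p-14f, std::chars_format::fixed) yields
0.00011610985, where the std::float16_t overload added by the patch is
expected to print 0.0001161 for the same value.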

As documented, ryu can handle arbitrary IEEE-like floating point formats
(probably not wider than IEEE quad) using the generic_128 handling, but
ryu is hidden in libstdc++.so.  Only a few architectures support
std::float16_t right now, and some of them have special ISA requirements
for it (e.g. on i?86 one needs -msse2).  std::bfloat16_t is right now
supported only on x86 (again with -msse2), perhaps with aarch64/arm
coming next if ARM is interested, and I think it is possible that more
will be added later.  So, instead of exporting APIs from the library that
handle the std::{,b}float16_t overloads directly, this patch exports
functions which take a float (a superset of both formats) and expects
the inline overloads to promote the 16-bit formats to 32-bit; the library
then ensures they are printed right.
I have added the [[gnu::cold]] attribute because I think most users
will primarily use these formats as storage formats, perform arithmetic
in excess precision and also print as std::float32_t.  The
added support doesn't seem to be too large, on x86_64:
readelf -Ws libstdc++.so.6.0.31 | grep float16_t
   912: 00000000000ae824   950 FUNC    GLOBAL DEFAULT   13 _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format@@GLIBCXX_3.4.31
  5767: 00000000000ae4a1   899 FUNC    GLOBAL DEFAULT   13 _ZSt20__to_chars_float16_tPcS_fSt12chars_format@@GLIBCXX_3.4.31
   842: 000000000016d430   106 FUNC    LOCAL  DEFAULT   13 _ZN12_GLOBAL__N_113get_ieee_reprINS_23floating_type_float16_tEEENS_6ieee_tIT_EES3_
   865: 0000000000170980  1613 FUNC    LOCAL  DEFAULT   13 _ZSt23__floating_to_chars_hexIN12_GLOBAL__N_123floating_type_float16_tEESt15to_chars_resultPcS3_T_St8optionalIiE.constprop.0.isra.0
  7205: 00000000000ae824   950 FUNC    GLOBAL DEFAULT   13 _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format
  7985: 00000000000ae4a1   899 FUNC    GLOBAL DEFAULT   13 _ZSt20__to_chars_float16_tPcS_fSt12chars_format
so 3568 code bytes together or so.

Tested with the attached test.  It doesn't prove the strings are the
shortest representation; it just prints the std::{,b}float16_t and
std::float32_t shortest strings side by side, then tries to verify that
the string can be emitted into an exactly sized range but not into a range
one smaller than that, and tries to read what was printed
back using the from_chars float32_t overload (so there could be
double rounding, but apparently there is none for the shortest strings).
The only differences printed are for NaNs, where sNaNs are canonicalized
to canonical qNaNs and as to_chars doesn't print NaN mantissa, even qNaNs
other than the canonical one are read back just as the canonical NaN.
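
The exact-size check described above can be sketched on its own as
follows.  This is an illustration using the existing float overload, not
the attached test itself, and the helper name fits_exactly is made up.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <system_error>

// Hypothetical helper: true if the to_chars output for VALUE fits into a
// range of exactly its own length, and is rejected with value_too_large
// when the range is one character shorter.
bool fits_exactly(float value, std::chars_format fmt)
{
  char buf[128];
  auto [end, ec] = std::to_chars(buf, buf + sizeof buf, value, fmt);
  assert(ec == std::errc{});
  const std::ptrdiff_t len = end - buf;

  // Retry into a range of exactly LEN characters.
  char exact[128];
  auto [p1, ec1] = std::to_chars(exact, exact + len, value, fmt);
  if (ec1 != std::errc{} || p1 != exact + len)
    return false;

  // One character less must fail.
  auto [p2, ec2] = std::to_chars(exact, exact + len - 1, value, fmt);
  return ec2 == std::errc::value_too_large;
}
```

The attached test performs the equivalent check only for
std::chars_format::fixed, over every 16-bit pattern.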

Also attaching what Patrick wrote to generate the pow10_adjustment_tab.
For std::float16_t, only 1.0, 10.0, 100.0, 1000.0 and 10000.0 are powers
of 10 in range, because __FLT16_MAX__ is 65504.0, and all of them
are exactly representable in std::float16_t, so we want to use 0 in
pow10_adjustment_tab.
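
The representability claim can be checked by hand: an integer fits
binary16 exactly when it is at most 65504 and its odd part fits into the
11-bit significand (10 stored bits plus the implicit one).  A small
sketch of that check, with a made-up function name:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if the positive integer N is exactly
// representable in IEEE binary16.
bool exact_in_binary16(std::uint32_t n)
{
  assert(n != 0);
  if (n > 65504)  // above __FLT16_MAX__
    return false;
  while ((n & 1) == 0)  // strip the power-of-two factor
    n >>= 1;
  return n < (std::uint32_t{1} << 11);  // odd part needs at most 11 bits
}
```

E.g. 10000 is 16 * 625 and 625 is below 2048, so it is exact, while
100000 is already out of range.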

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-10-27  Jakub Jelinek  <jakub@redhat.com>

	* include/std/charconv (__to_chars_float16_t, __to_chars_bfloat16_t):
	Declare.
	(to_chars): Add _Float16 and __gnu_cxx::__bfloat16_t overloads.
	* config/abi/pre/gnu.ver (GLIBCXX_3.4.31): Export
	_ZSt20__to_chars_float16_tPcS_fSt12chars_format and
	_ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format.
	* src/c++17/floating_to_chars.cc (floating_type_float16_t,
	floating_type_bfloat16_t): New types.
	(floating_type_traits<floating_type_float16_t>,
	floating_type_traits<floating_type_bfloat16_t>,
	get_ieee_repr<floating_type_float16_t>,
	get_ieee_repr<floating_type_bfloat16_t>,
	__handle_special_value<floating_type_float16_t>,
	__handle_special_value<floating_type_bfloat16_t>): New specializations.
	(floating_to_shortest_scientific): Handle floating_type_float16_t
	and floating_type_bfloat16_t like IEEE quad.
	(__floating_to_chars_shortest): For floating_type_bfloat16_t call
	__floating_to_chars_hex<float> rather than
	__floating_to_chars_hex<floating_type_bfloat16_t> to avoid
	instantiating the latter.
	(__to_chars_float16_t, __to_chars_bfloat16_t): New functions.
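
As a reading aid for the get_ieee_repr specializations listed above, here
is a standalone sketch of the same field remapping, assuming float is
IEEE binary32.  The names fields16, split_float, as_binary16 and
as_bfloat16 are made up for illustration and do not appear in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for ieee_t: fields in the 16-bit format's terms.
struct fields16
{
  std::uint32_t mantissa;
  std::uint32_t biased_exponent;
  bool sign;
};

// Split a binary32 value into its raw mantissa, exponent and sign fields.
static fields16 split_float(float value)
{
  static_assert(sizeof(float) == sizeof(std::uint32_t));
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof bits);
  return { bits & 0x7fffffu, (bits >> 23) & 0xffu, (bits >> 31) != 0 };
}

// binary16 view of a float that was promoted from std::float16_t.
fields16 as_binary16(float value)
{
  fields16 r = split_float(value);
  if (r.biased_exponent >= 0x67 && r.biased_exponent <= 0x70)
    {
      // binary16 denormal: make the implicit leading 1 explicit.
      const int n = r.biased_exponent - 0x67;
      r.mantissa = (std::uint32_t{1} << n) | (r.mantissa >> (23 - n));
      r.biased_exponent = 0;
    }
  else if (r.biased_exponent == 0xff)
    {
      // Infinity or NaN.
      r.mantissa >>= 13;
      r.biased_exponent = 0x1f;
    }
  else if (r.biased_exponent > 0x70)
    {
      // Normal number: rebias from 127 to 15, keep the top 10 bits.
      r.mantissa >>= 13;
      r.biased_exponent -= 0x70;
    }
  // A biased exponent of 0 is +-0.0 and is left untouched.
  return r;
}

// bfloat16 view: same exponent field, only the top 7 mantissa bits.
fields16 as_bfloat16(float value)
{
  fields16 r = split_float(value);
  r.mantissa >>= 16;
  return r;
}
```

For example the smallest std::float16_t denormal, 1p-24, is a normal
float with biased exponent 0x67 and comes out as biased exponent 0 with
mantissa 1.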

--- libstdc++-v3/include/std/charconv.jj	2022-10-26 13:50:40.334716005 +0200
+++ libstdc++-v3/include/std/charconv	2022-10-26 14:19:46.523769686 +0200
@@ -738,6 +738,32 @@ namespace __detail
   to_chars_result to_chars(char* __first, char* __last, long double __value,
 			   chars_format __fmt, int __precision) noexcept;
 
+  // Library routines for 16-bit extended floating point formats
+  // using float as interchange format.
+  to_chars_result __to_chars_float16_t(char* __first, char* __last,
+				       float __value,
+				       chars_format __fmt) noexcept;
+  to_chars_result __to_chars_bfloat16_t(char* __first, char* __last,
+					float __value,
+					chars_format __fmt) noexcept;
+
+#if defined(__STDCPP_FLOAT16_T__) && defined(_GLIBCXX_FLOAT_IS_IEEE_BINARY32)
+  inline to_chars_result
+  to_chars(char* __first, char* __last, _Float16 __value) noexcept
+  {
+    return __to_chars_float16_t(__first, __last, float(__value),
+				chars_format{});
+  }
+  inline to_chars_result
+  to_chars(char* __first, char* __last, _Float16 __value,
+	   chars_format __fmt) noexcept
+  { return __to_chars_float16_t(__first, __last, float(__value), __fmt); }
+  inline to_chars_result
+  to_chars(char* __first, char* __last, _Float16 __value,
+	   chars_format __fmt, int __precision) noexcept
+  { return to_chars(__first, __last, float(__value), __fmt, __precision); }
+#endif
+
 #if defined(__STDCPP_FLOAT32_T__) && defined(_GLIBCXX_FLOAT_IS_IEEE_BINARY32)
   inline to_chars_result
   to_chars(char* __first, char* __last, _Float32 __value) noexcept
@@ -784,6 +810,24 @@ namespace __detail
 		    __precision);
   }
 #endif
+
+#if defined(__STDCPP_BFLOAT16_T__) && defined(_GLIBCXX_FLOAT_IS_IEEE_BINARY32)
+  inline to_chars_result
+  to_chars(char* __first, char* __last,
+	   __gnu_cxx::__bfloat16_t __value) noexcept
+  {
+    return __to_chars_bfloat16_t(__first, __last, float(__value),
+				 chars_format{});
+  }
+  inline to_chars_result
+  to_chars(char* __first, char* __last, __gnu_cxx::__bfloat16_t __value,
+	   chars_format __fmt) noexcept
+  { return __to_chars_bfloat16_t(__first, __last, float(__value), __fmt); }
+  inline to_chars_result
+  to_chars(char* __first, char* __last, __gnu_cxx::__bfloat16_t __value,
+	   chars_format __fmt, int __precision) noexcept
+  { return to_chars(__first, __last, float(__value), __fmt, __precision); }
+#endif
 #endif
 
 _GLIBCXX_END_NAMESPACE_VERSION
--- libstdc++-v3/config/abi/pre/gnu.ver.jj	2022-09-12 11:30:14.211870202 +0200
+++ libstdc++-v3/config/abi/pre/gnu.ver	2022-10-26 16:11:53.146300799 +0200
@@ -2446,6 +2446,8 @@ GLIBCXX_3.4.30 {
 
 GLIBCXX_3.4.31 {
     _ZNSt7__cxx1112basic_stringI[cw]St11char_traitsI[cw]ESaI[cw]EE15_M_replace_cold*;
+    _ZSt20__to_chars_float16_tPcS_fSt12chars_format;
+    _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format;
 } GLIBCXX_3.4.30;
 
 # Symbols in the support library (libsupc++) have their own tag.
--- libstdc++-v3/src/c++17/floating_to_chars.cc.jj	2022-05-20 11:45:18.042741567 +0200
+++ libstdc++-v3/src/c++17/floating_to_chars.cc	2022-10-26 22:54:04.890144587 +0200
@@ -374,6 +374,44 @@ namespace
     };
 #endif
 
+  // Wrappers around float for std::{,b}float16_t promoted to float.
+  struct floating_type_float16_t
+  {
+    float x;
+    operator float() const { return x; }
+  };
+  struct floating_type_bfloat16_t
+  {
+    float x;
+    operator float() const { return x; }
+  };
+
+  template<>
+    struct floating_type_traits<floating_type_float16_t>
+    {
+      static constexpr int mantissa_bits = 10;
+      static constexpr int exponent_bits = 5;
+      static constexpr bool has_implicit_leading_bit = true;
+      using mantissa_t = uint32_t;
+      using shortest_scientific_t = ryu::floating_decimal_128;
+
+      static constexpr uint64_t pow10_adjustment_tab[]
+	= { 0 };
+    };
+
+  template<>
+    struct floating_type_traits<floating_type_bfloat16_t>
+    {
+      static constexpr int mantissa_bits = 7;
+      static constexpr int exponent_bits = 8;
+      static constexpr bool has_implicit_leading_bit = true;
+      using mantissa_t = uint32_t;
+      using shortest_scientific_t = ryu::floating_decimal_128;
+
+      static constexpr uint64_t pow10_adjustment_tab[]
+	= { 0b0000111001110001101010010110100101010010000000000000000000000000 };
+    };
+
   // An IEEE-style decomposition of a floating-point value of type T.
   template<typename T>
     struct ieee_t
@@ -482,6 +520,79 @@ namespace
     }
 #endif
 
+  template<>
+    ieee_t<floating_type_float16_t>
+    get_ieee_repr(const floating_type_float16_t value)
+    {
+      using mantissa_t = typename floating_type_traits<float>::mantissa_t;
+      constexpr int mantissa_bits = floating_type_traits<float>::mantissa_bits;
+      constexpr int exponent_bits = floating_type_traits<float>::exponent_bits;
+
+      uint32_t value_bits = 0;
+      memcpy(&value_bits, &value.x, sizeof(value));
+
+      ieee_t<floating_type_float16_t> ieee_repr;
+      ieee_repr.mantissa
+	= static_cast<mantissa_t>(value_bits & ((uint32_t{1} << mantissa_bits) - 1u));
+      value_bits >>= mantissa_bits;
+      ieee_repr.biased_exponent
+	= static_cast<uint32_t>(value_bits & ((uint32_t{1} << exponent_bits) - 1u));
+      value_bits >>= exponent_bits;
+      ieee_repr.sign = (value_bits & 1) != 0;
+      // We have mantissa and biased_exponent from the float (originally
+      // float16_t converted to float).
+      // Transform that to float16_t mantissa and biased_exponent.
+      // If biased_exponent is 0, then value is +-0.0.
+      // If biased_exponent is 0x67..0x70, then it is a float16_t denormal.
+      if (ieee_repr.biased_exponent >= 0x67
+	  && ieee_repr.biased_exponent <= 0x70)
+	{
+	  int n = ieee_repr.biased_exponent - 0x67;
+	  ieee_repr.mantissa = ((uint32_t{1} << n)
+				| (ieee_repr.mantissa >> (mantissa_bits - n)));
+	  ieee_repr.biased_exponent = 0;
+	}
+      // If biased_exponent is 0xff, then it is a float16_t inf or NaN.
+      else if (ieee_repr.biased_exponent == 0xff)
+	{
+	  ieee_repr.mantissa >>= 13;
+	  ieee_repr.biased_exponent = 0x1f;
+	}
+      // If biased_exponent is 0x71..0x8e, then it is a float16_t normal number.
+      else if (ieee_repr.biased_exponent > 0x70)
+	{
+	  ieee_repr.mantissa >>= 13;
+	  ieee_repr.biased_exponent -= 0x70;
+	}
+      return ieee_repr;
+    }
+
+  template<>
+    ieee_t<floating_type_bfloat16_t>
+    get_ieee_repr(const floating_type_bfloat16_t value)
+    {
+      using mantissa_t = typename floating_type_traits<float>::mantissa_t;
+      constexpr int mantissa_bits = floating_type_traits<float>::mantissa_bits;
+      constexpr int exponent_bits = floating_type_traits<float>::exponent_bits;
+
+      uint32_t value_bits = 0;
+      memcpy(&value_bits, &value.x, sizeof(value));
+
+      ieee_t<floating_type_bfloat16_t> ieee_repr;
+      ieee_repr.mantissa
+	= static_cast<mantissa_t>(value_bits & ((uint32_t{1} << mantissa_bits) - 1u));
+      value_bits >>= mantissa_bits;
+      ieee_repr.biased_exponent
+	= static_cast<uint32_t>(value_bits & ((uint32_t{1} << exponent_bits) - 1u));
+      value_bits >>= exponent_bits;
+      ieee_repr.sign = (value_bits & 1) != 0;
+      // We have mantissa and biased_exponent from the float (originally
+      // bfloat16_t converted to float).
+      // Transform that to bfloat16_t mantissa and biased_exponent.
+      ieee_repr.mantissa >>= 16;
+      return ieee_repr;
+    }
+
   // Invoke Ryu to obtain the shortest scientific form for the given
   // floating-point number.
   template<typename T>
@@ -493,7 +604,9 @@ namespace
       else if constexpr (std::is_same_v<T, double>)
 	return ryu::floating_to_fd64(value);
       else if constexpr (std::is_same_v<T, long double>
-			 || std::is_same_v<T, F128_type>)
+			 || std::is_same_v<T, F128_type>
+			 || std::is_same_v<T, floating_type_float16_t>
+			 || std::is_same_v<T, floating_type_bfloat16_t>)
 	{
 	  constexpr int mantissa_bits
 	    = floating_type_traits<T>::mantissa_bits;
@@ -678,6 +791,28 @@ template<typename T>
     return {{first, errc{}}};
   }
 
+template<>
+  optional<to_chars_result>
+  __handle_special_value<floating_type_float16_t>(char* first,
+						  char* const last,
+						  const floating_type_float16_t value,
+						  const chars_format fmt,
+						  const int precision)
+  {
+    return __handle_special_value(first, last, value.x, fmt, precision);
+  }
+
+template<>
+  optional<to_chars_result>
+  __handle_special_value<floating_type_bfloat16_t>(char* first,
+						   char* const last,
+						   const floating_type_bfloat16_t value,
+						   const chars_format fmt,
+						   const int precision)
+  {
+    return __handle_special_value(first, last, value.x, fmt, precision);
+  }
+
 // This subroutine of the floating-point to_chars overloads performs
 // hexadecimal formatting.
 template<typename T>
@@ -922,7 +1057,15 @@ template<typename T>
 			       chars_format fmt)
   {
     if (fmt == chars_format::hex)
-      return __floating_to_chars_hex(first, last, value, nullopt);
+      {
+	// std::bfloat16_t has the same exponent range as std::float32_t
+	// and so we can avoid instantiation of __floating_to_chars_hex
+	// for bfloat16_t.  Shortest hex will be the same as for float.
+	if constexpr (is_same_v<T, floating_type_bfloat16_t>)
+	  return __floating_to_chars_hex(first, last, value.x, nullopt);
+	else
+	  return __floating_to_chars_hex(first, last, value, nullopt);
+      }
 
     __glibcxx_assert(fmt == chars_format::fixed
 		     || fmt == chars_format::scientific
@@ -1662,6 +1805,23 @@ to_chars(char* first, char* last, __floa
 }
 #endif
 
+// Entrypoints for 16-bit floats.
+[[gnu::cold]] to_chars_result
+__to_chars_float16_t(char* first, char* last, float value,
+		     chars_format fmt) noexcept
+{
+  return __floating_to_chars_shortest(first, last,
+				      floating_type_float16_t{ value }, fmt);
+}
+
+[[gnu::cold]] to_chars_result
+__to_chars_bfloat16_t(char* first, char* last, float value,
+		      chars_format fmt) noexcept
+{
+  return __floating_to_chars_shortest(first, last,
+				      floating_type_bfloat16_t{ value }, fmt);
+}
+
 #ifdef _GLIBCXX_LONG_DOUBLE_COMPAT
 // Map the -mlong-double-64 long double overloads to the double overloads.
 extern "C" to_chars_result

	Jakub

[-- Attachment #2: 2.C --]
[-- Type: text/plain, Size: 3010 bytes --]

#include <array>
#include <charconv>
#include <stdfloat>
#include <iostream>
#include <string_view>
#include <system_error>

template<typename T>
void
test(std::chars_format fmt = std::chars_format{})
{
  std::array<char, 64> str1, str2;
  union U { unsigned short s; T f; } u, v;
  for (int i = 0; i <= (unsigned short) ~0; ++i)
    {
      u.s = i;
      auto [ptr1, ec1] = std::to_chars(str1.data(), str1.data() + str1.size(), u.f, fmt);
      auto [ptr2, ec2] = std::to_chars(str2.data(), str2.data() + str2.size(), std::float32_t(u.f), fmt);
      if (ec1 != std::errc())
	{
	  std::cout << std::make_error_code(ec1).message() << '\n';
	  continue;
	}
      else if (ec2 != std::errc())
	{
	  std::cout << std::make_error_code(ec2).message() << '\n';
	  continue;
	}
      std::cout << i << ' ' << std::string_view (str1.data(), ptr1) << '\t' << std::string_view (str2.data(), ptr2) << '\n';
      if (fmt == std::chars_format::fixed)
	{
	  auto [ptr3, ec3] = std::to_chars(str1.data(), ptr1, u.f, fmt);
	  if (ec3 != std::errc() || ptr3 != ptr1)
	    throw "Consistency failure";
	  else
	    {
	      auto [ptr4, ec4] = std::to_chars(str1.data(), ptr1 - 1, u.f, fmt);
	      if (ec4 == std::errc())
		throw "Consistency failure";
	    }
	}
      auto [ptr5, ec5] = std::to_chars(str1.data(), str1.data() + str1.size(), u.f, fmt);
      std::float32_t f;
      auto [ptr6, ec6] = std::from_chars(str1.data(), ptr5, f, fmt == std::chars_format{} ? std::chars_format::general : fmt);
      if (ec6 != std::errc())
	{
	  std::cout << std::make_error_code(ec6).message() << '\n';
	  continue;
	}
      v.f = T(f);
      if (u.s != v.s)
	{
	  auto [ptr7, ec7] = std::to_chars(str2.data(), str2.data() + str2.size(), v.f, fmt);
	  if (ec7 != std::errc())
	    {
	      std::cout << std::make_error_code(ec7).message() << '\n';
	      continue;
	    }
	  std::cout << "Difference on " << i << ' ' << u.s << ' ' << v.s << ' ' << std::string_view (str1.data(), ptr5) << '\t' << std::string_view (str2.data(), ptr7) << '\n';
	}
    }
}

int
main()
{
#ifdef __STDCPP_FLOAT16_T__
  std::cout << "float16_t" << '\n';
  test<std::float16_t>();
  std::cout << "float16_t fixed" << '\n';
  test<std::float16_t>(std::chars_format::fixed);
  std::cout << "float16_t scientific" << '\n';
  test<std::float16_t>(std::chars_format::scientific);
  std::cout << "float16_t general" << '\n';
  test<std::float16_t>(std::chars_format::general);
  std::cout << "float16_t hex" << '\n';
  test<std::float16_t>(std::chars_format::hex);
#endif
#ifdef __STDCPP_BFLOAT16_T__
  std::cout << "bfloat16_t" << '\n';
  test<std::bfloat16_t>();
  std::cout << "bfloat16_t fixed" << '\n';
  test<std::bfloat16_t>(std::chars_format::fixed);
  std::cout << "bfloat16_t scientific" << '\n';
  test<std::bfloat16_t>(std::chars_format::scientific);
  std::cout << "bfloat16_t general" << '\n';
  test<std::bfloat16_t>(std::chars_format::general);
  std::cout << "bfloat16_t hex" << '\n';
  test<std::bfloat16_t>(std::chars_format::hex);
#endif
}

[-- Attachment #3: 4.C --]
[-- Type: text/plain, Size: 1503 bytes --]

#include <charconv>
#include <iostream>
 
int
main()
{
  char buffer[1000];
#define CHECK_POW10(N) \
  do { \
    auto ec = std::to_chars(buffer, buffer+1000, 1e##N##bf16, std::chars_format::fixed); \
    std::cout << (buffer[0] == '9' ? '1' : '0'); \
  } while (0);
 
  CHECK_POW10(0);
  CHECK_POW10(1);
  CHECK_POW10(2);
  CHECK_POW10(3);
  CHECK_POW10(4);
  CHECK_POW10(5);
  CHECK_POW10(6);
  CHECK_POW10(7);
  CHECK_POW10(8);
  CHECK_POW10(9);
  CHECK_POW10(10);
  CHECK_POW10(11);
  CHECK_POW10(12);
  CHECK_POW10(13);
  CHECK_POW10(14);
  CHECK_POW10(15);
  CHECK_POW10(16);
  CHECK_POW10(17);
  CHECK_POW10(18);
  CHECK_POW10(19);
  CHECK_POW10(20);
  CHECK_POW10(21);
  CHECK_POW10(22);
  CHECK_POW10(23);
  CHECK_POW10(24);
  CHECK_POW10(25);
  CHECK_POW10(26);
  CHECK_POW10(27);
  CHECK_POW10(28);
  CHECK_POW10(29);
  CHECK_POW10(30);
  CHECK_POW10(31);
  CHECK_POW10(32);
  CHECK_POW10(33);
  CHECK_POW10(34);
  CHECK_POW10(35);
  CHECK_POW10(36);
  CHECK_POW10(37);
  CHECK_POW10(38);
  CHECK_POW10(39);
  CHECK_POW10(40);
  CHECK_POW10(41);
  CHECK_POW10(42);
  CHECK_POW10(43);
  CHECK_POW10(44);
  CHECK_POW10(45);
  CHECK_POW10(46);
  CHECK_POW10(47);
  CHECK_POW10(48);
  CHECK_POW10(49);
  CHECK_POW10(50);
  CHECK_POW10(51);
  CHECK_POW10(52);
  CHECK_POW10(53);
  CHECK_POW10(54);
  CHECK_POW10(55);
  CHECK_POW10(56);
  CHECK_POW10(57);
  CHECK_POW10(58);
  CHECK_POW10(59);
  CHECK_POW10(60);
  CHECK_POW10(61);
  CHECK_POW10(62);
  CHECK_POW10(63);
  std::cout << std::endl;
}


* Re: [PATCH] libstdc++: std::to_chars std::{,b}float16_t support
  2022-10-27  7:59 [PATCH] libstdc++: std::to_chars std::{,b}float16_t support Jakub Jelinek
@ 2022-10-28 16:52 ` Patrick Palka
  2022-10-28 17:16   ` Jakub Jelinek
  2022-11-01 12:18   ` [PATCH] libstdc++: Shortest denormal hex std::to_chars Jakub Jelinek
  2022-11-01 12:22 ` [PATCH] libstdc++: std::to_chars std::{,b}float16_t support Jonathan Wakely
  1 sibling, 2 replies; 7+ messages in thread
From: Patrick Palka @ 2022-10-28 16:52 UTC
  To: Jakub Jelinek; +Cc: Jonathan Wakely, Patrick Palka, gcc-patches, libstdc++

On Thu, 27 Oct 2022, Jakub Jelinek wrote:

> Hi!
> 
> The following patch on top of
> https://gcc.gnu.org/pipermail/libstdc++/2022-October/054849.html
> adds std::{,b}float16_t support for std::to_chars.
> When precision is specified (or for std::bfloat16_t for hex mode even if not),
> I believe we can just use the std::to_chars float (when float is mode
> compatible with std::float32_t) overloads, both formats are proper subsets
> of std::float32_t.
> Unfortunately when precision is not specified and we are supposed to emit
> shortest string, the std::{,b}float16_t strings are usually much shorter.
> E.g. 1.e7p-14f16 shortest fixed representation is
> 0.0001161 and shortest scientific representation is
> 1.161e-04 while 1.e7p-14f32 (same number promoted to std::float32_t)
> 0.00011610985 and
> 1.1610985e-04.
> Similarly for 1.38p-112bf16,
> 0.000000000000000000000000000000000235
> 2.35e-34 vs. 1.38p-112f32
> 0.00000000000000000000000000000000023472271
> 2.3472271e-34
> For std::float16_t there are differences even in the shortest hex, say:
> 0.01p-14 vs. 1p-22
> but only for denormal std::float16_t values (where all std::float16_t
> denormals converted to std::float32_t are normal), __FLT16_MIN__ and
> everything larger in absolute value than that is the same.  Unless
> that is a bug and we should try to discover shorter representations
> even for denormals...

IIRC for hex formatting of denormals I opted to be consistent with how
glibc printf formats them, instead of outputting the truly shortest
form.

I wouldn't be against using the float32 overloads even for shortest hex
formatting of float16.  The output is shorter but equivalent so it
shouldn't cause any problems.

> std::bfloat16_t has the same exponent range as std::float32_t, so all
> std::bfloat16_t denormals are also std::float32_t denormals and thus
> the shortest hex representations are the same.
> 
> As documented, ryu can handle arbitrary IEEE like floating point formats
> (probably not wider than IEEE quad) using the generic_128 handling, but
> ryu is hidden in libstdc++.so.  As only few architectures support
> std::float16_t right now and some of them have special ISA requirements
> for those (e.g. on i?86 one needs -msse2) and std::bfloat16_t is right
> now supported only on x86 (again with -msse2), perhaps with aarch64/arm
> coming next if ARM is interested, but I think it is possible that more
> will be added later, instead of exporting APIs from the library to handle
> directly the std::{,b}float16_t overloads this patch instead exports
> functions which take a float which is a superset of those and expects
> the inline overloads to promote the 16-bit formats to 32-bit, then inside
> of the library it ensures they are printed right.
> With the added [[gnu::cold]] attribute because I think most users
> will primarily use these formats as storage formats and perform arithmetics
> in the excess precision for them and print also as std::float32_t the
> added support doesn't seem to be too large, on x86_64:
> readelf -Ws libstdc++.so.6.0.31 | grep float16_t
>    912: 00000000000ae824   950 FUNC    GLOBAL DEFAULT   13 _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format@@GLIBCXX_3.4.31
>   5767: 00000000000ae4a1   899 FUNC    GLOBAL DEFAULT   13 _ZSt20__to_chars_float16_tPcS_fSt12chars_format@@GLIBCXX_3.4.31
>    842: 000000000016d430   106 FUNC    LOCAL  DEFAULT   13 _ZN12_GLOBAL__N_113get_ieee_reprINS_23floating_type_float16_tEEENS_6ieee_tIT_EES3_
>    865: 0000000000170980  1613 FUNC    LOCAL  DEFAULT   13 _ZSt23__floating_to_chars_hexIN12_GLOBAL__N_123floating_type_float16_tEESt15to_chars_resultPcS3_T_St8optionalIiE.constprop.0.isra.0
>   7205: 00000000000ae824   950 FUNC    GLOBAL DEFAULT   13 _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format
>   7985: 00000000000ae4a1   899 FUNC    GLOBAL DEFAULT   13 _ZSt20__to_chars_float16_tPcS_fSt12chars_format
> so 3568 code bytes together or so.

Ouch, the instantiation of __floating_to_chars_hex for float16 is
responsible for nearly 50% of the .so size increase (1613 of the 3568
bytes).

> 
> Tested with the attached test (which doesn't prove the shortest
> representation, just prints std::{,b}float16_t and std::float32_t
> shortest strings side by side, then tries to verify it can be
> emitted even into the exact sized range and can't be into range
> one smaller than that and tries to read what is printed
> back using from_chars float32_t overload (so there could be
> double rounding, but apparently there is none for the shortest strings).
> The only differences printed are for NaNs, where sNaNs are canonicalized
> to canonical qNaNs and as to_chars doesn't print NaN mantissa, even qNaNs
> other than the canonical one are read back just as the canonical NaN.
> 
> Also attaching what Patrick wrote to generate the pow10_adjustment_tab,
> for std::float16_t only 1.0, 10.0, 100.0, 1000.0 and 10000.0 are powers
> of 10 in the range because __FLT16_MAX__ is 65504.0, and all of the above
> are exactly representable in std::float16_t, so we want to use 0 in
> pow10_adjustment_tab.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2022-10-27  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* include/std/charconv (__to_chars_float16_t, __to_chars_bfloat16_t):
> 	Declare.
> 	(to_chars): Add _Float16 and __gnu_cxx::__bfloat16_t overloads.
> 	* config/abi/pre/gnu.ver (GLIBCXX_3.4.31): Export
> 	_ZSt20__to_chars_float16_tPcS_fSt12chars_format and
> 	_ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format.
> 	* src/c++17/floating_to_chars.cc (floating_type_float16_t,
> 	floating_type_bfloat16_t): New types.
> 	(floating_type_traits<floating_type_float16_t>,
> 	floating_type_traits<floating_type_bfloat16_t>,
> 	get_ieee_repr<floating_type_float16_t>,
> 	get_ieee_repr<floating_type_bfloat16_t>,
> 	__handle_special_value<floating_type_float16_t>,
> 	__handle_special_value<floating_type_bfloat16_t>): New specializations.
> 	(floating_to_shortest_scientific): Handle floating_type_float16_t
> 	and floating_type_bfloat16_t like IEEE quad.
> 	(__floating_to_chars_shortest): For floating_type_bfloat16_t call
> 	__floating_to_chars_hex<float> rather than
> 	__floating_to_chars_hex<floating_type_bfloat16_t> to avoid
> 	instantiating the latter.
> 	(__to_chars_float16_t, __to_chars_bfloat16_t): New functions.
> 
> --- libstdc++-v3/include/std/charconv.jj	2022-10-26 13:50:40.334716005 +0200
> +++ libstdc++-v3/include/std/charconv	2022-10-26 14:19:46.523769686 +0200
> @@ -738,6 +738,32 @@ namespace __detail
>    to_chars_result to_chars(char* __first, char* __last, long double __value,
>  			   chars_format __fmt, int __precision) noexcept;
>  
> +  // Library routines for 16-bit extended floating point formats
> +  // using float as interchange format.
> +  to_chars_result __to_chars_float16_t(char* __first, char* __last,
> +				       float __value,
> +				       chars_format __fmt) noexcept;
> +  to_chars_result __to_chars_bfloat16_t(char* __first, char* __last,
> +					float __value,
> +					chars_format __fmt) noexcept;
> +
> +#if defined(__STDCPP_FLOAT16_T__) && defined(_GLIBCXX_FLOAT_IS_IEEE_BINARY32)
> +  inline to_chars_result
> +  to_chars(char* __first, char* __last, _Float16 __value) noexcept
> +  {
> +    return __to_chars_float16_t(__first, __last, float(__value),
> +				chars_format{});
> +  }
> +  inline to_chars_result
> +  to_chars(char* __first, char* __last, _Float16 __value,
> +	   chars_format __fmt) noexcept
> +  { return __to_chars_float16_t(__first, __last, float(__value), __fmt); }
> +  inline to_chars_result
> +  to_chars(char* __first, char* __last, _Float16 __value,
> +	   chars_format __fmt, int __precision) noexcept
> +  { return to_chars(__first, __last, float(__value), __fmt, __precision); }

FWIW when formatting as hex with explicit precision, the output is based
off of the shortest hex form, so going through the float32 overloads here
will mean that

  to_chars(1p-22f16, hex, 2)

outputs 0.01p-14 instead of 1.00p-22 I think.  But again this difference
in denormal hex output shouldn't cause any problems.

> +#endif
> +
>  #if defined(__STDCPP_FLOAT32_T__) && defined(_GLIBCXX_FLOAT_IS_IEEE_BINARY32)
>    inline to_chars_result
>    to_chars(char* __first, char* __last, _Float32 __value) noexcept
> @@ -784,6 +810,24 @@ namespace __detail
>  		    __precision);
>    }
>  #endif
> +
> +#if defined(__STDCPP_BFLOAT16_T__) && defined(_GLIBCXX_FLOAT_IS_IEEE_BINARY32)
> +  inline to_chars_result
> +  to_chars(char* __first, char* __last,
> +	   __gnu_cxx::__bfloat16_t __value) noexcept
> +  {
> +    return __to_chars_bfloat16_t(__first, __last, float(__value),
> +				 chars_format{});
> +  }
> +  inline to_chars_result
> +  to_chars(char* __first, char* __last, __gnu_cxx::__bfloat16_t __value,
> +	   chars_format __fmt) noexcept
> +  { return __to_chars_bfloat16_t(__first, __last, float(__value), __fmt); }
> +  inline to_chars_result
> +  to_chars(char* __first, char* __last, __gnu_cxx::__bfloat16_t __value,
> +	   chars_format __fmt, int __precision) noexcept
> +  { return to_chars(__first, __last, float(__value), __fmt, __precision); }
> +#endif
>  #endif
>  
>  _GLIBCXX_END_NAMESPACE_VERSION
> --- libstdc++-v3/config/abi/pre/gnu.ver.jj	2022-09-12 11:30:14.211870202 +0200
> +++ libstdc++-v3/config/abi/pre/gnu.ver	2022-10-26 16:11:53.146300799 +0200
> @@ -2446,6 +2446,8 @@ GLIBCXX_3.4.30 {
>  
>  GLIBCXX_3.4.31 {
>      _ZNSt7__cxx1112basic_stringI[cw]St11char_traitsI[cw]ESaI[cw]EE15_M_replace_cold*;
> +    _ZSt20__to_chars_float16_tPcS_fSt12chars_format;
> +    _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format;
>  } GLIBCXX_3.4.30;
>  
>  # Symbols in the support library (libsupc++) have their own tag.
> --- libstdc++-v3/src/c++17/floating_to_chars.cc.jj	2022-05-20 11:45:18.042741567 +0200
> +++ libstdc++-v3/src/c++17/floating_to_chars.cc	2022-10-26 22:54:04.890144587 +0200
> @@ -374,6 +374,44 @@ namespace
>      };
>  #endif
>  
> +  // Wrappers around float for std::{,b}float16_t promoted to float.
> +  struct floating_type_float16_t
> +  {
> +    float x;
> +    operator float() const { return x; }
> +  };
> +  struct floating_type_bfloat16_t
> +  {
> +    float x;
> +    operator float() const { return x; }
> +  };
> +
> +  template<>
> +    struct floating_type_traits<floating_type_float16_t>
> +    {
> +      static constexpr int mantissa_bits = 10;
> +      static constexpr int exponent_bits = 5;
> +      static constexpr bool has_implicit_leading_bit = true;
> +      using mantissa_t = uint32_t;
> +      using shortest_scientific_t = ryu::floating_decimal_128;
> +
> +      static constexpr uint64_t pow10_adjustment_tab[]
> +	= { 0 };
> +    };
> +
> +  template<>
> +    struct floating_type_traits<floating_type_bfloat16_t>
> +    {
> +      static constexpr int mantissa_bits = 7;
> +      static constexpr int exponent_bits = 8;
> +      static constexpr bool has_implicit_leading_bit = true;
> +      using mantissa_t = uint32_t;
> +      using shortest_scientific_t = ryu::floating_decimal_128;
> +
> +      static constexpr uint64_t pow10_adjustment_tab[]
> +	= { 0b0000111001110001101010010110100101010010000000000000000000000000 };
> +    };
> +
>    // An IEEE-style decomposition of a floating-point value of type T.
>    template<typename T>
>      struct ieee_t
> @@ -482,6 +520,79 @@ namespace
>      }
>  #endif
>  
> +  template<>
> +    ieee_t<floating_type_float16_t>
> +    get_ieee_repr(const floating_type_float16_t value)
> +    {
> +      using mantissa_t = typename floating_type_traits<float>::mantissa_t;
> +      constexpr int mantissa_bits = floating_type_traits<float>::mantissa_bits;
> +      constexpr int exponent_bits = floating_type_traits<float>::exponent_bits;
> +
> +      uint32_t value_bits = 0;
> +      memcpy(&value_bits, &value.x, sizeof(value));
> +
> +      ieee_t<floating_type_float16_t> ieee_repr;
> +      ieee_repr.mantissa
> +	= static_cast<mantissa_t>(value_bits & ((uint32_t{1} << mantissa_bits) - 1u));
> +      value_bits >>= mantissa_bits;
> +      ieee_repr.biased_exponent
> +	= static_cast<uint32_t>(value_bits & ((uint32_t{1} << exponent_bits) - 1u));
> +      value_bits >>= exponent_bits;
> +      ieee_repr.sign = (value_bits & 1) != 0;
> +      // We have mantissa and biased_exponent from the float (originally
> +      // float16_t converted to float).
> +      // Transform that to float16_t mantissa and biased_exponent.
> +      // If biased_exponent is 0, then value is +-0.0.
> +      // If biased_exponent is 0x67..0x70, then it is a float16_t denormal.
> +      if (ieee_repr.biased_exponent >= 0x67
> +	  && ieee_repr.biased_exponent <= 0x70)
> +	{
> +	  int n = ieee_repr.biased_exponent - 0x67;
> +	  ieee_repr.mantissa = ((uint32_t{1} << n)
> +				| (ieee_repr.mantissa >> (mantissa_bits - n)));
> +	  ieee_repr.biased_exponent = 0;
> +	}
> +      // If biased_exponent is 0xff, then it is a float16_t inf or NaN.
> +      else if (ieee_repr.biased_exponent == 0xff)
> +	{
> +	  ieee_repr.mantissa >>= 13;
> +	  ieee_repr.biased_exponent = 0x1f;
> +	}
> +      // If biased_exponent is 0x71..0x8e, then it is a float16_t normal number.
> +      else if (ieee_repr.biased_exponent > 0x70)
> +	{
> +	  ieee_repr.mantissa >>= 13;
> +	  ieee_repr.biased_exponent -= 0x70;
> +	}
> +      return ieee_repr;
> +    }
> +
> +  template<>
> +    ieee_t<floating_type_bfloat16_t>
> +    get_ieee_repr(const floating_type_bfloat16_t value)
> +    {
> +      using mantissa_t = typename floating_type_traits<float>::mantissa_t;
> +      constexpr int mantissa_bits = floating_type_traits<float>::mantissa_bits;
> +      constexpr int exponent_bits = floating_type_traits<float>::exponent_bits;
> +
> +      uint32_t value_bits = 0;
> +      memcpy(&value_bits, &value.x, sizeof(value));
> +
> +      ieee_t<floating_type_bfloat16_t> ieee_repr;
> +      ieee_repr.mantissa
> +	= static_cast<mantissa_t>(value_bits & ((uint32_t{1} << mantissa_bits) - 1u));
> +      value_bits >>= mantissa_bits;
> +      ieee_repr.biased_exponent
> +	= static_cast<uint32_t>(value_bits & ((uint32_t{1} << exponent_bits) - 1u));
> +      value_bits >>= exponent_bits;
> +      ieee_repr.sign = (value_bits & 1) != 0;
> +      // We have mantissa and biased_exponent from the float (originally
> +      // bfloat16_t converted to float).
> +      // Transform that to bfloat16_t mantissa and biased_exponent.
> +      ieee_repr.mantissa >>= 16;
> +      return ieee_repr;
> +    }
> +
>    // Invoke Ryu to obtain the shortest scientific form for the given
>    // floating-point number.
>    template<typename T>
> @@ -493,7 +604,9 @@ namespace
>        else if constexpr (std::is_same_v<T, double>)
>  	return ryu::floating_to_fd64(value);
>        else if constexpr (std::is_same_v<T, long double>
> -			 || std::is_same_v<T, F128_type>)
> +			 || std::is_same_v<T, F128_type>
> +			 || std::is_same_v<T, floating_type_float16_t>
> +			 || std::is_same_v<T, floating_type_bfloat16_t>)
>  	{
>  	  constexpr int mantissa_bits
>  	    = floating_type_traits<T>::mantissa_bits;
> @@ -678,6 +791,28 @@ template<typename T>
>      return {{first, errc{}}};
>    }
>  
> +template<>
> +  optional<to_chars_result>
> +  __handle_special_value<floating_type_float16_t>(char* first,
> +						  char* const last,
> +						  const floating_type_float16_t value,
> +						  const chars_format fmt,
> +						  const int precision)
> +  {
> +    return __handle_special_value(first, last, value.x, fmt, precision);
> +  }
> +
> +template<>
> +  optional<to_chars_result>
> +  __handle_special_value<floating_type_bfloat16_t>(char* first,
> +						   char* const last,
> +						   const floating_type_bfloat16_t value,
> +						   const chars_format fmt,
> +						   const int precision)
> +  {
> +    return __handle_special_value(first, last, value.x, fmt, precision);
> +  }
> +
>  // This subroutine of the floating-point to_chars overloads performs
>  // hexadecimal formatting.
>  template<typename T>
> @@ -922,7 +1057,15 @@ template<typename T>
>  			       chars_format fmt)
>    {
>      if (fmt == chars_format::hex)
> -      return __floating_to_chars_hex(first, last, value, nullopt);
> +      {
> +	// std::bfloat16_t has the same exponent range as std::float32_t
> +	// and so we can avoid instantiation of __floating_to_chars_hex
> +	// for bfloat16_t.  Shortest hex will be the same as for float.
> +	if constexpr (is_same_v<T, floating_type_bfloat16_t>)
> +	  return __floating_to_chars_hex(first, last, value.x, nullopt);

In light of the above, I'm inclined to suggest we might as well go
through float for the shortest hex formatting of float16 too.

> +	else
> +	  return __floating_to_chars_hex(first, last, value, nullopt);
> +      }
>  
>      __glibcxx_assert(fmt == chars_format::fixed
>  		     || fmt == chars_format::scientific
> @@ -1662,6 +1805,23 @@ to_chars(char* first, char* last, __floa
>  }
>  #endif
>  
> +// Entrypoints for 16-bit floats.
> +[[gnu::cold]] to_chars_result
> +__to_chars_float16_t(char* first, char* last, float value,
> +		     chars_format fmt) noexcept
> +{
> +  return __floating_to_chars_shortest(first, last,
> +				      floating_type_float16_t{ value }, fmt);
> +}
> +
> +[[gnu::cold]] to_chars_result
> +__to_chars_bfloat16_t(char* first, char* last, float value,
> +		      chars_format fmt) noexcept
> +{
> +  return __floating_to_chars_shortest(first, last,
> +				      floating_type_bfloat16_t{ value }, fmt);
> +}
> +
>  #ifdef _GLIBCXX_LONG_DOUBLE_COMPAT
>  // Map the -mlong-double-64 long double overloads to the double overloads.
>  extern "C" to_chars_result
> 
> 	Jakub
> 
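
For anyone who wants to poke at the float -> float16 field mapping done by
the get_ieee_repr specialization quoted above, here is a minimal standalone
sketch of the same case analysis.  The names f16_fields and
float16_fields_from_float are made up for illustration and are not part of
the patch; it assumes float is IEEE binary32 and that the argument holds a
value exactly representable in binary16.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for ieee_t<floating_type_float16_t>.
struct f16_fields
{
  std::uint32_t mantissa;         // 10-bit binary16 mantissa field
  std::uint32_t biased_exponent;  // 5-bit binary16 exponent field
  bool sign;
};

// Assumption: x is a binary16 value that was promoted to binary32.
f16_fields float16_fields_from_float(float x)
{
  std::uint32_t bits = 0;
  static_assert(sizeof bits == sizeof x, "float must be 32 bits wide");
  std::memcpy(&bits, &x, sizeof bits);

  f16_fields r;
  r.mantissa = bits & 0x7fffffu;             // 23-bit binary32 mantissa
  r.biased_exponent = (bits >> 23) & 0xffu;  // 8-bit binary32 exponent
  r.sign = (bits >> 31) != 0;

  if (r.biased_exponent >= 0x67 && r.biased_exponent <= 0x70)
    {
      // binary16 subnormal: the leading 1 that is implicit in the
      // binary32 encoding becomes an explicit mantissa bit.
      const int n = static_cast<int>(r.biased_exponent) - 0x67;
      r.mantissa = (std::uint32_t{1} << n) | (r.mantissa >> (23 - n));
      r.biased_exponent = 0;
    }
  else if (r.biased_exponent == 0xff)
    {
      // Infinity or NaN.
      r.mantissa >>= 13;
      r.biased_exponent = 0x1f;
    }
  else if (r.biased_exponent > 0x70)
    {
      // binary16 normal: drop 13 mantissa bits and rebias (127 - 15 = 0x70).
      r.mantissa >>= 13;
      r.biased_exponent -= 0x70;
    }
  // Otherwise biased_exponent is 0 (+-0.0) and the fields are already right.
  return r;
}
```

With this, 1.0f maps to exponent field 15 and mantissa 0, the smallest
float16 subnormal (2^-24) maps to exponent field 0 and mantissa 1, and
65504.0f maps to exponent field 0x1e and mantissa 0x3ff.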



* Re: [PATCH] libstdc++: std::to_chars std::{,b}float16_t support
  2022-10-28 16:52 ` Patrick Palka
@ 2022-10-28 17:16   ` Jakub Jelinek
  2022-11-01 12:18   ` [PATCH] libstdc++: Shortest denormal hex std::to_chars Jakub Jelinek
  1 sibling, 0 replies; 7+ messages in thread
From: Jakub Jelinek @ 2022-10-28 17:16 UTC (permalink / raw)
  To: Patrick Palka; +Cc: Jonathan Wakely, gcc-patches, libstdc++

On Fri, Oct 28, 2022 at 12:52:44PM -0400, Patrick Palka wrote:
> IIRC for hex formatting of denormals I opted to be consistent with how
> glibc printf formats them, instead of outputting the truly shortest
> form.

Note, it isn't just denormals:
1.18cp-4
2.318p-5
4.63p-6
8.c6p-7
463p-14
8c6p-15
all represent the same number.  The first is what glibc emits (and
is certainly nicer to read), but some of the others are shorter.

Now, the printf %a/%A documentation says that there must be one hexadecimal
digit before the dot if any and that for normalized numbers it must be
non-zero.
So that rules out the last 2, and allows but doesn't require the denormal
treatment the library does right now.
If we really want the shortest form, we should handle denormals with a
non-zero leading digit too, and in all cases consider which of the 4 shifting
possibilities results in the shortest string (perhaps preferring the smallest
non-zero leading digit among the shortest)?
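
To make the 4 shifting possibilities concrete, here is a minimal sketch in
plain C++.  It is not the libstdc++ code; hex_candidate and shortest_hex are
made-up names.  It formats sig * 2^exp2 with the leading digit in 1, 2-3,
4-7 or 8-f and keeps the shortest string, preferring the smaller leading
digit on ties.  It assumes sig is non-zero and at most 61 bits wide.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats sig * 2^exp2 so that the leading 1 bit of
// sig lands on bit k (0..3) of the hex digit before the dot.
std::string hex_candidate(std::uint64_t sig, int exp2, int k)
{
  int width = 0;
  for (std::uint64_t t = sig; t != 0; t >>= 1)
    ++width;

  // Put the leading 1 at bit 60 + k, so the top nibble is the leading digit
  // and the remaining 15 nibbles are the fraction.
  const std::uint64_t m = sig << (60 + k - (width - 1));
  const int exponent = exp2 + (width - 1) - k;

  char buf[17];
  std::snprintf(buf, sizeof buf, "%016llx",
                static_cast<unsigned long long>(m));
  std::string digits(buf);
  while (digits.size() > 1 && digits.back() == '0')
    digits.pop_back();  // strip trailing zero hexits of the fraction

  std::string out(1, digits[0]);
  if (digits.size() > 1)
    out += "." + digits.substr(1);
  out += exponent < 0 ? "p" : "p+";
  out += std::to_string(exponent);
  return out;
}

// Tries all 4 shifts; a strictly shorter candidate wins, so on ties the
// smallest leading digit is kept.
std::string shortest_hex(std::uint64_t sig, int exp2)
{
  std::string best = hex_candidate(sig, exp2, 0);
  for (int k = 1; k < 4; ++k)
    {
      std::string s = hex_candidate(sig, exp2, k);
      if (s.size() < best.size())
        best = s;
    }
  return best;
}
```

For sig = 0x118c and exp2 = -16 (the number above) the four candidates are
1.18cp-4, 2.318p-5, 4.63p-6 and 8.c6p-7, and the sketch picks 4.63p-6.
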
> > readelf -Ws libstdc++.so.6.0.31 | grep float16_t
> >    912: 00000000000ae824   950 FUNC    GLOBAL DEFAULT   13 _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format@@GLIBCXX_3.4.31
> >   5767: 00000000000ae4a1   899 FUNC    GLOBAL DEFAULT   13 _ZSt20__to_chars_float16_tPcS_fSt12chars_format@@GLIBCXX_3.4.31
> >    842: 000000000016d430   106 FUNC    LOCAL  DEFAULT   13 _ZN12_GLOBAL__N_113get_ieee_reprINS_23floating_type_float16_tEEENS_6ieee_tIT_EES3_
> >    865: 0000000000170980  1613 FUNC    LOCAL  DEFAULT   13 _ZSt23__floating_to_chars_hexIN12_GLOBAL__N_123floating_type_float16_tEESt15to_chars_resultPcS3_T_St8optionalIiE.constprop.0.isra.0
> >   7205: 00000000000ae824   950 FUNC    GLOBAL DEFAULT   13 _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format
> >   7985: 00000000000ae4a1   899 FUNC    GLOBAL DEFAULT   13 _ZSt20__to_chars_float16_tPcS_fSt12chars_format
> > so 3568 code bytes together or so.
> 
> Ouch, the instantiation of __floating_to_chars_hex for float16 is
> responsible for nearly 50% of the .so size increase

True, but the increase isn't that huge.

	Jakub



* [PATCH] libstdc++: Shortest denormal hex std::to_chars
  2022-10-28 16:52 ` Patrick Palka
  2022-10-28 17:16   ` Jakub Jelinek
@ 2022-11-01 12:18   ` Jakub Jelinek
  2022-11-01 12:24     ` Jonathan Wakely
  1 sibling, 1 reply; 7+ messages in thread
From: Jakub Jelinek @ 2022-11-01 12:18 UTC (permalink / raw)
  To: Patrick Palka; +Cc: Jonathan Wakely, gcc-patches, libstdc++

On Fri, Oct 28, 2022 at 12:52:44PM -0400, Patrick Palka wrote:
> > The following patch on top of
> > https://gcc.gnu.org/pipermail/libstdc++/2022-October/054849.html
> > adds std::{,b}float16_t support for std::to_chars.
> > When precision is specified (or for std::bfloat16_t for hex mode even if not),
> > I believe we can just use the std::to_chars float (when float is mode
> > compatible with std::float32_t) overloads, both formats are proper subsets
> > of std::float32_t.
> > Unfortunately when precision is not specified and we are supposed to emit
> > shortest string, the std::{,b}float16_t strings are usually much shorter.
> > E.g. 1.e7p-14f16 shortest fixed representation is
> > 0.0001161 and shortest scientific representation is
> > 1.161e-04 while 1.e7p-14f32 (same number promoted to std::float32_t)
> > 0.00011610985 and
> > 1.1610985e-04.
> > Similarly for 1.38p-112bf16,
> > 0.000000000000000000000000000000000235
> > 2.35e-34 vs. 1.38p-112f32
> > 0.00000000000000000000000000000000023472271
> > 2.3472271e-34
> > For std::float16_t there are differences even in the shortest hex, say:
> > 0.01p-14 vs. 1p-22
> > but only for denormal std::float16_t values (where all std::float16_t
> > denormals converted to std::float32_t are normal), __FLT16_MIN__ and
> > everything larger in absolute value than that is the same.  Unless
> > that is a bug and we should try to discover shorter representations
> > even for denormals...
> 
> IIRC for hex formatting of denormals I opted to be consistent with how
> glibc printf formats them, instead of outputting the truly shortest
> form.
> 
> I wouldn't be against using the float32 overloads even for shortest hex
> formatting of float16.  The output is shorter but equivalent so it
> shouldn't cause any problems.

The following patch changes the behavior for shortest hex denormals,
such that they are printed like normals (so for has_implicit_leading_bit
formats 1p-149 instead of 0.000002p-126 etc., otherwise (Intel extended)
with the leading digit before the dot being [89abcdef]).  I think for all
the supported formats the result is never longer.  It can be of equal
length, e.g. 0.fffffep-126 vs. 1.fffffcp-127, but fortunately no largest
subnormal in any format has an unbiased exponent like -9, -99, -999 or
-9999, because then it would be longer.  Often it is shorter, sometimes
much shorter.

For the cases with precision it keeps the handling as is.
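
As a rough illustration of the canonicalization (a simplified sketch for
binary32 only, not the code in the patch; subnormal_float_hex is a made-up
name), the following takes the 23-bit mantissa field of a non-zero float
denormal, shifts it up until the leading bit is set and decreases the
exponent by the same amount:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: mantissa_field is the 23-bit mantissa of a binary32
// denormal (biased exponent 0).  Assumes 1 <= mantissa_field <= 0x7fffff.
std::string subnormal_float_hex(std::uint32_t mantissa_field)
{
  int exponent = -126;  // denormals encode 0.fraction * 2^-126

  int width = 0;
  for (std::uint32_t t = mantissa_field; t != 0; t >>= 1)
    ++width;

  // Shift the leading 1 up to bit 23, where a normal number would have
  // its implicit bit, and compensate in the exponent.
  const int shift = 24 - width;
  const std::uint32_t sig = mantissa_field << shift;
  exponent -= shift;

  // The 23 bits below the leading 1, padded to 24 bits (6 hexits).
  const std::uint32_t fraction = (sig & 0x7fffffu) << 1;
  char buf[7];
  std::snprintf(buf, sizeof buf, "%06x", static_cast<unsigned>(fraction));
  std::string digits(buf);
  while (!digits.empty() && digits.back() == '0')
    digits.pop_back();  // drop trailing zero hexits

  std::string out = "1";
  if (!digits.empty())
    out += "." + digits;
  out += "p" + std::to_string(exponent);  // exponent is always negative here
  return out;
}
```

Under those assumptions the min subnormal (mantissa field 1) comes out as
1p-149 and the max subnormal (mantissa field 0x7fffff) as 1.fffffcp-127,
matching the adjusted testcases.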

For !has_implicit_leading_bit formats, normals and (with this patch) also
denormals really do get the shortest representation.  For the other formats
we sometimes do not, but this patch doesn't deal with that: we always use
1.NNN, while we could use anything from 1.NNN up to f.NNN and thereby drop
the last hexit if the last hexit doesn't have its least significant bit set
and the unbiased exponent is not -9, -99, -999 or -9999.

Tested on x86_64-linux (on top of the 3 to/from_chars {,b}float16_t
patches).

2022-11-01  Jakub Jelinek  <jakub@redhat.com>

	* src/c++17/floating_to_chars.cc (__floating_to_chars_hex): Drop const
	from unbiased_exponent.  Canonicalize denormals such that they have
	the leading bit set by shifting effective mantissa up and decreasing
	unbiased_exponent.
	(__floating_to_chars_shortest): Don't instantiate
	__floating_to_chars_hex for float16_t either and use float instead.
	* testsuite/20_util/to_chars/float.cc (float_to_chars_test_cases):
	Adjust testcases for shortest hex denormals.
	* testsuite/20_util/to_chars/double.cc (double_to_chars_test_cases):
	Likewise.

--- libstdc++-v3/src/c++17/floating_to_chars.cc.jj	2022-10-31 22:20:35.881121902 +0100
+++ libstdc++-v3/src/c++17/floating_to_chars.cc	2022-11-01 12:16:14.352652455 +0100
@@ -844,9 +844,9 @@ template<typename T>
     const bool is_normal_number = (biased_exponent != 0);
 
     // Calculate the unbiased exponent.
-    const int32_t unbiased_exponent = (is_normal_number
-				       ? biased_exponent - exponent_bias
-				       : 1 - exponent_bias);
+    int32_t unbiased_exponent = (is_normal_number
+				 ? biased_exponent - exponent_bias
+				 : 1 - exponent_bias);
 
     // Shift the mantissa so that its bitwidth is a multiple of 4.
     constexpr unsigned rounded_mantissa_bits = (mantissa_bits + 3) / 4 * 4;
@@ -863,6 +863,16 @@ template<typename T>
 	  __glibcxx_assert(effective_mantissa & (mantissa_t{1} << (mantissa_bits
 								   - 1u)));
       }
+    else if (!precision.has_value() && effective_mantissa)
+      {
+	// 1.8p-23 is shorter than 0.00cp-14, so if precision is
+	// omitted, try to canonicalize denormals such that they
+	// have the leading bit set.
+	int width = __bit_width(effective_mantissa);
+	int shift = rounded_mantissa_bits - width + has_implicit_leading_bit;
+	unbiased_exponent -= shift;
+	effective_mantissa <<= shift;
+      }
 
     // Compute the shortest precision needed to print this value exactly,
     // disregarding trailing zeros.
@@ -1061,7 +1071,10 @@ template<typename T>
 	// std::bfloat16_t has the same exponent range as std::float32_t
 	// and so we can avoid instantiation of __floating_to_chars_hex
 	// for bfloat16_t.  Shortest hex will be the same as for float.
-	if constexpr (is_same_v<T, floating_type_bfloat16_t>)
+	// When we print shortest form even for denormals, we can do it
+	// for std::float16_t as well.
+	if constexpr (is_same_v<T, floating_type_float16_t>
+		      || is_same_v<T, floating_type_bfloat16_t>)
 	  return __floating_to_chars_hex(first, last, value.x, nullopt);
 	else
 	  return __floating_to_chars_hex(first, last, value, nullopt);
--- libstdc++-v3/testsuite/20_util/to_chars/float.cc.jj	2022-01-11 22:31:41.605755528 +0100
+++ libstdc++-v3/testsuite/20_util/to_chars/float.cc	2022-11-01 12:34:21.370882443 +0100
@@ -521,8 +521,8 @@ inline constexpr float_to_chars_testcase
 
     // Test hexfloat corner cases.
     {0x1.728p+0f, chars_format::hex, "1.728p+0"}, // instead of "2.e5p-1"
-    {0x0.000002p-126f, chars_format::hex, "0.000002p-126"}, // instead of "1p-149", min subnormal
-    {0x0.fffffep-126f, chars_format::hex, "0.fffffep-126"}, // max subnormal
+    {0x0.000002p-126f, chars_format::hex, "1p-149"}, // min subnormal
+    {0x0.fffffep-126f, chars_format::hex, "1.fffffcp-127"}, // max subnormal
     {0x1p-126f, chars_format::hex, "1p-126"}, // min normal
     {0x1.fffffep+127f, chars_format::hex, "1.fffffep+127"}, // max normal
 
--- libstdc++-v3/testsuite/20_util/to_chars/double.cc.jj	2022-01-11 22:31:41.604755542 +0100
+++ libstdc++-v3/testsuite/20_util/to_chars/double.cc	2022-11-01 12:42:39.753112522 +0100
@@ -2821,8 +2821,8 @@ inline constexpr double_to_chars_testcas
 
     // Test hexfloat corner cases.
     {0x1.728p+0, chars_format::hex, "1.728p+0"}, // instead of "2.e5p-1"
-    {0x0.0000000000001p-1022, chars_format::hex, "0.0000000000001p-1022"}, // instead of "1p-1074", min subnormal
-    {0x0.fffffffffffffp-1022, chars_format::hex, "0.fffffffffffffp-1022"}, // max subnormal
+    {0x0.0000000000001p-1022, chars_format::hex, "1p-1074"}, // min subnormal
+    {0x0.fffffffffffffp-1022, chars_format::hex, "1.ffffffffffffep-1023"}, // max subnormal
     {0x1p-1022, chars_format::hex, "1p-1022"}, // min normal
     {0x1.fffffffffffffp+1023, chars_format::hex, "1.fffffffffffffp+1023"}, // max normal
 

	Jakub



* Re: [PATCH] libstdc++: std::to_chars std::{,b}float16_t support
  2022-10-27  7:59 [PATCH] libstdc++: std::to_chars std::{,b}float16_t support Jakub Jelinek
  2022-10-28 16:52 ` Patrick Palka
@ 2022-11-01 12:22 ` Jonathan Wakely
  1 sibling, 0 replies; 7+ messages in thread
From: Jonathan Wakely @ 2022-11-01 12:22 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Patrick Palka, gcc-patches, libstdc++

On Thu, 27 Oct 2022 at 09:00, Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> The following patch on top of
> https://gcc.gnu.org/pipermail/libstdc++/2022-October/054849.html
> adds std::{,b}float16_t support for std::to_chars.
> When precision is specified (or for std::bfloat16_t for hex mode even if not),
> I believe we can just use the std::to_chars float (when float is mode
> compatible with std::float32_t) overloads, both formats are proper subsets
> of std::float32_t.
> Unfortunately when precision is not specified and we are supposed to emit
> shortest string, the std::{,b}float16_t strings are usually much shorter.
> E.g. 1.e7p-14f16 shortest fixed representation is
> 0.0001161 and shortest scientific representation is
> 1.161e-04 while 1.e7p-14f32 (same number promoted to std::float32_t)
> 0.00011610985 and
> 1.1610985e-04.
> Similarly for 1.38p-112bf16,
> 0.000000000000000000000000000000000235
> 2.35e-34 vs. 1.38p-112f32
> 0.00000000000000000000000000000000023472271
> 2.3472271e-34
> For std::float16_t there are differences even in the shortest hex, say:
> 0.01p-14 vs. 1p-22
> but only for denormal std::float16_t values (where all std::float16_t
> denormals converted to std::float32_t are normal), __FLT16_MIN__ and
> everything larger in absolute value than that is the same.  Unless
> that is a bug and we should try to discover shorter representations
> even for denormals...
> std::bfloat16_t has the same exponent range as std::float32_t, so all
> std::bfloat16_t denormals are also std::float32_t denormals and thus
> the shortest hex representations are the same.
>
> As documented, ryu can handle arbitrary IEEE like floating point formats
> (probably not wider than IEEE quad) using the generic_128 handling, but
> ryu is hidden in libstdc++.so.  As only few architectures support
> std::float16_t right now and some of them have special ISA requirements
> for those (e.g. on i?86 one needs -msse2) and std::bfloat16_t is right
> now supported only on x86 (again with -msse2), perhaps with aarch64/arm
> coming next if ARM is interested, but I think it is possible that more
> will be added later, instead of exporting APIs from the library to handle
> directly the std::{,b}float16_t overloads this patch instead exports
> functions which take a float which is a superset of those and expects
> the inline overloads to promote the 16-bit formats to 32-bit, then inside
> of the library it ensures they are printed right.
> With the added [[gnu::cold]] attribute because I think most users
> will primarily use these formats as storage formats and perform arithmetics
> in the excess precision for them and print also as std::float32_t the
> added support doesn't seem to be too large, on x86_64:
> readelf -Ws libstdc++.so.6.0.31 | grep float16_t
>    912: 00000000000ae824   950 FUNC    GLOBAL DEFAULT   13 _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format@@GLIBCXX_3.4.31
>   5767: 00000000000ae4a1   899 FUNC    GLOBAL DEFAULT   13 _ZSt20__to_chars_float16_tPcS_fSt12chars_format@@GLIBCXX_3.4.31
>    842: 000000000016d430   106 FUNC    LOCAL  DEFAULT   13 _ZN12_GLOBAL__N_113get_ieee_reprINS_23floating_type_float16_tEEENS_6ieee_tIT_EES3_
>    865: 0000000000170980  1613 FUNC    LOCAL  DEFAULT   13 _ZSt23__floating_to_chars_hexIN12_GLOBAL__N_123floating_type_float16_tEESt15to_chars_resultPcS3_T_St8optionalIiE.constprop.0.isra.0
>   7205: 00000000000ae824   950 FUNC    GLOBAL DEFAULT   13 _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format
>   7985: 00000000000ae4a1   899 FUNC    GLOBAL DEFAULT   13 _ZSt20__to_chars_float16_tPcS_fSt12chars_format
> so 3568 code bytes together or so.
>
> Tested with the attached test (which doesn't prove the shortest
> representation, just prints std::{,b}float16_t and std::float32_t
> shortest strings side by side, then tries to verify it can be
> emitted even into the exact sized range and can't be into range
> one smaller than that and tries to read what is printed
> back using from_chars float32_t overload (so there could be
> double rounding, but apparently there is none for the shortest strings).
> The only differences printed are for NaNs, where sNaNs are canonicalized
> to canonical qNaNs and as to_chars doesn't print NaN mantissa, even qNaNs
> other than the canonical one are read back just as the canonical NaN.
>
> Also attaching what Patrick wrote to generate the pow10_adjustment_tab,
> for std::float16_t only 1.0, 10.0, 100.0, 1000.0 and 10000.0 are powers
> of 10 in the range because __FLT16_MAX__ is 65504.0, and all of the above
> are exactly representable in std::float16_t, so we want to use 0 in
> pow10_adjustment_tab.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>

Unless I misunderstood something in Patrick's review, this is good and
can be incrementally improved.

OK for trunk, thanks.



* Re: [PATCH] libstdc++: Shortest denormal hex std::to_chars
  2022-11-01 12:18   ` [PATCH] libstdc++: Shortest denormal hex std::to_chars Jakub Jelinek
@ 2022-11-01 12:24     ` Jonathan Wakely
  2022-11-01 13:46       ` Patrick Palka
  0 siblings, 1 reply; 7+ messages in thread
From: Jonathan Wakely @ 2022-11-01 12:24 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Patrick Palka, gcc-patches, libstdc++

On Tue, 1 Nov 2022 at 12:18, Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Fri, Oct 28, 2022 at 12:52:44PM -0400, Patrick Palka wrote:
> > > The following patch on top of
> > > https://gcc.gnu.org/pipermail/libstdc++/2022-October/054849.html
> > > adds std::{,b}float16_t support for std::to_chars.
> > > When precision is specified (or for std::bfloat16_t for hex mode even if not),
> > > I believe we can just use the std::to_chars float (when float is mode
> > > compatible with std::float32_t) overloads, both formats are proper subsets
> > > of std::float32_t.
> > > Unfortunately when precision is not specified and we are supposed to emit
> > > shortest string, the std::{,b}float16_t strings are usually much shorter.
> > > E.g. 1.e7p-14f16 shortest fixed representation is
> > > 0.0001161 and shortest scientific representation is
> > > 1.161e-04 while 1.e7p-14f32 (same number promoted to std::float32_t)
> > > 0.00011610985 and
> > > 1.1610985e-04.
> > > Similarly for 1.38p-112bf16,
> > > 0.000000000000000000000000000000000235
> > > 2.35e-34 vs. 1.38p-112f32
> > > 0.00000000000000000000000000000000023472271
> > > 2.3472271e-34
> > > For std::float16_t there are differences even in the shortest hex, say:
> > > 0.01p-14 vs. 1p-22
> > > but only for denormal std::float16_t values (where all std::float16_t
> > > denormals converted to std::float32_t are normal), __FLT16_MIN__ and
> > > everything larger in absolute value than that is the same.  Unless
> > > that is a bug and we should try to discover shorter representations
> > > even for denormals...
> >
> > IIRC for hex formatting of denormals I opted to be consistent with how
> > glibc printf formats them, instead of outputting the truly shortest
> > form.
> >
> > I wouldn't be against using the float32 overloads even for shortest hex
> > formatting of float16.  The output is shorter but equivalent so it
> > shouldn't cause any problems.
>
> The following patch changes the behavior of the shortest hex denormals,
> such that they are printed like normals (so for has_implicit_leading_bit
> with 1p-149 instead of 0.000002p-126 etc., otherwise (Intel extended)
> with the leading digit before dot being [89abcdef]).  I think for all the
> supported format it is never longer, it can be equal length e.g. for
> 0.fffffep-126 vs. 1.fffffcp-127 but fortunately no largest subnormal
> in any format has the unbiased exponent like -9, -99, -999, -9999 because
> then it would be longer and often it is shorter, sometimes much shorter.
>
> For the cases with precision it keeps the handling as is.
>
> While for !has_implicit_leading_bit we for normals or with this patch
> even denormals have really shortest representation, for other formats
> we sometimes do not, but this patch doesn't deal with that (we
> always use 1.NNN while we could use 1.NNN up to f.NNN and by that shortening
> by the last hexit if the last hexit doesn't have least significant bit set
> and unbiased exponent is not -9, -99, -999 or -9999.
>
> Tested on x86_64-linux (on top of the 3 to/from_chars {,b}float16_t
> patches).

This looks good to me. Please give Patrick a chance to comment, but
it's approved for trunk unless he objects. Thanks!


>
> 2022-11-01  Jakub Jelinek  <jakub@redhat.com>
>
>         * src/c++17/floating_to_chars.cc (__floating_to_chars_hex): Drop const
>         from unbiased_exponent.  Canonicalize denormals such that they have
>         the leading bit set by shifting effective mantissa up and decreasing
>         unbiased_exponent.
>         (__floating_to_chars_shortest): Don't instantiate
>         __floating_to_chars_hex for float16_t either and use float instead.
>         * testsuite/20_util/to_chars/float.cc (float_to_chars_test_cases):
>         Adjust testcases for shortest hex denormals.
>         * testsuite/20_util/to_chars/double.cc (double_to_chars_test_cases):
>         Likewise.
>
> --- libstdc++-v3/src/c++17/floating_to_chars.cc.jj      2022-10-31 22:20:35.881121902 +0100
> +++ libstdc++-v3/src/c++17/floating_to_chars.cc 2022-11-01 12:16:14.352652455 +0100
> @@ -844,9 +844,9 @@ template<typename T>
>      const bool is_normal_number = (biased_exponent != 0);
>
>      // Calculate the unbiased exponent.
> -    const int32_t unbiased_exponent = (is_normal_number
> -                                      ? biased_exponent - exponent_bias
> -                                      : 1 - exponent_bias);
> +    int32_t unbiased_exponent = (is_normal_number
> +                                ? biased_exponent - exponent_bias
> +                                : 1 - exponent_bias);
>
>      // Shift the mantissa so that its bitwidth is a multiple of 4.
>      constexpr unsigned rounded_mantissa_bits = (mantissa_bits + 3) / 4 * 4;
> @@ -863,6 +863,16 @@ template<typename T>
>           __glibcxx_assert(effective_mantissa & (mantissa_t{1} << (mantissa_bits
>                                                                    - 1u)));
>        }
> +    else if (!precision.has_value() && effective_mantissa)
> +      {
> +       // 1.8p-23 is shorter than 0.00cp-14, so if precision is
> +       // omitted, try to canonicalize denormals such that they
> +       // have the leading bit set.
> +       int width = __bit_width(effective_mantissa);
> +       int shift = rounded_mantissa_bits - width + has_implicit_leading_bit;
> +       unbiased_exponent -= shift;
> +       effective_mantissa <<= shift;
> +      }
>
>      // Compute the shortest precision needed to print this value exactly,
>      // disregarding trailing zeros.
> @@ -1061,7 +1071,10 @@ template<typename T>
>         // std::bfloat16_t has the same exponent range as std::float32_t
>         // and so we can avoid instantiation of __floating_to_chars_hex
>         // for bfloat16_t.  Shortest hex will be the same as for float.
> -       if constexpr (is_same_v<T, floating_type_bfloat16_t>)
> +       // When we print shortest form even for denormals, we can do it
> +       // for std::float16_t as well.
> +       if constexpr (is_same_v<T, floating_type_float16_t>
> +                     || is_same_v<T, floating_type_bfloat16_t>)
>           return __floating_to_chars_hex(first, last, value.x, nullopt);
>         else
>           return __floating_to_chars_hex(first, last, value, nullopt);
> --- libstdc++-v3/testsuite/20_util/to_chars/float.cc.jj 2022-01-11 22:31:41.605755528 +0100
> +++ libstdc++-v3/testsuite/20_util/to_chars/float.cc    2022-11-01 12:34:21.370882443 +0100
> @@ -521,8 +521,8 @@ inline constexpr float_to_chars_testcase
>
>      // Test hexfloat corner cases.
>      {0x1.728p+0f, chars_format::hex, "1.728p+0"}, // instead of "2.e5p-1"
> -    {0x0.000002p-126f, chars_format::hex, "0.000002p-126"}, // instead of "1p-149", min subnormal
> -    {0x0.fffffep-126f, chars_format::hex, "0.fffffep-126"}, // max subnormal
> +    {0x0.000002p-126f, chars_format::hex, "1p-149"}, // min subnormal
> +    {0x0.fffffep-126f, chars_format::hex, "1.fffffcp-127"}, // max subnormal
>      {0x1p-126f, chars_format::hex, "1p-126"}, // min normal
>      {0x1.fffffep+127f, chars_format::hex, "1.fffffep+127"}, // max normal
>
> --- libstdc++-v3/testsuite/20_util/to_chars/double.cc.jj        2022-01-11 22:31:41.604755542 +0100
> +++ libstdc++-v3/testsuite/20_util/to_chars/double.cc   2022-11-01 12:42:39.753112522 +0100
> @@ -2821,8 +2821,8 @@ inline constexpr double_to_chars_testcas
>
>      // Test hexfloat corner cases.
>      {0x1.728p+0, chars_format::hex, "1.728p+0"}, // instead of "2.e5p-1"
> -    {0x0.0000000000001p-1022, chars_format::hex, "0.0000000000001p-1022"}, // instead of "1p-1074", min subnormal
> -    {0x0.fffffffffffffp-1022, chars_format::hex, "0.fffffffffffffp-1022"}, // max subnormal
> +    {0x0.0000000000001p-1022, chars_format::hex, "1p-1074"}, // min subnormal
> +    {0x0.fffffffffffffp-1022, chars_format::hex, "1.ffffffffffffep-1023"}, // max subnormal
>      {0x1p-1022, chars_format::hex, "1p-1022"}, // min normal
>      {0x1.fffffffffffffp+1023, chars_format::hex, "1.fffffffffffffp+1023"}, // max normal
>
>
>         Jakub
>



* Re: [PATCH] libstdc++: Shortest denormal hex std::to_chars
  2022-11-01 12:24     ` Jonathan Wakely
@ 2022-11-01 13:46       ` Patrick Palka
  0 siblings, 0 replies; 7+ messages in thread
From: Patrick Palka @ 2022-11-01 13:46 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Jakub Jelinek, Patrick Palka, gcc-patches, libstdc++

On Tue, 1 Nov 2022, Jonathan Wakely wrote:

> On Tue, 1 Nov 2022 at 12:18, Jakub Jelinek <jakub@redhat.com> wrote:
> >
> > On Fri, Oct 28, 2022 at 12:52:44PM -0400, Patrick Palka wrote:
> > > > The following patch on top of
> > > > https://gcc.gnu.org/pipermail/libstdc++/2022-October/054849.html
> > > > adds std::{,b}float16_t support for std::to_chars.
> > > > When precision is specified (or for std::bfloat16_t for hex mode even if not),
> > > > I believe we can just use the std::to_chars float (when float is mode
> > > > compatible with std::float32_t) overloads, both formats are proper subsets
> > > > of std::float32_t.
> > > > Unfortunately when precision is not specified and we are supposed to emit
> > > > shortest string, the std::{,b}float16_t strings are usually much shorter.
> > > > E.g. 1.e7p-14f16 shortest fixed representation is
> > > > 0.0001161 and shortest scientific representation is
> > > > 1.161e-04 while 1.e7p-14f32 (same number promoted to std::float32_t)
> > > > 0.00011610985 and
> > > > 1.1610985e-04.
> > > > Similarly for 1.38p-112bf16,
> > > > 0.000000000000000000000000000000000235
> > > > 2.35e-34 vs. 1.38p-112f32
> > > > 0.00000000000000000000000000000000023472271
> > > > 2.3472271e-34
> > > > For std::float16_t there are differences even in the shortest hex, say:
> > > > 0.01p-14 vs. 1p-22
> > > > but only for denormal std::float16_t values (where all std::float16_t
> > > > denormals converted to std::float32_t are normal), __FLT16_MIN__ and
> > > > everything larger in absolute value than that is the same.  Unless
> > > > that is a bug and we should try to discover shorter representations
> > > > even for denormals...
> > >
> > > IIRC for hex formatting of denormals I opted to be consistent with how
> > > glibc printf formats them, instead of outputting the truly shortest
> > > form.
> > >
> > > I wouldn't be against using the float32 overloads even for shortest hex
> > > formatting of float16.  The output is shorter but equivalent so it
> > > shouldn't cause any problems.
> >
> > The following patch changes the behavior of the shortest hex denormals,
> > such that they are printed like normals (so for has_implicit_leading_bit
> > with 1p-149 instead of 0.000002p-126 etc., otherwise (Intel extended)
> > with the leading digit before the dot being [89abcdef]).  I think for all
> > the supported formats it is never longer.  It can be of equal length, e.g.
> > for 0.fffffep-126 vs. 1.fffffcp-127, but fortunately no largest subnormal
> > in any format has an unbiased exponent like -9, -99, -999 or -9999, because
> > then it would be longer.  Often it is shorter, sometimes much shorter.
> >
> > For the cases with precision it keeps the handling as is.
> >
> > For !has_implicit_leading_bit we have the truly shortest representation
> > for normals and, with this patch, even for denormals.  For other formats
> > we sometimes do not, but this patch doesn't deal with that (we always
> > use 1.NNN while we could use anything from 1.NNN up to f.NNN and thereby
> > drop the last hexit, if the last hexit doesn't have its least significant
> > bit set and the unbiased exponent is not -9, -99, -999 or -9999).
> >
> > Tested on x86_64-linux (on top of the 3 to/from_chars {,b}float16_t
> > patches).
> 
> This looks good to me. Please give Patrick a chance to comment, but
> it's approved for trunk unless he objects. Thanks!

LGTM.  This'll mean the output of to_chars(denormal, hex, precision)
will no longer be based on the shortest form to_chars(denormal, hex),
which slightly bothers me, but it doesn't seem to be nonconforming either.
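
To compare the two call forms mentioned above on a given toolchain, a small
sketch along these lines may help.  The helper names to_hex and from_hex are
hypothetical and not part of the patch; only std::to_chars and std::strtof
are real library calls.  Which strings come out depends on the library
version, so no expected output is shown; the only property relied on here is
that both forms parse back to the same value.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <system_error>

// Hypothetical wrapper: hex-format a float with std::to_chars.
// A negative precision argument selects the shortest-form overload.
std::string to_hex(float value, int precision = -1)
{
  char buf[64];
  std::to_chars_result r = precision < 0
    ? std::to_chars(buf, buf + sizeof buf, value, std::chars_format::hex)
    : std::to_chars(buf, buf + sizeof buf, value, std::chars_format::hex,
                    precision);
  assert(r.ec == std::errc{}); // 64 bytes suffice for small precisions
  return std::string(buf, r.ptr);
}

// Hypothetical inverse: std::to_chars omits the "0x" prefix that
// std::strtof needs for hex input, so add it after an optional sign.
float from_hex(std::string s)
{
  const std::size_t pos = (!s.empty() && s[0] == '-') ? 1 : 0;
  s.insert(pos, "0x");
  return std::strtof(s.c_str(), nullptr);
}
```

Calling to_hex(0x1p-149f) and to_hex(0x1p-149f, 6) side by side shows
whether the precision form still starts from the 0.NNNNNNp-126 layout.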

> 
> 
> >
> > 2022-11-01  Jakub Jelinek  <jakub@redhat.com>
> >
> >         * src/c++17/floating_to_chars.cc (__floating_to_chars_hex): Drop const
> >         from unbiased_exponent.  Canonicalize denormals such that they have
> >         the leading bit set by shifting effective mantissa up and decreasing
> >         unbiased_exponent.
> >         (__floating_to_chars_shortest): Don't instantiate
> >         __floating_to_chars_hex for float16_t either and use float instead.
> >         * testsuite/20_util/to_chars/float.cc (float_to_chars_test_cases):
> >         Adjust testcases for shortest hex denormals.
> >         * testsuite/20_util/to_chars/double.cc (double_to_chars_test_cases):
> >         Likewise.
> >
> > --- libstdc++-v3/src/c++17/floating_to_chars.cc.jj      2022-10-31 22:20:35.881121902 +0100
> > +++ libstdc++-v3/src/c++17/floating_to_chars.cc 2022-11-01 12:16:14.352652455 +0100
> > @@ -844,9 +844,9 @@ template<typename T>
> >      const bool is_normal_number = (biased_exponent != 0);
> >
> >      // Calculate the unbiased exponent.
> > -    const int32_t unbiased_exponent = (is_normal_number
> > -                                      ? biased_exponent - exponent_bias
> > -                                      : 1 - exponent_bias);
> > +    int32_t unbiased_exponent = (is_normal_number
> > +                                ? biased_exponent - exponent_bias
> > +                                : 1 - exponent_bias);
> >
> >      // Shift the mantissa so that its bitwidth is a multiple of 4.
> >      constexpr unsigned rounded_mantissa_bits = (mantissa_bits + 3) / 4 * 4;
> > @@ -863,6 +863,16 @@ template<typename T>
> >           __glibcxx_assert(effective_mantissa & (mantissa_t{1} << (mantissa_bits
> >                                                                    - 1u)));
> >        }
> > +    else if (!precision.has_value() && effective_mantissa)
> > +      {
> > +       // 1.8p-23 is shorter than 0.00cp-14, so if precision is
> > +       // omitted, try to canonicalize denormals such that they
> > +       // have the leading bit set.
> > +       int width = __bit_width(effective_mantissa);
> > +       int shift = rounded_mantissa_bits - width + has_implicit_leading_bit;
> > +       unbiased_exponent -= shift;
> > +       effective_mantissa <<= shift;
> > +      }
> >
> >      // Compute the shortest precision needed to print this value exactly,
> >      // disregarding trailing zeros.
> > @@ -1061,7 +1071,10 @@ template<typename T>
> >         // std::bfloat16_t has the same exponent range as std::float32_t
> >         // and so we can avoid instantiation of __floating_to_chars_hex
> >         // for bfloat16_t.  Shortest hex will be the same as for float.
> > -       if constexpr (is_same_v<T, floating_type_bfloat16_t>)
> > +       // When we print shortest form even for denormals, we can do it
> > +       // for std::float16_t as well.
> > +       if constexpr (is_same_v<T, floating_type_float16_t>
> > +                     || is_same_v<T, floating_type_bfloat16_t>)
> >           return __floating_to_chars_hex(first, last, value.x, nullopt);
> >         else
> >           return __floating_to_chars_hex(first, last, value, nullopt);
> > --- libstdc++-v3/testsuite/20_util/to_chars/float.cc.jj 2022-01-11 22:31:41.605755528 +0100
> > +++ libstdc++-v3/testsuite/20_util/to_chars/float.cc    2022-11-01 12:34:21.370882443 +0100
> > @@ -521,8 +521,8 @@ inline constexpr float_to_chars_testcase
> >
> >      // Test hexfloat corner cases.
> >      {0x1.728p+0f, chars_format::hex, "1.728p+0"}, // instead of "2.e5p-1"
> > -    {0x0.000002p-126f, chars_format::hex, "0.000002p-126"}, // instead of "1p-149", min subnormal
> > -    {0x0.fffffep-126f, chars_format::hex, "0.fffffep-126"}, // max subnormal
> > +    {0x0.000002p-126f, chars_format::hex, "1p-149"}, // min subnormal
> > +    {0x0.fffffep-126f, chars_format::hex, "1.fffffcp-127"}, // max subnormal
> >      {0x1p-126f, chars_format::hex, "1p-126"}, // min normal
> >      {0x1.fffffep+127f, chars_format::hex, "1.fffffep+127"}, // max normal
> >
> > --- libstdc++-v3/testsuite/20_util/to_chars/double.cc.jj        2022-01-11 22:31:41.604755542 +0100
> > +++ libstdc++-v3/testsuite/20_util/to_chars/double.cc   2022-11-01 12:42:39.753112522 +0100
> > @@ -2821,8 +2821,8 @@ inline constexpr double_to_chars_testcas
> >
> >      // Test hexfloat corner cases.
> >      {0x1.728p+0, chars_format::hex, "1.728p+0"}, // instead of "2.e5p-1"
> > -    {0x0.0000000000001p-1022, chars_format::hex, "0.0000000000001p-1022"}, // instead of "1p-1074", min subnormal
> > -    {0x0.fffffffffffffp-1022, chars_format::hex, "0.fffffffffffffp-1022"}, // max subnormal
> > +    {0x0.0000000000001p-1022, chars_format::hex, "1p-1074"}, // min subnormal
> > +    {0x0.fffffffffffffp-1022, chars_format::hex, "1.ffffffffffffep-1023"}, // max subnormal
> >      {0x1p-1022, chars_format::hex, "1p-1022"}, // min normal
> >      {0x1.fffffffffffffp+1023, chars_format::hex, "1.fffffffffffffp+1023"}, // max normal
> >
> >
> >         Jakub
> >
> 
> 
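
The denormal canonicalization in the quoted patch can be tried out in
isolation.  What follows is a minimal sketch for IEEE binary32 only, not the
libstdc++ implementation: the function name shortest_hex_float is
hypothetical, the variable names merely mirror those in the patch, and
infinities and NaNs are assumed not to occur.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper, not the libstdc++ code: shortest hex form of a
// finite IEEE binary32 value, with subnormals normalised as in the patch.
std::string shortest_hex_float(float value)
{
  static_assert(sizeof(float) == 4, "sketch assumes 32-bit IEEE float");
  const int mantissa_bits = 23;
  const int exponent_bias = 127;
  const int rounded_mantissa_bits = (mantissa_bits + 3) / 4 * 4; // 24
  const int has_implicit_leading_bit = 1;

  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof bits);
  const std::uint32_t biased_exponent = (bits >> mantissa_bits) & 0xffu;
  assert(biased_exponent != 0xffu); // infinities and NaNs are out of scope

  // Shift the stored mantissa so that it fills whole hexits.
  std::uint32_t effective_mantissa
    = (bits & 0x7fffffu) << (rounded_mantissa_bits - mantissa_bits);
  int unbiased_exponent = (biased_exponent != 0
                           ? int(biased_exponent) - exponent_bias
                           : 1 - exponent_bias);

  if (biased_exponent != 0)
    effective_mantissa |= std::uint32_t(1) << rounded_mantissa_bits;
  else if (effective_mantissa != 0)
    {
      // Subnormal: move the highest set bit up to the leading position
      // and lower the exponent by the same amount.
      int width = 0;
      for (std::uint32_t m = effective_mantissa; m != 0; m >>= 1)
        ++width;
      const int shift
        = rounded_mantissa_bits - width + has_implicit_leading_bit;
      unbiased_exponent -= shift;
      effective_mantissa <<= shift;
    }
  else
    unbiased_exponent = 0; // this sketch prints zero as "0p+0"

  // Six fractional hexits, then drop trailing zeros.
  char fraction[7];
  std::snprintf(fraction, sizeof fraction, "%06x",
                unsigned(effective_mantissa & 0xffffffu));
  std::string digits(fraction);
  while (!digits.empty() && digits.back() == '0')
    digits.pop_back();

  std::string out = (bits >> 31) ? "-" : "";
  out += char('0' + (effective_mantissa >> rounded_mantissa_bits));
  if (!digits.empty())
    out += "." + digits;
  out += unbiased_exponent < 0 ? "p-" : "p+";
  out += std::to_string(unbiased_exponent < 0 ? -unbiased_exponent
                                              : unbiased_exponent);
  return out;
}
```

For the float corner cases in the adjusted testsuite, this sketch yields
"1p-149" for the minimum subnormal and "1.fffffcp-127" for the maximum
subnormal, while the normal cases keep their 1.NNN form.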



Thread overview: 7+ messages
2022-10-27  7:59 [PATCH] libstdc++: std::to_chars std::{,b}float16_t support Jakub Jelinek
2022-10-28 16:52 ` Patrick Palka
2022-10-28 17:16   ` Jakub Jelinek
2022-11-01 12:18   ` [PATCH] libstdc++: Shortest denormal hex std::to_chars Jakub Jelinek
2022-11-01 12:24     ` Jonathan Wakely
2022-11-01 13:46       ` Patrick Palka
2022-11-01 12:22 ` [PATCH] libstdc++: std::to_chars std::{,b}float16_t support Jonathan Wakely
