public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] libstdc++: Make chrono::hh_mm_ss more compact
@ 2022-11-21 20:43 Jonathan Wakely
  2022-12-06 21:40 ` Jonathan Wakely
  2022-12-12  7:10 ` Stephan Bergmann
  0 siblings, 2 replies; 6+ messages in thread
From: Jonathan Wakely @ 2022-11-21 20:43 UTC (permalink / raw)
  To: libstdc++, gcc-patches; +Cc: Patrick Palka

While finishing the time zone support for C++20, I noticed that the
chrono::hh_mm_ss type is surprisingly large: 40 bytes. That's because
we use duration<int64_t, ...> for each of the four members, _M_h, _M_m,
_M_s and _M_ss. This is very wasteful. The patch below reduces it to 16
bytes (or less for some targets) for most specializations using an
integral type for the hh_mm_ss<D>::precision::rep type.

Patrick, you added hh_mm_ss, but I assume you just implemented what is
shown for the exposition-only members in the standard, rather than
intentionally choosing this 40-byte representation? I don't recall
discussing it at the time, but I might be forgetting something.

Does the patch below make sense? I could remove the partial
specialization that use a single byte for the subseconds, or only do
that on 32-bit targets where alignof(int64_t) is 4. For targets where
alignof(int64_t) is 8 we're always going to be 16 bytes minimum, so
maybe there's no point squeezing everything into 12 bytes if we have a
mandatory 4 bytes of padding.

We could make those members use even less space, something like:

hours _M_h;
unsigned _M_m : 6;
unsigned _M_s : 6;
unsigned _M_is_neg : 1;
unsigned _M_ss : 51;  // good enough for femtoseconds

This would only require 16 bytes even for hh_mm_ss<femtoseconds>, but
requires extra instructions to read the values. That seems like the
wrong trade off. A compact layout for hh_mm_ss<system_clock::duration>
and hh_mm_ss<high_resolution_clock::duration> seems good enough. Paying
extra instructions to access the members just to support incredibly high
precision doesn't make sense to me.

We could also elide the bool _M_is_neg member if _Duration::rep is
unsigned, but it doesn't seem worth the metaprogramming/maintenance
cost.

We could consider using duration<uint_least32_t, ratio<3600>> for the
_M_h member instead of chrono::hours (which uses int64_t). It seems
unlikely (but not impossible) anybody will want to use hh_mm_ss for
billions of hours past midnight. If we do that, then using a single byte
for subseconds (when possible) does make sense, because the whole
hh_mm_ss would fit in just 8 bytes (4 for hours, 1 for minutes, 1 for
seconds, 1 for is_neg, and 1 for the subseconds). And then bit-fields
start to look more appealing. Maybe just for the subseconds and is_neg
flag, which would be 8 bytes for precision::period >= ratio<1,10'000>:

uint_least32_t _M_h;
uint_least8_t  _M_m;
uint_least8_t  _M_s;
uint_least16_t _M_ss : 15;
uint_least16_t _M_is_neg : 1;

Has anybody got any other suggestions for optimizing this or other
chrono types? e.g. I think we should also consider using int32_t as the
rep for chrono::{days,weeks,years} because a 64-bit type for years is
unnecessary.

It looks like other implementations don't bother trying to save space
for chrono::hh_mm_ss, which makes me wonder if there's some reason not
to do this. For MSVC it's only 32 bytes, which I think is because they
use a 32-bit type for chrono::hours and chrono::minutes, not because
they're doing anything special to make chrono::hh_mm_ss smaller.


-- >8 --

This uses a single byte for the minutes and seconds members, and places
the bool member next to those single bytes. This means we do not need 40
bytes to store a time that can fit in a single 8-byte integer.

When there is no subsecond precision we can do away with the _M_ss
member altogether. If the subsecond precision is coarse enough, we can
use a smaller representation for _M_ss, e.g. hh_mm_ss<milliseconds> only
needs uint_least32_t for _M_ss, and hh_mm_ss<duration<long, ratio<1,10>>
and hh_mm_ss<duration<int8_t, nano>> each only needs a single byte. In
the latter case the type can only ever represent up to 255ns anyway, so
we don't need a larger representation type (in such cases, we could even
remove the _M_h, _M_m and _M_s members, but it's a very unlikely
scenario that isn't worth optimizing for).

Except for specializations with a floating-point rep or using higher
precision than nanoseconds, hh_mm_ss should now fit in 16 bytes, or even
12 bytes for x86-32 where alignof(long long) == 4.

libstdc++-v3/ChangeLog:

	* include/std/chrono (chrono::hh_mm_ss): Do not use 64-bit
	representations for all four duration members. Reorder members.
	(hh_mm_ss::_S_fractional_width()): Optimize using countr_zero.
	(hh_mm_ss::hh_mm_ss()): Define as defaulted.
	(hh_mm_ss::hh_mm_ss(Duration)): Delegate to a new private
	constructor, instead of calling chrono::abs repeatedly.
	* testsuite/std/time/hh_mm_ss/1.cc: Check floating-point
	representations. Check default constructor. Check sizes.
---
 libstdc++-v3/include/std/chrono               | 139 +++++++++++++-----
 libstdc++-v3/testsuite/std/time/hh_mm_ss/1.cc |  54 ++++++-
 2 files changed, 158 insertions(+), 35 deletions(-)

diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index 2468023f6c5..4ee163ba762 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -41,6 +41,7 @@
 #include <bits/chrono.h>
 
 #if __cplusplus >= 202002L
+# include <bit>
 # include <sstream>
 # include <string>
 # include <vector>
@@ -2262,6 +2263,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
     // HH_MM_SS
 
+    /// @cond undocumented
     namespace __detail
     {
       consteval long long
@@ -2273,22 +2275,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	return __r;
       }
     }
+    /// @endcond
 
+    /// A duration representing a time since midnight (00h:00min:00s).
     template<typename _Duration>
       class hh_mm_ss
       {
+	static_assert( __is_duration<_Duration>::value );
+
       private:
-	static constexpr int
+	static consteval int
 	_S_fractional_width()
 	{
-	  int __multiplicity_2 = 0;
-	  int __multiplicity_5 = 0;
 	  auto __den = _Duration::period::den;
-	  while ((__den % 2) == 0)
-	    {
-	      ++__multiplicity_2;
-	      __den /= 2;
-	    }
+	  const int __multiplicity_2 = std::__countr_zero((uintmax_t)__den);
+	  __den >>= __multiplicity_2;
+	  int __multiplicity_5 = 0;
 	  while ((__den % 5) == 0)
 	    {
 	      ++__multiplicity_5;
@@ -2304,6 +2306,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  return __width;
 	}
 
+	constexpr
+	hh_mm_ss(_Duration __d, bool __is_neg) noexcept
+	: _M_h (duration_cast<chrono::hours>(__d)),
+	  _M_m (duration_cast<chrono::minutes>(__d - hours())),
+	  _M_s (duration_cast<chrono::seconds>(__d - hours() - minutes())),
+	  _M_is_neg(__is_neg)
+	{
+	  auto __ss = __d - hours() - minutes() - seconds();
+	  if constexpr (treat_as_floating_point_v<typename precision::rep>)
+	    _M_ss._M_r = __ss.count();
+	  else if constexpr (precision::period::den != 1)
+	    _M_ss._M_r = duration_cast<precision>(__ss).count();
+	}
+
       public:
 	static constexpr unsigned fractional_width = {_S_fractional_width()};
 
@@ -2312,28 +2328,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 				   chrono::seconds::rep>,
 		     ratio<1, __detail::__pow10(fractional_width)>>;
 
-	constexpr
-	hh_mm_ss() noexcept
-	: hh_mm_ss{_Duration::zero()}
-	{ }
+	constexpr hh_mm_ss() noexcept = default;
 
 	constexpr explicit
 	hh_mm_ss(_Duration __d) noexcept
-	: _M_is_neg (__d < _Duration::zero()),
-	  _M_h (duration_cast<chrono::hours>(abs(__d))),
-	  _M_m (duration_cast<chrono::minutes>(abs(__d) - hours())),
-	  _M_s (duration_cast<chrono::seconds>(abs(__d) - hours() - minutes()))
-	{
-	  if constexpr (treat_as_floating_point_v<typename precision::rep>)
-	    _M_ss = abs(__d) - hours() - minutes() - seconds();
-	  else
-	    _M_ss = duration_cast<precision>(abs(__d) - hours()
-					     - minutes() - seconds());
-	}
+	: hh_mm_ss(chrono::abs(__d), __d < _Duration::zero())
+	{ }
 
 	constexpr bool
 	is_negative() const noexcept
-	{ return _M_is_neg; }
+	{
+	  if constexpr (!__is_unsigned)
+	    return _M_is_neg;
+	  else
+	    return false;
+	}
 
 	constexpr chrono::hours
 	hours() const noexcept
@@ -2349,7 +2358,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 	constexpr precision
 	subseconds() const noexcept
-	{ return _M_ss; }
+	{ return static_cast<precision>(_M_ss); }
 
 	constexpr explicit
 	operator precision() const noexcept
@@ -2358,20 +2367,82 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	constexpr precision
 	to_duration() const noexcept
 	{
-	  if (_M_is_neg)
-	    return -(_M_h + _M_m + _M_s + _M_ss);
-	  else
-	    return _M_h + _M_m + _M_s + _M_ss;
+	  if constexpr (!__is_unsigned)
+	    if (_M_is_neg)
+	      return -(_M_h + _M_m + _M_s + subseconds());
+	  return _M_h + _M_m + _M_s + subseconds();
 	}
 
 	// TODO: Implement operator<<.
 
       private:
-	bool _M_is_neg;
-	chrono::hours _M_h;
-	chrono::minutes _M_m;
-	chrono::seconds _M_s;
-	precision _M_ss;
+	static constexpr bool __is_unsigned
+	  = __and_v<is_integral<typename _Duration::rep>,
+		    is_unsigned<typename _Duration::rep>>;
+
+	template<typename _Ratio>
+	  using __byte_duration = duration<unsigned char, _Ratio>;
+
+	// The type of the _M_ss member that holds the subsecond precision.
+	template<typename _Dur>
+	  struct __subseconds
+	  {
+	    typename _Dur::rep _M_r{};
+
+	    constexpr explicit
+	    operator _Dur() const noexcept
+	    { return _Dur(_M_r); }
+	  };
+
+	// An empty class if this precision doesn't need subseconds.
+	template<typename _Rep, typename _Period>
+	  requires (!treat_as_floating_point_v<_Rep>) && (_Period::den == 1)
+	  struct __subseconds<duration<_Rep, _Period>>
+	  {
+	    constexpr explicit
+	    operator duration<_Rep, _Period>() const noexcept
+	    { return {}; }
+	  };
+
+	// True if the maximum constructor argument can be represented in _Tp.
+	template<typename _Tp>
+	  static constexpr bool __fits
+	    = duration_values<typename _Duration::rep>::max()
+		<= duration_values<_Tp>::max();
+
+	template<typename _Rep, typename _Period>
+	  requires (!treat_as_floating_point_v<_Rep>)
+	    && ratio_less_v<_Period, ratio<1, 1>>
+	    && (ratio_greater_equal_v<_Period, ratio<1, 250>>
+		  || __fits<unsigned char>)
+	  struct __subseconds<duration<_Rep, _Period>>
+	  {
+	    unsigned char _M_r{};
+
+	    constexpr explicit
+	    operator duration<_Rep, _Period>() const noexcept
+	    { return duration<_Rep, _Period>(_M_r); }
+	  };
+
+	template<typename _Rep, typename _Period>
+	  requires (!treat_as_floating_point_v<_Rep>)
+	    && ratio_less_v<_Period, ratio<1, 250>>
+	    && (ratio_greater_equal_v<_Period, ratio<1, 4'000'000'000>>
+		  || __fits<uint_least32_t>)
+	  struct __subseconds<duration<_Rep, _Period>>
+	  {
+	    uint_least32_t _M_r{};
+
+	    constexpr explicit
+	    operator duration<_Rep, _Period>() const noexcept
+	    { return duration<_Rep, _Period>(_M_r); }
+	  };
+
+	chrono::hours		    _M_h{};
+	__byte_duration<ratio<60>>  _M_m{};
+	__byte_duration<ratio<1>>   _M_s{};
+	bool			    _M_is_neg{};
+	__subseconds<precision>	    _M_ss{};
       };
 
     // 12/24 HOURS FUNCTIONS
diff --git a/libstdc++-v3/testsuite/std/time/hh_mm_ss/1.cc b/libstdc++-v3/testsuite/std/time/hh_mm_ss/1.cc
index 31dd1b2a8f3..1c63276a5f0 100644
--- a/libstdc++-v3/testsuite/std/time/hh_mm_ss/1.cc
+++ b/libstdc++-v3/testsuite/std/time/hh_mm_ss/1.cc
@@ -59,5 +59,57 @@ constexpr_hh_mm_ss()
 
   static_assert(seconds{hh_mm_ss{100min}} == 100min);
 
-  // TODO: treat_as_floating_point_v
+  // treat_as_floating_point_v
+  using fseconds = duration<double, ratio<1>>;
+  constexpr hh_mm_ss<fseconds> fsec{0x123.0004p5s};
+  static_assert(std::is_same_v<hh_mm_ss<fseconds>::precision, fseconds>);
+  static_assert(fsec.hours() == 2h);
+  static_assert(fsec.minutes() == 35min);
+  static_assert(fsec.seconds() == 12s);
+  static_assert(fsec.subseconds() == 0x.0004p5s);
+  static_assert(!fsec.is_negative());
+
+  using fminutes = duration<double, ratio<60>>;
+  constexpr hh_mm_ss<fminutes> fmin{-0x1.23p4min};
+  static_assert(std::is_same_v<hh_mm_ss<fminutes>::precision, fseconds>);
+  static_assert(fmin.hours() == 0h);
+  static_assert(fmin.minutes() == 18min);
+  static_assert(fmin.seconds() == 11s);
+  static_assert(fmin.subseconds() == 0.25s);
+  static_assert(fmin.is_negative());
+}
+
+constexpr void
+default_construction()
+{
+  using namespace std::chrono;
+
+  constexpr hh_mm_ss<seconds> s1;
+  static_assert(s1.to_duration() == s1.to_duration().zero());
+  constexpr hh_mm_ss<duration<char>> s2;
+  static_assert(s2.to_duration() == s2.to_duration().zero());
+  constexpr hh_mm_ss<duration<int, std::centi>> s3;
+  static_assert(s3.to_duration() == s3.to_duration().zero());
+  constexpr hh_mm_ss<duration<long long, std::femto>> s4;
+  static_assert(s4.to_duration() == s4.to_duration().zero());
+  constexpr hh_mm_ss<duration<double>> s5;
+  static_assert(s5.to_duration() == s5.to_duration().zero());
+}
+
+constexpr void
+size()
+{
+  using namespace std::chrono;
+
+  struct S0 { long long h; char m; char s; bool neg; };
+  static_assert(sizeof(hh_mm_ss<seconds>) == sizeof(S0));
+  struct S1 { long long h; char m; char s; bool neg; char ss; };
+  static_assert(sizeof(hh_mm_ss<duration<int, std::centi>>) == sizeof(S1));
+  struct S2 { long long h; char m, s; bool neg; int ss; };
+  static_assert(sizeof(hh_mm_ss<duration<int, std::milli>>) == sizeof(S2));
+  static_assert(sizeof(hh_mm_ss<duration<int, std::pico>>) == sizeof(S2));
+  struct S3 { long long h; char m, s; bool neg; long long ss; };
+  static_assert(sizeof(hh_mm_ss<duration<long long, std::pico>>) == sizeof(S3));
+  struct S4 { long long h; char m, s; bool neg; double ss; };
+  static_assert(sizeof(hh_mm_ss<duration<double, std::micro>>) == sizeof(S4));
 }
-- 
2.38.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-12-12 16:30 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-21 20:43 [PATCH] libstdc++: Make chrono::hh_mm_ss more compact Jonathan Wakely
2022-12-06 21:40 ` Jonathan Wakely
2022-12-12  7:10 ` Stephan Bergmann
2022-12-12 11:18   ` Jonathan Wakely
2022-12-12 14:06     ` Jonathan Wakely
2022-12-12 16:30     ` Stephan Bergmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).