public inbox for libstdc++-cvs@sourceware.org
* [gcc r11-7371] libstdc++: More efficient is_leap
@ 2021-02-24 18:52 Jonathan Wakely
From: Jonathan Wakely @ 2021-02-24 18:52 UTC
  To: gcc-cvs, libstdc++-cvs

https://gcc.gnu.org/g:126793971bee0e92bea237823bdc51a594951faa

commit r11-7371-g126793971bee0e92bea237823bdc51a594951faa
Author: Cassio Neri <cassio.neri@gmail.com>
Date:   Wed Feb 24 17:37:36 2021 +0000

    libstdc++: More efficient is_leap
    
    This patch reimplements std::chrono::year::is_leap().  The leap year check
    is ubiquitously implemented (including here, before this patch) as:
    
        y % 4 == 0 && (y % 100 != 0 || y % 400 == 0).
    
    The rationale is that testing divisibility by 4 first allows an early return
    in 75% of cases, avoiding the needless calculation of y % 100 and y % 400.
    Although this is true, it does not take into account the cost of branching.
    This patch, instead, tests divisibility by 100 first:
    
        (y % 100 != 0 || y % 400 == 0) && y % 4 == 0.
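
Reordering the operands of `&&` changes only the evaluation order, not the result. A minimal sketch that checks this over the range of std::chrono::year follows; the function names are hypothetical and are not part of the patch.

```cpp
#include <cassert>

// Conventional ordering: divisibility by 4 is tested first.
constexpr bool is_leap_by_4_first(int y) noexcept
{ return y % 4 == 0 && (y % 100 != 0 || y % 400 == 0); }

// Ordering described in the commit message: divisibility by 100 first.
constexpr bool is_leap_by_100_first(int y) noexcept
{ return (y % 100 != 0 || y % 400 == 0) && y % 4 == 0; }

// Hypothetical helper: asserts that both orderings agree on every year
// in [-32767, 32767], the range quoted in the commit message.
inline void check_orderings_agree()
{
  for (int y = -32767; y <= 32767; ++y)
    assert(is_leap_by_4_first(y) == is_leap_by_100_first(y));
}
```

Calling check_orderings_agree() from main() runs the comparison. It says nothing about speed, which is the point the measurements address.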
    
    It is certainly counterintuitive that this could be more efficient, since among
    the three divisibility tests (4, 100 and 400) the one by 100 is the only one
    that can never provide a definitive answer: a second divisibility test (by 4
    or 400) is always required.  However, measurements [1] on x86_64 suggest this is
    3x more efficient!  A possible explanation is that checking divisibility by 100
    first splits the execution path with probabilities of (1%, 99%) rather than
    (25%, 75%) when divisibility by 4 is checked first.  This decreases the entropy
    of the branching distribution, which seems to help branch prediction.
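
To put rough numbers on the entropy argument, the sketch below computes the Shannon entropy of a two-way split for the two probabilities quoted above. It assumes the first branch is the only one that matters and that years are uniformly distributed, which is a simplification. The helper names are hypothetical, and the commit message offers the entropy argument only as a possible explanation.

```cpp
#include <cassert>
#include <cmath>

// Shannon entropy, in bits, of a two-way branch taken with probability p.
// Assumes 0 < p < 1.
inline double binary_entropy(double p)
{
  return -p * std::log2(p) - (1.0 - p) * std::log2(1.0 - p);
}

// Hypothetical helper: compares the two splits quoted in the commit message.
inline void check_branch_entropy()
{
  const double by_4_first   = binary_entropy(0.25); // roughly 0.81 bits
  const double by_100_first = binary_entropy(0.01); // roughly 0.08 bits
  assert(by_100_first < by_4_first);
}
```

Under these assumptions the (1%, 99%) split carries about a tenth of the entropy of the (25%, 75%) split, which is consistent with the first branch being much easier to predict.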
    
    Given that y belongs to [-32767, 32767] [time.cal.year.members], a more
    efficient algorithm [2] to check divisibility by 100 is used (instead of
    y % 100 != 0).  Measurements suggest that this optimization improves performance
    by 20%.
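
The divisibility test can be sketched outside the library as follows. The constants are copied from the patch; the free-function form and the names is_multiple_of_100 and check_against_modulo are assumptions made for illustration. The idea is that 100 * 42949673 = 2^32 + 4, so for a non-negative n = 100q + r the 32-bit product is 4q + 42949673 * r, which stays below the bound only when r is 0.

```cpp
#include <cassert>
#include <cstdint>

// Multiply-and-compare test for divisibility by 100.
// Valid for y in [-536870800, 536870999], per the comment in the patch.
constexpr bool is_multiple_of_100(int y) noexcept
{
  constexpr std::uint32_t multiplier   = 42949673;   // ceil(2^32 / 100)
  constexpr std::uint32_t bound        = 42949669;
  constexpr std::uint32_t max_dividend = 1073741799;
  // Multiple of 100 that shifts the supported range to start at zero.
  constexpr std::uint32_t offset       = max_dividend / 2 / 100 * 100;
  // Unsigned arithmetic wraps modulo 2^32, which the technique relies on.
  return multiplier * (static_cast<std::uint32_t>(y) + offset) < bound;
}

// Hypothetical helper: compares against y % 100 == 0 over the year range.
inline void check_against_modulo()
{
  for (int y = -32767; y <= 32767; ++y)
    assert(is_multiple_of_100(y) == (y % 100 == 0));
}
```

Because the offset is itself a multiple of 100, adding it does not change divisibility; it only moves negative years into the non-negative range where the multiplication trick applies.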
    
    The patch adds a test that exhaustively compares the result of this
    implementation with the ubiquitous one for all y in [-32767, 32767].  Despite
    being exhaustive, the test completes in a matter of seconds.
    
    References:
    [1] https://stackoverflow.com/a/60646967/1137388
    [2] https://accu.org/journals/overload/28/155/overload155.pdf#page=16
    
    libstdc++-v3/ChangeLog:
    
            * include/std/chrono (year::is_leap): New implementation.
            * testsuite/std/time/year/2.cc: New test.

Diff:
---
 libstdc++-v3/include/std/chrono           | 21 ++++++++++++-
 libstdc++-v3/testsuite/std/time/year/2.cc | 52 +++++++++++++++++++++++++++++++
 2 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index b03167863cd..3ba35a5bc86 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -1597,7 +1597,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
       constexpr bool
       is_leap() const noexcept
-      { return _M_y % 4 == 0 && (_M_y % 100 != 0 || _M_y % 400 == 0); }
+      {
+	// Testing divisibility by 100 first gives better performance, that is,
+	// return (_M_y % 100 != 0 || _M_y % 400 == 0) && _M_y % 4 == 0;
+
+	// It gets even faster if _M_y is in [-536870800, 536870999]
+	// (which is the case here) and _M_y % 100 is replaced by
+	// __is_multiple_of_100 below.
+
+	// References:
+	// [1] https://github.com/cassioneri/calendar
+	// [2] https://accu.org/journals/overload/28/155/overload155.pdf#page=16
+
+	constexpr uint32_t __multiplier   = 42949673;
+	constexpr uint32_t __bound        = 42949669;
+	constexpr uint32_t __max_dividend = 1073741799;
+	constexpr uint32_t __offset       = __max_dividend / 2 / 100 * 100;
+	const bool __is_multiple_of_100
+	  = __multiplier * (_M_y + __offset) < __bound;
+	return (!__is_multiple_of_100 || _M_y % 400 == 0) && _M_y % 4 == 0;
+      }
 
       explicit constexpr
       operator int() const noexcept
diff --git a/libstdc++-v3/testsuite/std/time/year/2.cc b/libstdc++-v3/testsuite/std/time/year/2.cc
new file mode 100644
index 00000000000..57fab24d647
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/time/year/2.cc
@@ -0,0 +1,52 @@
+// { dg-options "-std=gnu++2a" }
+// { dg-do run { target c++2a } }
+
+// Copyright (C) 2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// Class year [time.cal.year]
+
+#include <chrono>
+#include <testsuite_hooks.h>
+
+// Slow but clear test for leap year.
+constexpr bool
+is_leap_year(const std::chrono::year& y) noexcept
+{
+  const int n = static_cast<int>(y);
+  return n % 4 == 0 && (n % 100 != 0 || n % 400 == 0);
+}
+
+void test01()
+{
+  using namespace std::chrono;
+
+  year y{-32767};
+  while (y < year{32767}) {
+    VERIFY( y.is_leap() ==  is_leap_year(y) );
+    ++y;
+  }
+
+  // One more for y = 32767.
+  VERIFY( year{32767}.is_leap() == is_leap_year(year{32767}) );
+}
+
+int main()
+{
+  test01();
+  return 0;
+}

