public inbox for libstdc++-cvs@sourceware.org
help / color / mirror / Atom feed
* [gcc r11-7371] libstdc++: More efficient is_leap
@ 2021-02-24 18:52 Jonathan Wakely
0 siblings, 0 replies; only message in thread
From: Jonathan Wakely @ 2021-02-24 18:52 UTC (permalink / raw)
To: gcc-cvs, libstdc++-cvs
https://gcc.gnu.org/g:126793971bee0e92bea237823bdc51a594951faa
commit r11-7371-g126793971bee0e92bea237823bdc51a594951faa
Author: Cassio Neri <cassio.neri@gmail.com>
Date: Wed Feb 24 17:37:36 2021 +0000
libstdc++: More efficient is_leap
This patch reimplements std::chrono::year::is_leap(). Leap year check is
ubiquitously implemented (including here) as:
y % 4 == 0 && (y % 100 != 0 || y % 400 == 0).
The rationale being that testing divisibility by 4 first implies an earlier
return for 75% of the cases, therefore, avoiding the needless calculations of
y % 100 and y % 400. Although this fact is true, it does not take into account
the cost of branching. This patch, instead, tests divisibility by 100 first:
(y % 100 != 0 || y % 400 == 0) && y % 4 == 0.
It is certainly counterintuitive that this could be more efficient since among
the three divisibility tests (4, 100 and 400) the one by 100 is the only one
that can never provide a definitive answer and a second divisibility test (by 4
or 400) is always required. However, measurements [1] in x86_64 suggest this is
3x more efficient! A possible explanation is that checking divisibility by 100
first implies a split in the execution path with probabilities of (1%, 99%)
rather than (25%, 75%) when divisibility by 4 is checked first. This decreases
the entropy of the branching distribution which seems to help prediction.
Given that y belongs to [-32767, 32767] [time.cal.year.members], a more
efficient algorithm [2] to check divisibility by 100 is used (instead of
y % 100 != 0). Measurements suggest that this optimization improves performance
by 20%.
The patch adds a test that exhaustively compares the result of this
implementation with the ubiquitous one for all y in [-32767, 32767]. Although
its completeness, the test completes in a matter of seconds.
References:
[1] https://stackoverflow.com/a/60646967/1137388
[2] https://accu.org/journals/overload/28/155/overload155.pdf#page=16
libstdc++-v3/ChangeLog:
* include/std/chrono (year::is_leap): New implementation.
* testsuite/std/time/year/2.cc: New test.
Diff:
---
libstdc++-v3/include/std/chrono | 21 ++++++++++++-
libstdc++-v3/testsuite/std/time/year/2.cc | 52 +++++++++++++++++++++++++++++++
2 files changed, 72 insertions(+), 1 deletion(-)
diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index b03167863cd..3ba35a5bc86 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -1597,7 +1597,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
constexpr bool
is_leap() const noexcept
- { return _M_y % 4 == 0 && (_M_y % 100 != 0 || _M_y % 400 == 0); }
+ {
+ // Testing divisibility by 100 first gives better performance, that is,
+ // return (_M_y % 100 != 0 || _M_y % 400 == 0) && _M_y % 4 == 0;
+
+ // It gets even faster if _M_y is in [-536870800, 536870999]
+ // (which is the case here) and _M_y % 100 is replaced by
+ // __is_multiple_of_100 below.
+
+ // References:
+ // [1] https://github.com/cassioneri/calendar
+ // [2] https://accu.org/journals/overload/28/155/overload155.pdf#page=16
+
+ constexpr uint32_t __multiplier = 42949673;
+ constexpr uint32_t __bound = 42949669;
+ constexpr uint32_t __max_dividend = 1073741799;
+ constexpr uint32_t __offset = __max_dividend / 2 / 100 * 100;
+ const bool __is_multiple_of_100
+ = __multiplier * (_M_y + __offset) < __bound;
+ return (!__is_multiple_of_100 || _M_y % 400 == 0) && _M_y % 4 == 0;
+ }
explicit constexpr
operator int() const noexcept
diff --git a/libstdc++-v3/testsuite/std/time/year/2.cc b/libstdc++-v3/testsuite/std/time/year/2.cc
new file mode 100644
index 00000000000..57fab24d647
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/time/year/2.cc
@@ -0,0 +1,52 @@
+// { dg-options "-std=gnu++2a" }
+// { dg-do run { target c++2a } }
+
+// Copyright (C) 2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3. If not see
+// <http://www.gnu.org/licenses/>.
+
+// Class year [time.cal.year_month_day]
+
+#include <chrono>
+#include <testsuite_hooks.h>
+
+// Slow but clear test for leap year.
+constexpr bool
+is_leap_year(const std::chrono::year& y) noexcept
+{
+ const int n = static_cast<int>(y);
+ return n % 4 == 0 && (n % 100 != 0 || n % 400 == 0);
+}
+
+void test01()
+{
+ using namespace std::chrono;
+
+ year y{-32767};
+ while (y < year{32767}) {
+ VERIFY( y.is_leap() == is_leap_year(y) );
+ ++y;
+ }
+
+ // One more for y = 32767.
+ VERIFY( year{32767}.is_leap() == is_leap_year(year{32767}) );
+}
+
+int main()
+{
+ test01();
+ return 0;
+}
^ permalink raw reply [flat|nested] only message in thread
only message in thread, other threads:[~2021-02-24 18:52 UTC | newest]
Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-24 18:52 [gcc r11-7371] libstdc++: More efficient is_leap Jonathan Wakely
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).