From: Jonathan Wakely <jwakely@redhat.com>
To: cassio.neri@gmail.com
Cc: "Koning, Paul" <Paul.Koning@dell.com>,
libstdc++ <libstdc++@gcc.gnu.org>,
Gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] libstdc++: More efficient std::chrono::year::leap.
Date: Wed, 23 Jun 2021 12:45:12 +0100 [thread overview]
Message-ID: <YNMeyECLtYj3OpmP@redhat.com> (raw)
In-Reply-To: <CAOfgUPiBscyfyhT_Mtx4mvWLm+ay1=OuG-J8tM=TjDuGDi27=A@mail.gmail.com>
On 21/05/21 19:44 +0100, Cassio Neri via Libstdc++ wrote:
>I've checked the generated code and the compiler doesn't figure out
>the logic. I added a comment to explain.
>
>(Revised patch below and attached.)
>
>Best wishes,
>Cassio.
>
>---
>
>Simple change to std::chrono::year::is_leap. If a year is multiple of 100,
>then it's divisible by 400 if and only if it's divisible by 16. The latter
>allows for better code generation.
>
>Tested on x86_64-pc-linux-gnu.
>
>libstdc++-v3/ChangeLog:
>libstdc++-v3/ChangeLog:
>
> * include/std/chrono:
>
>diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
>index 4631a727d73..85aa0379432 100644
>--- a/libstdc++-v3/include/std/chrono
>+++ b/libstdc++-v3/include/std/chrono
>@@ -1612,7 +1612,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> constexpr uint32_t __offset = __max_dividend / 2 / 100 * 100;
> const bool __is_multiple_of_100
> = __multiplier * (_M_y + __offset) < __bound;
>- return (!__is_multiple_of_100 || _M_y % 400 == 0) && _M_y % 4 == 0;
>+ // Usually we test _M_y % 400 == 0 but, when it's already known that
>+ // _M_y%100 == 0, then _M_y % 400==0 is equivalent to _M_y % 16 == 0.
^^
N.B. this comment should say !=
>+ return (!__is_multiple_of_100 || _M_y % 16 == 0) && _M_y % 4 == 0;
If y % 16 == 0 then y % 4 == 0 too. So we could write that as:
return (!__is_multiple_of_100 && _M_y % 4 == 0) || _M_y % 16 == 0;
This seems to perform even better over a wide range of inputs, can you
confirm that result with your own tests?
However, my microbenchmark also shows that the simplistic code using
y%100 often performs even better than the current code calculating
__is_multiple_of_100 to avoid the modulus operation. So maybe my tests
are bad.
My rearranged expression above is equivalent to:
return _M_y % (__is_multiple_of_100 ? 16 : 4) == 0;
which can be written without branches:
return _M_y % (4 << (2 * __is_multiple_of_100)) == 0;
However, both Clang and GCC already remove the branch for (x ? 16 : 4)
and the conditional expression produces slightly smaller code with GCC
(see https://gcc.gnu.org/PR101179 regarding that). But neither of
these seems to improve compared to my first rearrangement above.
next prev parent reply other threads:[~2021-06-23 11:45 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-21 17:46 Cassio Neri
2021-05-21 18:05 ` Koning, Paul
2021-05-21 18:44 ` Cassio Neri
2021-06-23 11:45 ` Jonathan Wakely [this message]
2021-06-23 13:16 ` Jonathan Wakely
2021-06-23 17:51 ` Jonathan Wakely
2021-06-23 17:52 ` Jonathan Wakely
2021-06-24 11:42 ` Cassio Neri
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YNMeyECLtYj3OpmP@redhat.com \
--to=jwakely@redhat.com \
--cc=Paul.Koning@dell.com \
--cc=cassio.neri@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=libstdc++@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).