public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/98479] New: Missed optimization opportunity for unsigned __int128 modulo
@ 2020-12-30 14:03 dabler at gmail dot com
2020-12-30 14:18 ` [Bug c/98479] " jakub at gcc dot gnu.org
2021-01-05 10:47 ` rguenth at gcc dot gnu.org
0 siblings, 2 replies; 3+ messages in thread
From: dabler at gmail dot com @ 2020-12-30 14:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98479
Bug ID: 98479
Summary: Missed optimization opportunity for unsigned __int128
modulo
Product: gcc
Version: 9.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: dabler at gmail dot com
Target Milestone: ---
I have found that manually calculating the % operator on __int128 is
significantly faster than the built-in compiler operator. I will show you how
to calculate modulo 9, but the method can be used to calculate modulo any other
number.
First, consider the built-in compiler operator:
uint64_t mod9_v1(unsigned __int128 n)
{
return n % 9;
}
Now consider my manual implementation:
uint64_t mod9_v2(unsigned __int128 n)
{
uint64_t r = 0;
r += (uint32_t)(n);
r += (uint32_t)(n >> 32) * (uint64_t)4;
r += (uint32_t)(n >> 64) * (uint64_t)7;
r += (uint32_t)(n >> 96);
return r % 9;
}
Measuring over 100,000,000 random numbers gives the following results:
mod9_v1 | 3.986052 secs
mod9_v2 | 1.814339 secs
GCC 9.3.0 with -march=native -O3 was used on AMD Ryzen Threadripper 2990WX.
Note that 2^32 == 4 (mod 9), 2^64 == 7 (mod 9), etc.
^ permalink raw reply [flat|nested] 3+ messages in thread
* [Bug c/98479] Missed optimization opportunity for unsigned __int128 modulo
2020-12-30 14:03 [Bug c/98479] New: Missed optimization opportunity for unsigned __int128 modulo dabler at gmail dot com
@ 2020-12-30 14:18 ` jakub at gcc dot gnu.org
2021-01-05 10:47 ` rguenth at gcc dot gnu.org
1 sibling, 0 replies; 3+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-12-30 14:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98479
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 11 will calculate this since PR97459 as
uint64_t r = (uint64_t) n & 0xfffffffffffffff;
r += (uint64_t) (n >> 60) & 0xfffffffffffffff;
r += (uint64_t) (n >> 120);
return r % 9;
instead as 2^60 == 1 (mod 9).
^ permalink raw reply [flat|nested] 3+ messages in thread
* [Bug c/98479] Missed optimization opportunity for unsigned __int128 modulo
2020-12-30 14:03 [Bug c/98479] New: Missed optimization opportunity for unsigned __int128 modulo dabler at gmail dot com
2020-12-30 14:18 ` [Bug c/98479] " jakub at gcc dot gnu.org
@ 2021-01-05 10:47 ` rguenth at gcc dot gnu.org
1 sibling, 0 replies; 3+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-01-05 10:47 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98479
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |missed-optimization
Status|UNCONFIRMED |RESOLVED
Resolution|--- |FIXED
Known to work| |11.0
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
So fixed on trunk.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2021-01-05 10:47 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-30 14:03 [Bug c/98479] New: Missed optimization opportunity for unsigned __int128 modulo dabler at gmail dot com
2020-12-30 14:18 ` [Bug c/98479] " jakub at gcc dot gnu.org
2021-01-05 10:47 ` rguenth at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).