public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/94800] New: Failure to optimize yet another popcount idiom
@ 2020-04-27 14:42 gabravier at gmail dot com
2020-04-27 15:27 ` [Bug tree-optimization/94800] " jakub at gcc dot gnu.org
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: gabravier at gmail dot com @ 2020-04-27 14:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94800
Bug ID: 94800
Summary: Failure to optimize yet another popcount idiom
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---
int populationCount(uint32_t x)
{
x = x - ((x >> 1) & 0x55555555);
x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
x = (x + (x >> 4)) & 0x0F0F0F0F;
x += (x << 8);
return ((x + (x << 16)) >> 24);
}
This can be optimized to `__builtin_popcount(x)` (when `sizeof(int) ==
sizeof(uint32_t)`). This transformation is done by LLVM, but not by GCC
Comparison here : https://godbolt.org/z/iz9qJf
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug tree-optimization/94800] Failure to optimize yet another popcount idiom
2020-04-27 14:42 [Bug tree-optimization/94800] New: Failure to optimize yet another popcount idiom gabravier at gmail dot com
@ 2020-04-27 15:27 ` jakub at gcc dot gnu.org
2020-05-04 15:57 ` jakub at gcc dot gnu.org
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-04-27 15:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94800
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC implements (and similarly for 64-bit).
int popcount32c (uint32_t x)
{
x -= (x >> 1) & 0x55555555;
x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
x = (x + (x >> 4)) & 0x0f0f0f0f;
return (x * 0x01010101) >> 24;
}
Guess people are creative in how to write in other ways.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug tree-optimization/94800] Failure to optimize yet another popcount idiom
2020-04-27 14:42 [Bug tree-optimization/94800] New: Failure to optimize yet another popcount idiom gabravier at gmail dot com
2020-04-27 15:27 ` [Bug tree-optimization/94800] " jakub at gcc dot gnu.org
@ 2020-05-04 15:57 ` jakub at gcc dot gnu.org
2020-05-04 17:36 ` jakub at gcc dot gnu.org
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-05-04 15:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94800
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2020-05-04
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Rather than adding yet another popcount pattern, I wonder if we shouldn't just
canonicalize the x + (x << cst) to x * (1 + (1 << cst)) and
(x << cst1) + (x << cst2) to x * ((1 << cst1) + (1 << cst2)), and this PR would
fall naturally out of that. We already have code to emit the best
multiplication expansion and if it is not the fastest on some target, we should
tweak it anyway, because users could write the multiplication by constant.
Testcase to consider:
unsigned long long int
foo (unsigned long long int x)
{
unsigned long long int a = x + (x << 8);
unsigned long long int b = a + (a << 16);
unsigned long long int c = b + (b << 32);
return c;
}
unsigned long long int
bar (unsigned long long int x)
{
return x * 0x0101010101010101ULL;
}
unsigned int
baz (unsigned int x)
{
unsigned int a = x + (x << 8);
unsigned int b = a + (a << 16);
return b;
}
unsigned int
qux (unsigned int x)
{
return x * 0x01010101U;
}
unsigned long long int
quux (unsigned long long int x)
{
unsigned long long int a = (x << 2) + (x << 10);
unsigned long long int b = a + (a << 16);
unsigned long long int c = b + (b << 32);
return c;
}
unsigned long long int
corge (unsigned long long int x)
{
unsigned long long int a = x + (x << 4);
unsigned long long int b = a + (a << 8);
unsigned long long int c = b + (b << 16);
unsigned long long int d = c + (c << 32);
return d;
}
unsigned long long int
corge2 (unsigned long long int x)
{
unsigned long long int a = x * 0x11;
unsigned long long int b = a * 0x101;
unsigned long long int c = b * 0x10001;
unsigned long long int d = c * 0x100000001ULL;
return d;
}
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug tree-optimization/94800] Failure to optimize yet another popcount idiom
2020-04-27 14:42 [Bug tree-optimization/94800] New: Failure to optimize yet another popcount idiom gabravier at gmail dot com
2020-04-27 15:27 ` [Bug tree-optimization/94800] " jakub at gcc dot gnu.org
2020-05-04 15:57 ` jakub at gcc dot gnu.org
@ 2020-05-04 17:36 ` jakub at gcc dot gnu.org
2020-05-05 9:37 ` cvs-commit at gcc dot gnu.org
2020-05-05 9:38 ` jakub at gcc dot gnu.org
4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-05-04 17:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94800
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
Status|NEW |ASSIGNED
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 48443
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48443&action=edit
gcc11-pr94800.patch
Untested fix.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug tree-optimization/94800] Failure to optimize yet another popcount idiom
2020-04-27 14:42 [Bug tree-optimization/94800] New: Failure to optimize yet another popcount idiom gabravier at gmail dot com
` (2 preceding siblings ...)
2020-05-04 17:36 ` jakub at gcc dot gnu.org
@ 2020-05-05 9:37 ` cvs-commit at gcc dot gnu.org
2020-05-05 9:38 ` jakub at gcc dot gnu.org
4 siblings, 0 replies; 6+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-05-05 9:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94800
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:144aee70b80de50f96a97ee64edd2f1c237c4906
commit r11-65-g144aee70b80de50f96a97ee64edd2f1c237c4906
Author: Jakub Jelinek <jakub@redhat.com>
Date: Tue May 5 11:36:47 2020 +0200
match.pd: Canonicalize (x + (x << cst)) into (x * cst2) [PR94800]
The popcount* testcases show yet another creative way to write popcount,
but rather than adjusting the popcount matcher to deal with it, I think
we just should canonicalize those (X + (X << C) to X * (1 + (1 << C))
and (X << C1) + (X << C2) to X * ((1 << C1) + (1 << C2)), because for
multiplication we already have simplification rules that can handle nested
multiplication (X * CST1 * CST2), while the the shifts and adds we have
nothing like that. And user could have written the multiplication anyway,
so if we don't emit the fastest or smallest code for the multiplication by
constant, we should improve that. At least on the testcases seems the
emitted code is reasonable according to cost, except that perhaps we could
in some cases try to improve expansion of vector multiplication by
uniform constant.
2020-05-05 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94800
* match.pd (X + (X << C) to X * (1 + (1 << C)),
(X << C1) + (X << C2) to X * ((1 << C1) + (1 << C2))): New
canonicalizations.
* gcc.dg/tree-ssa/pr94800.c: New test.
* gcc.dg/tree-ssa/popcount5.c: New test.
* gcc.dg/tree-ssa/popcount5l.c: New test.
* gcc.dg/tree-ssa/popcount5ll.c: New test.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug tree-optimization/94800] Failure to optimize yet another popcount idiom
2020-04-27 14:42 [Bug tree-optimization/94800] New: Failure to optimize yet another popcount idiom gabravier at gmail dot com
` (3 preceding siblings ...)
2020-05-05 9:37 ` cvs-commit at gcc dot gnu.org
@ 2020-05-05 9:38 ` jakub at gcc dot gnu.org
4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-05-05 9:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94800
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Should be fixed now for 11+.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2020-05-05 9:38 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-27 14:42 [Bug tree-optimization/94800] New: Failure to optimize yet another popcount idiom gabravier at gmail dot com
2020-04-27 15:27 ` [Bug tree-optimization/94800] " jakub at gcc dot gnu.org
2020-05-04 15:57 ` jakub at gcc dot gnu.org
2020-05-04 17:36 ` jakub at gcc dot gnu.org
2020-05-05 9:37 ` cvs-commit at gcc dot gnu.org
2020-05-05 9:38 ` jakub at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).