public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/94800] New: Failure to optimize yet another popcount idiom
@ 2020-04-27 14:42 gabravier at gmail dot com
  2020-04-27 15:27 ` [Bug tree-optimization/94800] " jakub at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: gabravier at gmail dot com @ 2020-04-27 14:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94800

            Bug ID: 94800
           Summary: Failure to optimize yet another popcount idiom
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

int populationCount(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555);
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = (x + (x >> 4)) & 0x0F0F0F0F;
    x += (x << 8);
    return ((x + (x << 16)) >> 24);
}

This can be optimized to `__builtin_popcount(x)` (when `sizeof(int) ==
sizeof(uint32_t)`). This transformation is done by LLVM, but not by GCC

Comparison here : https://godbolt.org/z/iz9qJf

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/94800] Failure to optimize yet another popcount idiom
  2020-04-27 14:42 [Bug tree-optimization/94800] New: Failure to optimize yet another popcount idiom gabravier at gmail dot com
@ 2020-04-27 15:27 ` jakub at gcc dot gnu.org
  2020-05-04 15:57 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-04-27 15:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94800

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC implements (and similarly for 64-bit).
   int popcount32c (uint32_t x)
   {
     x -= (x >> 1) & 0x55555555;
     x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
     x = (x + (x >> 4)) & 0x0f0f0f0f;
     return (x * 0x01010101) >> 24;
   }
Guess people are creative in how to write in other ways.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/94800] Failure to optimize yet another popcount idiom
  2020-04-27 14:42 [Bug tree-optimization/94800] New: Failure to optimize yet another popcount idiom gabravier at gmail dot com
  2020-04-27 15:27 ` [Bug tree-optimization/94800] " jakub at gcc dot gnu.org
@ 2020-05-04 15:57 ` jakub at gcc dot gnu.org
  2020-05-04 17:36 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-05-04 15:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94800

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2020-05-04
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Rather than adding yet another popcount pattern, I wonder if we shouldn't just
canonicalize the x + (x << cst) to x * (1 + (1 << cst)) and
(x << cst1) + (x << cst2) to x * ((1 << cst1) + (1 << cst2)), and this PR would
fall naturally out of that.  We already have code to emit the best
multiplication expansion and if it is not the fastest on some target, we should
tweak it anyway, because users could write the multiplication by constant.

Testcase to consider:
unsigned long long int
foo (unsigned long long int x)
{
  unsigned long long int a = x + (x << 8);
  unsigned long long int b = a + (a << 16);
  unsigned long long int c = b + (b << 32);
  return c;
}

unsigned long long int
bar (unsigned long long int x)
{
  return x * 0x0101010101010101ULL;
}

unsigned int
baz (unsigned int x)
{
  unsigned int a = x + (x << 8);
  unsigned int b = a + (a << 16);
  return b;
}

unsigned int
qux (unsigned int x)
{
  return x * 0x01010101U;
}

unsigned long long int
quux (unsigned long long int x)
{
  unsigned long long int a = (x << 2) + (x << 10);
  unsigned long long int b = a + (a << 16);
  unsigned long long int c = b + (b << 32);
  return c;
}

unsigned long long int
corge (unsigned long long int x)
{
  unsigned long long int a = x + (x << 4);
  unsigned long long int b = a + (a << 8);
  unsigned long long int c = b + (b << 16);
  unsigned long long int d = c + (c << 32);
  return d;
}

unsigned long long int
corge2 (unsigned long long int x)
{
  unsigned long long int a = x * 0x11;
  unsigned long long int b = a * 0x101;
  unsigned long long int c = b * 0x10001;
  unsigned long long int d = c * 0x100000001ULL;
  return d;
}

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/94800] Failure to optimize yet another popcount idiom
  2020-04-27 14:42 [Bug tree-optimization/94800] New: Failure to optimize yet another popcount idiom gabravier at gmail dot com
  2020-04-27 15:27 ` [Bug tree-optimization/94800] " jakub at gcc dot gnu.org
  2020-05-04 15:57 ` jakub at gcc dot gnu.org
@ 2020-05-04 17:36 ` jakub at gcc dot gnu.org
  2020-05-05  9:37 ` cvs-commit at gcc dot gnu.org
  2020-05-05  9:38 ` jakub at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-05-04 17:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94800

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 48443
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48443&action=edit
gcc11-pr94800.patch

Untested fix.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/94800] Failure to optimize yet another popcount idiom
  2020-04-27 14:42 [Bug tree-optimization/94800] New: Failure to optimize yet another popcount idiom gabravier at gmail dot com
                   ` (2 preceding siblings ...)
  2020-05-04 17:36 ` jakub at gcc dot gnu.org
@ 2020-05-05  9:37 ` cvs-commit at gcc dot gnu.org
  2020-05-05  9:38 ` jakub at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-05-05  9:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94800

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:144aee70b80de50f96a97ee64edd2f1c237c4906

commit r11-65-g144aee70b80de50f96a97ee64edd2f1c237c4906
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue May 5 11:36:47 2020 +0200

    match.pd: Canonicalize (x + (x << cst)) into (x * cst2) [PR94800]

    The popcount* testcases show yet another creative way to write popcount,
    but rather than adjusting the popcount matcher to deal with it, I think
    we just should canonicalize those (X + (X << C) to X * (1 + (1 << C))
    and (X << C1) + (X << C2) to X * ((1 << C1) + (1 << C2)), because for
    multiplication we already have simplification rules that can handle nested
    multiplication (X * CST1 * CST2), while the the shifts and adds we have
    nothing like that.  And user could have written the multiplication anyway,
    so if we don't emit the fastest or smallest code for the multiplication by
    constant, we should improve that.  At least on the testcases seems the
    emitted code is reasonable according to cost, except that perhaps we could
    in some cases try to improve expansion of vector multiplication by
    uniform constant.

    2020-05-05  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/94800
            * match.pd (X + (X << C) to X * (1 + (1 << C)),
            (X << C1) + (X << C2) to X * ((1 << C1) + (1 << C2))): New
            canonicalizations.

            * gcc.dg/tree-ssa/pr94800.c: New test.
            * gcc.dg/tree-ssa/popcount5.c: New test.
            * gcc.dg/tree-ssa/popcount5l.c: New test.
            * gcc.dg/tree-ssa/popcount5ll.c: New test.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/94800] Failure to optimize yet another popcount idiom
  2020-04-27 14:42 [Bug tree-optimization/94800] New: Failure to optimize yet another popcount idiom gabravier at gmail dot com
                   ` (3 preceding siblings ...)
  2020-05-05  9:37 ` cvs-commit at gcc dot gnu.org
@ 2020-05-05  9:38 ` jakub at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-05-05  9:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94800

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Should be fixed now for 11+.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-05-05  9:38 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-27 14:42 [Bug tree-optimization/94800] New: Failure to optimize yet another popcount idiom gabravier at gmail dot com
2020-04-27 15:27 ` [Bug tree-optimization/94800] " jakub at gcc dot gnu.org
2020-05-04 15:57 ` jakub at gcc dot gnu.org
2020-05-04 17:36 ` jakub at gcc dot gnu.org
2020-05-05  9:37 ` cvs-commit at gcc dot gnu.org
2020-05-05  9:38 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).