public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/115667] New: Improve expansion for popcountti2
@ 2024-06-26 14:36 ktkachov at gcc dot gnu.org
  2024-06-26 14:38 ` [Bug rtl-optimization/115667] " ktkachov at gcc dot gnu.org
  2024-06-26 15:05 ` pinskia at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2024-06-26 14:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115667

            Bug ID: 115667
           Summary: Improve expansion for popcountti2
           Product: gcc
           Version: 13.3.1
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---

This may be aarch64-specific, but for the testcase:
int
cnt (unsigned __int128 a)
{
  return __builtin_popcountg (a);
}

GCC for aarch64 will generate:
cnt:
        fmov    d30, x0
        fmov    d31, x1
        cnt     v30.8b, v30.8b
        cnt     v31.8b, v31.8b
        addv    b30, v30.8b
        addv    b31, v31.8b
        fmov    x1, d30
        fmov    x0, d31
        add     w0, w1, w0
        ret

This effectively does two DImode popcount expansions and adds the results.
Clang generates the more efficient:
cnt:                                    // @cnt
        fmov    d0, x0
        mov     v0.d[1], x1
        cnt     v0.16b, v0.16b
        uaddlv  h0, v0.16b
        fmov    w0, s0
        ret
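
For illustration only, the two strategies can be modelled in portable C. This is a scalar sketch, not what either compiler emits: the helper names (byte_popcount, popcount128_two_halves, popcount128_one_vector) are made up for this example, and it assumes a compiler that provides unsigned __int128 and __builtin_popcountll (GCC and Clang both do). The first function mirrors the two DImode popcounts plus an add, the second mirrors a single 16-byte per-byte count followed by one horizontal sum.

```c
#include <stdint.h>

/* Bit count of one byte: scalar stand-in for a single lane of CNT.
   Hypothetical helper, named for this sketch only. */
static unsigned byte_popcount(uint8_t b)
{
    unsigned n = 0;
    while (b) {
        b &= (uint8_t)(b - 1);  /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Models the GCC sequence: two independent 64-bit popcounts,
   then a scalar add of the two results. */
static int popcount128_two_halves(unsigned __int128 a)
{
    uint64_t lo = (uint64_t)a;
    uint64_t hi = (uint64_t)(a >> 64);
    return __builtin_popcountll(lo) + __builtin_popcountll(hi);
}

/* Models the Clang sequence: per-byte counts over all 16 bytes
   (CNT on v0.16b), then one horizontal sum (UADDLV). */
static int popcount128_one_vector(unsigned __int128 a)
{
    unsigned sum = 0;
    for (int i = 0; i < 16; i++)
        sum += byte_popcount((uint8_t)(a >> (8 * i)));
    return (int)sum;
}
```

Both functions return the same value for every input; the difference in the assembly above is only in how many moves, CNT and reduction instructions are needed to get there.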


* [Bug rtl-optimization/115667] Improve expansion for popcountti2
From: ktkachov at gcc dot gnu.org @ 2024-06-26 14:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115667

--- Comment #1 from ktkachov at gcc dot gnu.org ---
In fact, I'm sure it could even use the proposed new udot approach.


* [Bug rtl-optimization/115667] Improve expansion for popcountti2
From: pinskia at gcc dot gnu.org @ 2024-06-26 15:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115667

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Dup. The analysis of what goes wrong is in the dup. Just adding popcountti is
not enough.  I will be handling it in the coming weeks.

*** This bug has been marked as a duplicate of bug 113042 ***
