public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/101822] New: Codegen bug for popcount
@ 2021-08-08 22:59 llvm at rifkin dot dev
  2021-08-08 23:10 ` [Bug tree-optimization/101822] " llvm at rifkin dot dev
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: llvm at rifkin dot dev @ 2021-08-08 22:59 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101822

            Bug ID: 101822
           Summary: Codegen bug for popcount
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: llvm at rifkin dot dev
  Target Milestone: ---

GCC cleverly optimizes the following loop into a popcount intrinsic:

uint32_t foo(uint32_t n) {
    uint32_t count = 0;
    while(n) {
        n &= n - 1;
        count++;
    }
    return count;
}

But the generated assembly is highly redundant (https://godbolt.org/z/nbGb13G5W):

foo(unsigned int):
        xor     eax, eax
        xor     edx, edx
        popcnt  eax, edi
        test    edi, edi
        cmove   eax, edx
        ret

if(n == 0) __builtin_unreachable(); does seem to help the compiler's analysis.
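
For reference, a minimal sketch of where that hint would sit in the loop above.
The name foo_nonzero is made up for illustration, and __builtin_unreachable is
a GCC/Clang extension. With the hint in place, calling the function with
n == 0 is undefined behavior, so this only suits callers that can guarantee a
nonzero argument:

```c
#include <stdint.h>

/* Hypothetical variant of foo() from the report. The caller promises
   n != 0; passing 0 is undefined behavior because of the hint below. */
uint32_t foo_nonzero(uint32_t n) {
    if (n == 0) __builtin_unreachable();  /* tell the optimizer n != 0 */
    uint32_t count = 0;
    while (n) {
        n &= n - 1;  /* clear the lowest set bit */
        count++;
    }
    return count;
}
```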

It seems the compiler does not realize that both the loop and the popcnt
intrinsic are well-defined for n == 0 (both yield 0), so the test/cmove guard
is unnecessary. This is closely related to another bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101821.
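
To make the n == 0 claim concrete, here is a small sketch comparing the loop
with the builtin. The function names are chosen for illustration and are not
from the report; __builtin_popcount is a GCC/Clang extension and, unlike
__builtin_clz and __builtin_ctz, it is defined for a zero argument:

```c
#include <stdint.h>

/* The loop from the report: clears one set bit per iteration.
   For n == 0 the loop body never runs and the result is 0. */
uint32_t popcount_loop(uint32_t n) {
    uint32_t count = 0;
    while (n) {
        n &= n - 1;  /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Direct use of the builtin; it returns 0 for n == 0. */
uint32_t popcount_builtin(uint32_t n) {
    return (uint32_t)__builtin_popcount(n);
}

/* Returns nonzero when both versions agree for the given input. */
int popcounts_agree(uint32_t n) {
    return popcount_loop(n) == popcount_builtin(n);
}
```

Since the two agree at zero as well, a guard on n == 0 around the popcnt
instruction should not be needed for correctness.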

* [Bug tree-optimization/101822] Codegen bug for popcount
From: llvm at rifkin dot dev @ 2021-08-08 23:10 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101822

--- Comment #1 from Jeremy R. <llvm at rifkin dot dev> ---
Never mind, 101821 was invalid and the initial xor eax, eax is by design (still
wondering whether this applies to new CPUs though). There is still a
discrepancy between this code and the code generated for a direct
__builtin_popcount call.

* [Bug tree-optimization/101822] Codegen bug for popcount
From: pinskia at gcc dot gnu.org @ 2021-08-08 23:17 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101822

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2021-08-08
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |ASSIGNED
           Severity|normal                      |enhancement
                 CC|                            |pinskia at gcc dot gnu.org

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Mine. phiopt is not working for this case:
  if (n_4(D) != 0)
    goto <bb 3>; [89.00%]
  else
    goto <bb 4>; [11.00%]

  <bb 3> [local count: 105119324]:
  _9 = __builtin_popcount (n_4(D));
  count_13 = (uint32_t) _9;

  <bb 4> [local count: 118111600]:
  # count_12 = PHI <count_13(3), 0(2)>

The code in phiopt that handles casts is not correct.
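
As a rough C-level rendering of the GIMPLE above (a sketch for readers, not
compiler output; the function names are invented for illustration), the
intermediate form behaves like foo_branch below, while the hoped-for result
behaves like foo_straight. The cast from int to uint32_t sits between the
builtin call and the PHI, which is the part the comment says phiopt does not
handle:

```c
#include <stdint.h>

/* Approximate source-level shape of the GIMPLE shown above. */
uint32_t foo_branch(uint32_t n) {
    uint32_t count = 0;                   /* the 0(2) PHI argument */
    if (n != 0) {                         /* if (n_4(D) != 0) */
        int tmp = __builtin_popcount(n);  /* _9 */
        count = (uint32_t)tmp;            /* count_13, the cast */
    }
    return count;                         /* count_12 = PHI <...> */
}

/* The branch-free form one would hope for, valid because
   __builtin_popcount(0) is 0. */
uint32_t foo_straight(uint32_t n) {
    return (uint32_t)__builtin_popcount(n);
}
```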

* [Bug tree-optimization/101822] Codegen bug for popcount
From: llvm at rifkin dot dev @ 2021-08-09 11:55 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101822

--- Comment #3 from Jeremy R. <llvm at rifkin dot dev> ---
Interestingly, it's optimized correctly at -Os.

* [Bug tree-optimization/101822] Codegen bug for popcount
From: pinskia at gcc dot gnu.org @ 2022-02-02  5:02 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101822

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71016

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The code that was added to fix PR 71016 is causing this. Looks like there needs
to be a better way of doing this.

* [Bug tree-optimization/101822] Codegen bug for popcount
From: pinskia at gcc dot gnu.org @ 2023-10-24 11:20 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101822

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |14.0
             Status|ASSIGNED                    |RESOLVED

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Fixed fully by r14-4889-g0fc13e8c0e39c51e82de.
