public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/107052] New: Range of __builtin_popcount can be improved with nonzerobits
@ 2022-09-27 15:18 pinskia at gcc dot gnu.org
  2022-09-27 15:18 ` [Bug tree-optimization/107052] " pinskia at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-09-27 15:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107052

            Bug ID: 107052
           Summary: Range of __builtin_popcount can be improved with
                    nonzerobits
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take:
```
void link_failure();
void f(int a)
{
    a &= 0x300;
    int b =  __builtin_popcount(a);
    if (b > 3)
        link_failure();
}

```
The if statement should be optimized away, as the only values for popcount here
are 0-3.

* [Bug tree-optimization/107052] Range of __builtin_popcount can be improved with nonzerobits
  2022-09-27 15:18 [Bug tree-optimization/107052] New: Range of __builtin_popcount can be improved with nonzerobits pinskia at gcc dot gnu.org
@ 2022-09-27 15:18 ` pinskia at gcc dot gnu.org
  2022-09-27 15:24 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-09-27 15:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107052

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
                 CC|                            |aldyh at gcc dot gnu.org

* [Bug tree-optimization/107052] Range of __builtin_popcount can be improved with nonzerobits
  2022-09-27 15:18 [Bug tree-optimization/107052] New: Range of __builtin_popcount can be improved with nonzerobits pinskia at gcc dot gnu.org
  2022-09-27 15:18 ` [Bug tree-optimization/107052] " pinskia at gcc dot gnu.org
@ 2022-09-27 15:24 ` pinskia at gcc dot gnu.org
  2022-09-27 15:33 ` aldyh at gcc dot gnu.org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-09-27 15:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107052

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note this testcase is optimized by clang/llvm.

* [Bug tree-optimization/107052] Range of __builtin_popcount can be improved with nonzerobits
  2022-09-27 15:18 [Bug tree-optimization/107052] New: Range of __builtin_popcount can be improved with nonzerobits pinskia at gcc dot gnu.org
  2022-09-27 15:18 ` [Bug tree-optimization/107052] " pinskia at gcc dot gnu.org
  2022-09-27 15:24 ` pinskia at gcc dot gnu.org
@ 2022-09-27 15:33 ` aldyh at gcc dot gnu.org
  2022-09-27 15:36 ` pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-09-27 15:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107052

Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-09-27
                 CC|                            |amacleod at redhat dot com
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #2 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
Don't you mean the only values for popcount are 0-2?  I mean, there are only
two bits that could be 1 with a mask of 0x300.  Or am I missing something?

Either way, your check is for b > 3, and we should be able to fold that away.
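
The 0-2 bound can be checked by brute force.  What follows is a minimal sketch,
not GCC code: max_popcount_under_mask is a hypothetical helper that walks every
value whose set bits are confined to the mask and returns the largest popcount
seen.

```c
/* Hypothetical helper, for illustration only: largest popcount of any
   value whose set bits are confined to MASK.  */
static int
max_popcount_under_mask (unsigned int mask)
{
  int best = 0;

  /* Standard submask enumeration: visits every subset of MASK's bits,
     ending with the empty subset (0).  */
  for (unsigned int sub = mask;; sub = (sub - 1) & mask)
    {
      int n = __builtin_popcount (sub);
      if (n > best)
        best = n;
      if (sub == 0)
        break;
    }
  return best;
}
```

For a mask of 0x300 the loop visits 0x300, 0x200, 0x100 and 0, so the result is
2 and a test for b > 3 can never be true.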

There are two problems here.  First, the cast of a_4 to a.0_1 dropped the
nonzero mask.  I would've expected that a cast to a number of the same precision
would keep the 0x300 mask, but instead we have:

a.0_1 : [irange] unsigned int [0, 768] NONZERO 0x3ff

If we had the 0x300 mask available in cfn_popcount::fold_range(), then we could
fold it.  The second problem is that cfn_popcount does not look at the nonzero
bits at all.
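
To illustrate the difference the nonzero bits make, here is a rough sketch of
two possible upper bounds for popcount.  It is not the code in GCC, and
bound_from_max and bound_from_mask are hypothetical names.  The first bound
looks only at the largest possible value; for 768 it gives 10, which is
consistent with the [0, 10] reported for b_5.  The second looks only at the
nonzero mask.

```c
#include <limits.h>

/* Illustrative only; not GCC's implementation.  */

/* Upper bound from the largest possible value: no value <= MAX can
   have more set bits than MAX has significant bits.  */
static int
bound_from_max (unsigned int max)
{
  if (max == 0)
    return 0;  /* __builtin_clz is undefined for 0.  */
  return (int) (sizeof (unsigned int) * CHAR_BIT) - __builtin_clz (max);
}

/* Upper bound from the nonzero mask: only bits set in the mask can
   ever be 1, so the count cannot exceed the mask's own popcount.  */
static int
bound_from_mask (unsigned int nonzero_mask)
{
  return __builtin_popcount (nonzero_mask);
}
```

With these definitions, bound_from_max (768) is 10, bound_from_mask (0x3ff) is
also 10, and bound_from_mask (0x300) is 2.  Only the last is tight enough to
fold the comparison, which is why both the cast and popcount need to handle the
mask.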

=========== BB 2 ============
Imports: a_3(D)  
Exports: a.0_1  a_3(D)  a_4  b_5  
         a.0_1 : a_3(D)(I)  a_4  
         a_4 : a_3(D)(I)  
         b_5 : a.0_1  a_3(D)(I)  a_4  
a_3(D)  [irange] int VARYING
    <bb 2> :
    a_4 = a_3(D) & 768;
    a.0_1 = (unsigned int) a_4;
    b_5 = __builtin_popcount (a.0_1);
    if (b_5 > 3)
      goto <bb 3>; [INV]
    else
      goto <bb 4>; [INV]

a.0_1 : [irange] unsigned int [0, 768] NONZERO 0x3ff
a_4 : [irange] int [0, 768] NONZERO 0x300
b_5 : [irange] int [0, 10] NONZERO 0xf
2->3  (T) a.0_1 :       [irange] unsigned int [0, 768] NONZERO 0x3ff
2->3  (T) a_4 :         [irange] int [0, 768] NONZERO 0x300
2->3  (T) b_5 :         [irange] int [4, 10] NONZERO 0xf
2->4  (F) a.0_1 :       [irange] unsigned int [0, 768] NONZERO 0x3ff
2->4  (F) a_4 :         [irange] int [0, 768] NONZERO 0x300
2->4  (F) b_5 :         [irange] int [0, 3] NONZERO 0x3

* [Bug tree-optimization/107052] Range of __builtin_popcount can be improved with nonzerobits
  2022-09-27 15:18 [Bug tree-optimization/107052] New: Range of __builtin_popcount can be improved with nonzerobits pinskia at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2022-09-27 15:33 ` aldyh at gcc dot gnu.org
@ 2022-09-27 15:36 ` pinskia at gcc dot gnu.org
  2022-09-27 15:54 ` aldyh at gcc dot gnu.org
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-09-27 15:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107052

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Aldy Hernandez from comment #2)
> Don't you mean the only values for popcount are 0-2?  I mean, there are only
> two bits that could be 1 with a mask of 0x300.  Or am I missing something?

Yes, 0-2; I am still trying to wake up.

* [Bug tree-optimization/107052] Range of __builtin_popcount can be improved with nonzerobits
  2022-09-27 15:18 [Bug tree-optimization/107052] New: Range of __builtin_popcount can be improved with nonzerobits pinskia at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2022-09-27 15:36 ` pinskia at gcc dot gnu.org
@ 2022-09-27 15:54 ` aldyh at gcc dot gnu.org
  2022-09-27 16:22 ` aldyh at gcc dot gnu.org
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-09-27 15:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107052

--- Comment #4 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #3)
> (In reply to Aldy Hernandez from comment #2)
> > Don't you mean the only values for popcount are 0-2?  I mean, there are only
> > two bits that could be 1 with a mask of 0x300.  Or am I missing something?
> 
> Yes, 0-2; I am still trying to wake up.

No worries, just trying to make sure I understand things correctly.

* [Bug tree-optimization/107052] Range of __builtin_popcount can be improved with nonzerobits
  2022-09-27 15:18 [Bug tree-optimization/107052] New: Range of __builtin_popcount can be improved with nonzerobits pinskia at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2022-09-27 15:54 ` aldyh at gcc dot gnu.org
@ 2022-09-27 16:22 ` aldyh at gcc dot gnu.org
  2022-10-05 12:22 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-09-27 16:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107052

--- Comment #5 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
Created attachment 53633
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53633&action=edit
patch in testing

This might do it.

* [Bug tree-optimization/107052] Range of __builtin_popcount can be improved with nonzerobits
  2022-09-27 15:18 [Bug tree-optimization/107052] New: Range of __builtin_popcount can be improved with nonzerobits pinskia at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2022-09-27 16:22 ` aldyh at gcc dot gnu.org
@ 2022-10-05 12:22 ` cvs-commit at gcc dot gnu.org
  2022-10-05 12:22 ` cvs-commit at gcc dot gnu.org
  2022-10-05 12:23 ` aldyh at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-10-05 12:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107052

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Aldy Hernandez <aldyh@gcc.gnu.org>:

https://gcc.gnu.org/g:ae56d600d223e996054483d7d7033ec8e258d39d

commit r13-3085-gae56d600d223e996054483d7d7033ec8e258d39d
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Tue Oct 4 17:03:54 2022 +0200

    [PR tree-optimization/107052] range-ops: Pass nonzero masks through cast.

    Track nonzero masks through a cast in range-ops.

            PR tree-optimization/107052

    gcc/ChangeLog:

            * range-op.cc (operator_cast::fold_range): Set nonzero mask.

* [Bug tree-optimization/107052] Range of __builtin_popcount can be improved with nonzerobits
  2022-09-27 15:18 [Bug tree-optimization/107052] New: Range of __builtin_popcount can be improved with nonzerobits pinskia at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2022-10-05 12:22 ` cvs-commit at gcc dot gnu.org
@ 2022-10-05 12:22 ` cvs-commit at gcc dot gnu.org
  2022-10-05 12:23 ` aldyh at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-10-05 12:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107052

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Aldy Hernandez <aldyh@gcc.gnu.org>:

https://gcc.gnu.org/g:4c451631f722c9939260a5c2fc209802a47e525f

commit r13-3086-g4c451631f722c9939260a5c2fc209802a47e525f
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Tue Oct 4 17:05:10 2022 +0200

    [PR tree-optimization/107052] range-ops: Take into account nonzero mask in popcount.

            PR tree-optimization/107052

    gcc/ChangeLog:

            * gimple-range-op.cc (cfn_popcount::fold_range): Take into account
            nonzero bit mask.

* [Bug tree-optimization/107052] Range of __builtin_popcount can be improved with nonzerobits
  2022-09-27 15:18 [Bug tree-optimization/107052] New: Range of __builtin_popcount can be improved with nonzerobits pinskia at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2022-10-05 12:22 ` cvs-commit at gcc dot gnu.org
@ 2022-10-05 12:23 ` aldyh at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-10-05 12:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107052

Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
fixed
