* [Bug tree-optimization/104376] Failure to optimize clz equivalent to clz
2022-02-04 1:10 [Bug tree-optimization/104376] New: Failure to optimize clz equivalent to clz gabravier at gmail dot com
@ 2022-02-04 4:26 ` pinskia at gcc dot gnu.org
2022-02-04 4:27 ` pinskia at gcc dot gnu.org
` (8 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-02-04 4:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104376
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
Last reconfirmed| |2022-02-04
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
Target| |aarch64-*-* x86_64-*-*
| |(with -mlzcnt)
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
There are two issues, both at the tree level, though the second one is
handled just fine at the RTL level.
Right now we have:
_1 = __builtin_clz (x_5(D));
_2 = 31 - _1;
_3 = _2 ^ 31;
But _3 can be optimized to just _1: __builtin_clz returns a value in [0, 31],
and for such values (31 - _1) ^ 31 == _1.
* [Bug tree-optimization/104376] Failure to optimize clz equivalent to clz
From: pinskia at gcc dot gnu.org @ 2022-02-04 4:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104376
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The second issue can be seen with:
#include <stdint.h>

uint32_t countLeadingZeros32(uint32_t x)
{
    if (x == 0)
        return 32;
    return (__builtin_clz(x));
}
This gets optimized for aarch64 at the RTL level but not for x86_64 with
-mlzcnt.
* [Bug tree-optimization/104376] Failure to optimize clz equivalent to clz
From: pinskia at gcc dot gnu.org @ 2022-02-04 7:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104376
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Depends on| |104378
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Filed PR 104378 for the (31 - x) ^ 31 issue.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104378
[Bug 104378] (N - x) ^ N should be optimized to x if x <= N (unsigned) and N is
a pow2 - 1
* [Bug tree-optimization/104376] Failure to optimize clz equivalent to clz
From: pinskia at gcc dot gnu.org @ 2022-02-09 6:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104376
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
> The second issue can be seen with:
> #include <stdint.h>
>
> uint32_t countLeadingZeros32(uint32_t x)
> {
> if (x == 0)
> return 32;
> return (__builtin_clz(x)) ;
> }
cond_removal_in_builtin_zero_pattern should have optimized the above but does
not for some reason.
Let me take a look.
* [Bug tree-optimization/104376] Failure to optimize clz equivalent to clz
From: pinskia at gcc dot gnu.org @ 2022-02-09 7:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104376
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
See Also| |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101822,
| |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99997,
| |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71016
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #4)
> cond_removal_in_builtin_zero_pattern should have optimized the above but
> does not for some reason.
> Let me take a look.
So one problem is we have:
  <bb 2> [local count: 1073741824]:
  if (x_3(D) == 0)
    goto <bb 4>; [21.72%]
  else
    goto <bb 3>; [78.28%]

  <bb 3> [local count: 840525097]:
  _1 = __builtin_clz (x_3(D));
  _4 = (uint32_t) _1;

  <bb 4> [local count: 1073741824]:
  # _2 = PHI <32(2), _4(3)>
We don't handle this in cond_removal_in_builtin_zero_pattern because of the
conversion in bb 3. This is similar to PR 99997 and PR 101822; that is, the
code which was added to fix PR 71016 is getting in the way.
* [Bug tree-optimization/104376] Failure to optimize clz equivalent to clz
From: pinskia at gcc dot gnu.org @ 2023-05-06 21:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104376
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |NEW
Depends on| |101822
--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The cast issue is basically PR 101822.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101822
[Bug 101822] Codegen bug for popcount
* [Bug tree-optimization/104376] Failure to optimize clz equivalent to clz
From: pinskia at gcc dot gnu.org @ 2023-10-15 18:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104376
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I have a fix for the secondary issue which does not cause PR 71016 to show up
again.
Basically, we should always allow nop conversions in
factor_out_conditional_operation.
* [Bug tree-optimization/104376] Failure to optimize clz equivalent to clz
From: pinskia at gcc dot gnu.org @ 2023-10-15 19:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104376
--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 56117
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56117&action=edit
Patch which I am testing to fix the second issue
* [Bug tree-optimization/104376] Failure to optimize clz equivalent to clz
From: cvs-commit at gcc dot gnu.org @ 2023-10-24 11:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104376
--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Andrew Pinski <pinskia@gcc.gnu.org>:
https://gcc.gnu.org/g:0fc13e8c0e39c51e82deb93f324d9d86ad8d7460
commit r14-4889-g0fc13e8c0e39c51e82deb93f324d9d86ad8d7460
Author: Andrew Pinski <pinskia@gmail.com>
Date: Sun Oct 15 19:15:38 2023 +0000
Improve factor_out_conditional_operation for conversions and constants
In the case of a NOP conversion (precisions of the 2 types are equal),
factoring out the conversion can be done even if int_fits_type_p returns
false and even when the conversion is defined by a statement inside the
conditional. Since it is a NOP conversion there is no zero/sign extending
happening which is why it is ok to be done here; we were trying to prevent
an extra sign/zero extend from being moved away from definition which no-op
conversions are not.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
PR tree-optimization/104376
PR tree-optimization/101541
* tree-ssa-phiopt.cc (factor_out_conditional_operation):
Allow nop conversions even if it is defined by a statement
inside the conditional.
gcc/testsuite/ChangeLog:
PR tree-optimization/101541
* gcc.dg/tree-ssa/phi-opt-39.c: New test.
* [Bug tree-optimization/104376] Failure to optimize clz equivalent to clz
From: pinskia at gcc dot gnu.org @ 2023-10-24 11:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104376
Bug 104376 depends on bug 101822, which changed state.
Bug 101822 Summary: Codegen bug for popcount
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101822
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED