public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/45215] Tree-optimization misses a trick with bit tests
[not found] <bug-45215-4@http.gcc.gnu.org/bugzilla/>
@ 2021-06-08 9:44 ` pinskia at gcc dot gnu.org
2024-07-19 23:05 ` [Bug middle-end/45215] " pinskia at gcc dot gnu.org
2024-07-22 9:39 ` rguenther at suse dot de
2 siblings, 0 replies; 3+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-06-08 9:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45215
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|tree-optimization |rtl-optimization
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note that on trunk I have changed the code slightly so that a cmove is generated.
With the cmove we could simplify the following RTL:
Trying 27, 28 -> 29:
27: {flags:CCZ=cmp(r86:SI&0x100,0);r82:SI=r86:SI&0x100;}
REG_DEAD r86:SI
28: r85:SI=0xffffffffffffffe6
29: r82:SI={(flags:CCZ==0)?r82:SI:r85:SI}
REG_DEAD r85:SI
REG_DEAD flags:CCZ
Failed to match this instruction:
(set (reg/v:SI 82 [ tt ])
(if_then_else:SI (eq (zero_extract:SI (reg:SI 86)
(const_int 1 [0x1])
(const_int 8 [0x8]))
(const_int 0 [0]))
(and:SI (reg:SI 86)
(const_int 256 [0x100]))
(const_int -26 [0xffffffffffffffe6])))
But that would be a 3->3 combination, which I don't know whether combine does; I
know it does 3->1 and 3->2:
andl $256, %edi
movl $-26, %eax
cmovne %eax, %edi
I also don't know the cost of doing cmov vs. the shifts here, though.
I know that for aarch64 it is worse, but that should have been modeled already.
* [Bug middle-end/45215] Tree-optimization misses a trick with bit tests
[not found] <bug-45215-4@http.gcc.gnu.org/bugzilla/>
2021-06-08 9:44 ` [Bug rtl-optimization/45215] Tree-optimization misses a trick with bit tests pinskia at gcc dot gnu.org
@ 2024-07-19 23:05 ` pinskia at gcc dot gnu.org
2024-07-22 9:39 ` rguenther at suse dot de
2 siblings, 0 replies; 3+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-07-19 23:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45215
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |pinskia at gcc dot gnu.org
Component|tree-optimization |middle-end
Status|NEW |ASSIGNED
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
_1 = t_3(D) & 256;
if (_1 != 0)
goto <bb 4>; [1.04%]
else
goto <bb 3>; [98.96%]
<bb 3> [local count: 1062574912]:
<bb 4> [local count: 1073741824]:
# _2 = PHI <-26(2), 0(3)>
So the trick here is that 256 is `0x1<<8`, so we want to shift that bit up to
the sign bit, arithmetic-shift it back down to get 0/-1, and then AND with -26.
It seems like we could do this in ifcvt.
We do find the block:
IF-THEN-JOIN block found, pass 1, test 2, then 3, join 4
IF-CASE-2 found, start 2, else 3
but we don't optimize it (on aarch64 we use csel).
We could do this on the gimple level with a match pattern.
(simplify
(cond (ne (and @0 integer_pow2p@1) integer_zerop) INTEGER_CST@2 integer_zerop)
....
Maybe only do this late, and only if shifts are cheap (though there are no
predicates for that yet).
* [Bug middle-end/45215] Tree-optimization misses a trick with bit tests
[not found] <bug-45215-4@http.gcc.gnu.org/bugzilla/>
2021-06-08 9:44 ` [Bug rtl-optimization/45215] Tree-optimization misses a trick with bit tests pinskia at gcc dot gnu.org
2024-07-19 23:05 ` [Bug middle-end/45215] " pinskia at gcc dot gnu.org
@ 2024-07-22 9:39 ` rguenther at suse dot de
2 siblings, 0 replies; 3+ messages in thread
From: rguenther at suse dot de @ 2024-07-22 9:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45215
--- Comment #4 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 19 Jul 2024, pinskia at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45215
>
> Andrew Pinski <pinskia at gcc dot gnu.org> changed:
>
> What |Removed |Added
> ----------------------------------------------------------------------------
> CC| |pinskia at gcc dot gnu.org
> Component|tree-optimization |middle-end
> Status|NEW |ASSIGNED
>
> --- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> _1 = t_3(D) & 256;
> if (_1 != 0)
> goto <bb 4>; [1.04%]
> else
> goto <bb 3>; [98.96%]
>
> <bb 3> [local count: 1062574912]:
>
> <bb 4> [local count: 1073741824]:
> # _2 = PHI <-26(2), 0(3)>
>
>
> So the trick here is that 256 is `0x1<<8` so we want to shift that bit up to
> the sign bit and then arithmetic shift down to get 0/-1 and then and with -26.
So .BIT_SPLAT (t_3(D), 8) & -26.  There's nothing special in x86 to help
.BIT_SPLAT though, and back-to-back shifts might be throughput constrained.
I think x86 can do a -1 vs. 0 set from the flags of the and, though.
I'm not sure whether two shifts plus an and are a good way to recover an
optimal branchless insn sequence later.