public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/45215] Tree-optimization misses a trick with bit tests
[not found] <bug-45215-4@http.gcc.gnu.org/bugzilla/>
@ 2021-06-08 9:44 ` pinskia at gcc dot gnu.org
2024-07-19 23:05 ` [Bug middle-end/45215] " pinskia at gcc dot gnu.org
2024-07-22 9:39 ` rguenther at suse dot de
2 siblings, 0 replies; 3+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-06-08 9:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45215
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|tree-optimization |rtl-optimization
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note that on trunk I have changed the code slightly so that a cmove is generated.
With the cmove we could simplify the following RTL:
Trying 27, 28 -> 29:
27: {flags:CCZ=cmp(r86:SI&0x100,0);r82:SI=r86:SI&0x100;}
REG_DEAD r86:SI
28: r85:SI=0xffffffffffffffe6
29: r82:SI={(flags:CCZ==0)?r82:SI:r85:SI}
REG_DEAD r85:SI
REG_DEAD flags:CCZ
Failed to match this instruction:
(set (reg/v:SI 82 [ tt ])
(if_then_else:SI (eq (zero_extract:SI (reg:SI 86)
(const_int 1 [0x1])
(const_int 8 [0x8]))
(const_int 0 [0]))
(and:SI (reg:SI 86)
(const_int 256 [0x100]))
(const_int -26 [0xffffffffffffffe6])))
But that would be a 3->3 combination, which I don't know whether combine does; I
know it does 3->1 and 3->2:
andl $256, %edi
movl $-26, %eax
cmovne %eax, %edi
I also don't know the cost of doing cmov vs. the shifts here, though.
I know that for aarch64 it is worse, but that should have been modeled already.
* [Bug middle-end/45215] Tree-optimization misses a trick with bit tests
[not found] <bug-45215-4@http.gcc.gnu.org/bugzilla/>
2021-06-08 9:44 ` [Bug rtl-optimization/45215] Tree-optimization misses a trick with bit tests pinskia at gcc dot gnu.org
@ 2024-07-19 23:05 ` pinskia at gcc dot gnu.org
2024-07-22 9:39 ` rguenther at suse dot de
2 siblings, 0 replies; 3+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-07-19 23:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45215
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |pinskia at gcc dot gnu.org
Component|tree-optimization |middle-end
Status|NEW |ASSIGNED
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
_1 = t_3(D) & 256;
if (_1 != 0)
goto <bb 4>; [1.04%]
else
goto <bb 3>; [98.96%]
<bb 3> [local count: 1062574912]:
<bb 4> [local count: 1073741824]:
# _2 = PHI <-26(2), 0(3)>
So the trick here is that 256 is `0x1<<8`, so we want to shift that bit up to
the sign bit, arithmetic-shift it back down to get 0/-1, and then AND with -26.
It seems like we could do this in ifcvt.
We do find the block:
IF-THEN-JOIN block found, pass 1, test 2, then 3, join 4
IF-CASE-2 found, start 2, else 3
but we don't optimize it (on aarch64 we use csel).
We could do this on the gimple level with a match pattern.
(simplify
(cond (ne (and @0 integer_pow2p@1) integer_zerop) INTEGER_CST@2 integer_zerop)
....
Maybe only do this late, and only if shifts are cheap (though there are no
predicates for that yet).
* [Bug middle-end/45215] Tree-optimization misses a trick with bit tests
[not found] <bug-45215-4@http.gcc.gnu.org/bugzilla/>
2021-06-08 9:44 ` [Bug rtl-optimization/45215] Tree-optimization misses a trick with bit tests pinskia at gcc dot gnu.org
2024-07-19 23:05 ` [Bug middle-end/45215] " pinskia at gcc dot gnu.org
@ 2024-07-22 9:39 ` rguenther at suse dot de
2 siblings, 0 replies; 3+ messages in thread
From: rguenther at suse dot de @ 2024-07-22 9:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45215
--- Comment #4 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 19 Jul 2024, pinskia at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45215
>
> Andrew Pinski <pinskia at gcc dot gnu.org> changed:
>
> What |Removed |Added
> ----------------------------------------------------------------------------
> CC| |pinskia at gcc dot gnu.org
> Component|tree-optimization |middle-end
> Status|NEW |ASSIGNED
>
> --- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> _1 = t_3(D) & 256;
> if (_1 != 0)
> goto <bb 4>; [1.04%]
> else
> goto <bb 3>; [98.96%]
>
> <bb 3> [local count: 1062574912]:
>
> <bb 4> [local count: 1073741824]:
> # _2 = PHI <-26(2), 0(3)>
>
>
> So the trick here is that 256 is `0x1<<8` so we want to shift that bit up to
> the sign bit and then arithmetic shift down to get 0/-1 and then and with -26.
So .BIT_SPLAT (t_3(D), 8) & -26.  There's nothing special in x86 to help
.BIT_SPLAT though, and back-to-back shifts might be throughput constrained.
I think x86 can do a -1 vs. 0 set from the flags of the and, though.
I'm not sure whether two shifts plus an and are a good way to recover an
optimal branchless insn sequence later.