* [Bug tree-optimization/96703] New: Failure to optimize combined comparison of variables and of variable with 0 to two comparisons with 0
@ 2020-08-19 10:15 gabravier at gmail dot com
  2020-08-24 22:36 ` [Bug tree-optimization/96703] " pinskia at gcc dot gnu.org
  2023-09-04  4:31 ` pinskia at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: gabravier at gmail dot com @ 2020-08-19 10:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96703

            Bug ID: 96703
           Summary: Failure to optimize combined comparison of variables
                    and of variable with 0 to two comparisons with 0
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

bool f(int x, int y)
{
    return x > y && y == 0;
}

This can be optimized to `return (y == 0) && (x > 0);`. The transformation
does not make the code faster by itself, but it probably helps on pipelined
CPUs (the first comparison no longer depends on both variables) and would most
likely make other optimizations easier. LLVM performs this transformation; GCC
does not.
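
A minimal sketch of why the rewrite is safe, assuming plain `int` operands: when
`y == 0` holds, `x > y` and `x > 0` are the same test, and when it does not hold
both forms are false. The names `f_original`, `f_rewritten` and
`count_mismatches` are hypothetical helpers for illustration, not anything taken
from GCC or LLVM.

```c
#include <limits.h>
#include <stdbool.h>

/* Form from the report: the first comparison reads both x and y. */
static bool f_original(int x, int y)
{
    return x > y && y == 0;
}

/* Proposed form: each comparison is against the constant 0. */
static bool f_rewritten(int x, int y)
{
    return y == 0 && x > 0;
}

/* Hypothetical helper: counts sample pairs on which the two forms disagree.
   The sample set is an assumption chosen to cover boundary values. */
static int count_mismatches(void)
{
    static const int samples[] = { INT_MIN, -2, -1, 0, 1, 2, 0x1234, INT_MAX };
    const int n = (int)(sizeof samples / sizeof samples[0]);
    int mismatches = 0;

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (f_original(samples[i], samples[j]) !=
                f_rewritten(samples[i], samples[j]))
                mismatches++;
    return mismatches;
}
```

This only spot-checks boundary values; it says nothing about which form
generates better code on a given target.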


* [Bug tree-optimization/96703] Failure to optimize combined comparison of variables and of variable with 0 to two comparisons with 0
From: pinskia at gcc dot gnu.org @ 2020-08-24 22:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96703

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2020-08-24
           Keywords|                            |easyhack
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed, a small one.


* [Bug tree-optimization/96703] Failure to optimize combined comparison of variables and of variable with 0 to two comparisons with 0
From: pinskia at gcc dot gnu.org @ 2023-09-04  4:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96703

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Hmm for
```
#define cst 0x1234

bool f(int x, int y)
{
    return x > y && y == cst;
}

bool f0(int x, int y)
{
    return x > cst && y == cst;
}
```

currently for GCC on aarch64:
```
f:
        cmp     w0, w1
        mov     w2, 4660
        ccmp    w1, w2, 0, gt
        cset    w0, eq
        ret
f0:
        mov     w2, 4660
        cmp     w0, w2
        ccmp    w1, w2, 0, gt
        cset    w0, eq
        ret
```
f is actually better because the first cmp is independent of the mov.
So on a dual-issue CPU, f would almost always be better, even if the mov does
not occupy an issue slot.

For RISC-V, not doing the transformation is actually better:
        li      a5,4096
        addi    a5,a5,564
        sub     a5,a1,a5
        sgt     a0,a0,a1
        seqz    a5,a5
        and     a0,a5,a0
        ret

vs
        li      a5,4096
        addi    a5,a5,564
        sub     a1,a1,a5
        seqz    a1,a1
        sgt     a0,a0,a5
        and     a0,a1,a0
        ret

Without the transformation, the sgt is independent of forming the constant.

Now 0 could be handled as a special case because most targets handle 0 nicely.

I see that doing it is better for Power, but I don't know whether that holds in
general or only for the constants I tried.
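
For reference, a small sketch showing that the two source forms compared above
agree for any constant, so the choice between them is purely a code-generation
question. The constant is passed as a parameter `c` here, which is an
assumption made for testing; `cmp_var`, `cmp_cst` and `count_mismatches_cst`
are hypothetical names.

```c
#include <limits.h>
#include <stdbool.h>

/* Shape of f above: compare x against y, then y against the constant. */
static bool cmp_var(int x, int y, int c)
{
    return x > y && y == c;
}

/* Shape of f0 above: both comparisons use the constant. */
static bool cmp_cst(int x, int y, int c)
{
    return x > c && y == c;
}

/* Hypothetical helper: counts (x, y, c) sample triples on which the two
   shapes disagree. The sample set is an assumption, not exhaustive. */
static int count_mismatches_cst(void)
{
    static const int samples[] = { INT_MIN, -1, 0, 1, 0x1233, 0x1234, 0x1235,
                                   INT_MAX };
    const int n = (int)(sizeof samples / sizeof samples[0]);
    int mismatches = 0;

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++)
                if (cmp_var(samples[i], samples[j], samples[k]) !=
                    cmp_cst(samples[i], samples[j], samples[k]))
                    mismatches++;
    return mismatches;
}
```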
