public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/71336] Suboptimal x86 code generated for "(a & 1) ? (CST1 + CST2) : CST1"
From: pinskia at gcc dot gnu.org @ 2021-07-20  0:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71336

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2016-05-30 00:00:00         |2021-7-19

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Something like (very much pseudo code):
(simplify
 (cond (eq (bit_and@0 @1 int_pow2p@2) integer_zerop) INTEGER_CST@3
       INTEGER_CST@4)
 (switch
  (if (@3 u> @4 && exact_pow2(@3 - @4))
   (convert (plus (mult @0:typeu (minus @3:typeu @4:typeu)) @4:typeu)))
  (if (@4 u> @3 && exact_pow2(@4 - @3))
   (convert (minus (mult @0:typeu (minus @4:typeu @3:typeu)) @3:typeu))
 )))


* [Bug rtl-optimization/71336] Suboptimal x86 code generated for "(a & 1) ? (CST1 + CST2) : CST1"
From: pinskia at gcc dot gnu.org @ 2021-07-20  0:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71336

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Sorry, this one is more correct.
Something like (very much pseudo code):
(simplify
 (cond (eq (bit_and@0 @1 int_pow2p@2) integer_zerop) INTEGER_CST@3
       INTEGER_CST@4)
 (switch
  (if (@3 u> @4 && exact_pow2(@3 - @4))
   (convert (plus (mult (rshift @0:typeu @2) (minus @3:typeu @4:typeu))
                  @4:typeu)))
  (if (@4 u> @3 && exact_pow2(@4 - @3))
   (convert (minus (mult (rshift @0:typeu @2) (minus @4:typeu @3:typeu))
                   @3:typeu))
 )))

The mult could be changed to lshift to be more correct.
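
The intent of the pattern can be sketched in plain C. This is only an
illustration under assumptions, not the match.pd implementation: the helper
names (pick_ref, pick_shift, log2_pow2, self_check) are hypothetical, and the
tested bit is shifted down by log2 of the mask before being scaled by the
power-of-two difference of the two constants.

```c
#include <assert.h>

/* Reference form: pick one of two constants based on one bit of a.
   mask is assumed to have exactly one bit set. */
static unsigned pick_ref(unsigned a, unsigned mask,
                         unsigned if_set, unsigned if_clear)
{
    return (a & mask) ? if_set : if_clear;
}

/* log2 of a value that has exactly one bit set. */
static unsigned log2_pow2(unsigned v)
{
    unsigned n = 0;
    while (v > 1) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Branchless form. Only valid when the two constants differ by a
   (non-zero) power of two. In a compiler both constants are known, so
   the comparison below would be resolved at compile time. */
static unsigned pick_shift(unsigned a, unsigned mask,
                           unsigned if_set, unsigned if_clear)
{
    unsigned bit = (a & mask) >> log2_pow2(mask);   /* 0 or 1 */
    if (if_set > if_clear)
        return if_clear + (bit << log2_pow2(if_set - if_clear));
    return if_clear - (bit << log2_pow2(if_clear - if_set));
}

/* Compare both forms over a small input range. */
static void self_check(void)
{
    for (unsigned a = 0; a < 64; a++) {
        assert(pick_shift(a, 1u, 7u, 3u) == pick_ref(a, 1u, 7u, 3u));
        assert(pick_shift(a, 8u, 3u, 7u) == pick_ref(a, 8u, 3u, 7u));
        assert(pick_shift(a, 4u, 20u, 4u) == pick_ref(a, 4u, 20u, 4u));
    }
}
```

Using a left shift here instead of a multiply corresponds to the mult-to-lshift
remark above.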


* [Bug tree-optimization/71336] Suboptimal x86 code generated for "(a & 1) ? (CST1 + CST2) : CST1"
From: pinskia at gcc dot gnu.org @ 2021-07-20  0:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71336

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|rtl-optimization            |tree-optimization

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #3)
> I guess it could be useful even without lea.

Pretty much.
Even on aarch64 we get for the original case:
        ubfiz   w0, w0, 2, 1
        add     w0, w0, 3
        ret
vs
        tst     x0, 1
        mov     w2, 7
        mov     w1, 3
        csel    w0, w2, w1, ne
        ret

For something slightly different:
unsigned test(unsigned a) {
    return a & 4 ? 7 : 3;
}

unsigned test1(unsigned a)
{
  a &= 4;
  a >>= 2;
  return a*4 + 3;
}

----
We get:
        tst     x0, 4
        mov     w2, 7
        mov     w1, 3
        csel    w0, w2, w1, ne
        ret
vs
        and     w0, w0, 4
        add     w0, w0, 3
        ret
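
In this second example the mask (4) happens to equal the difference of the
constants (7 - 3), so the shift down and the scale back up cancel, which is
consistent with only an and plus an add remaining. A small C sketch of the
three equivalent forms follows. The names cond_form and shift_form stand in
for test and test1 above; and_add_form and check_forms are additions made
here for illustration and do not appear in the bug report.

```c
#include <assert.h>

/* Conditional form (test above). */
static unsigned cond_form(unsigned a)
{
    return a & 4 ? 7 : 3;
}

/* Isolate the bit, normalize it to 0/1, then scale by 4 (test1 above). */
static unsigned shift_form(unsigned a)
{
    a &= 4;
    a >>= 2;
    return a * 4 + 3;
}

/* Hypothetical simplified form: a & 4 is already 0 or 4, so the
   >> 2 followed by * 4 is a no-op. */
static unsigned and_add_form(unsigned a)
{
    return (a & 4) + 3;
}

/* Assert that all three forms agree for every a in [0, limit). */
static void check_forms(unsigned limit)
{
    for (unsigned a = 0; a < limit; a++) {
        assert(cond_form(a) == shift_form(a));
        assert(cond_form(a) == and_add_form(a));
    }
}
```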


* [Bug tree-optimization/71336] Suboptimal code generated for "(a & 1) ? (CST1 + CST2) : CST1"
From: pinskia at gcc dot gnu.org @ 2021-07-20  0:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71336

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement


* [Bug tree-optimization/71336] Suboptimal code generated for "(a & 1) ? (CST1 + CST2) : CST1"
From: pinskia at gcc dot gnu.org @ 2021-07-20  0:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71336

--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #6)
> Sorry, this one is more correct.
> Something like (very much pseudo code):
> (simplify
>  (cond (eq (bit_and@0 @1 int_pow2p@2) integer_zerop) INTEGER_CST@3
>        INTEGER_CST@4)
>  (switch
>   (if (@3 u> @4 && exact_pow2(@3 - @4))
>    (convert (plus (mult (rshift @0:typeu @2) (minus @3:typeu @4:typeu))
>                   @4:typeu)))
>   (if (@4 u> @3 && exact_pow2(@4 - @3))
>    (convert (minus (mult (rshift @0:typeu @2) (minus @4:typeu @3:typeu))
>                    @3:typeu))
>  )))
> 
> The mult could be changed to lshift to be more correct.

I should note that the above is only possible now because of the recent changes
I made to phi-opt.  You no longer need to open-code anything in phi-opt for
things like this bug; just add them to match.pd and be done with it :).


* [Bug tree-optimization/71336] Suboptimal code generated for "(a & 1) ? (CST1 + CST2) : CST1"
From: pinskia at gcc dot gnu.org @ 2023-04-09 23:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71336

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2021-07-19 00:00:00         |2023-4-9

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Take:
```
int test(int a) {
    return a & 1 ? 7 : 3;
}
int test1(int a) {
    int t = (a & 1) ? 4 : 0;
    return t+3;
}
int test2(int a) {
    int t = a & 1;
    t *= 4;
    return t+3;
}
```
These three all produce different code on x86_64, with test2 producing the
best. On aarch64, test1 and test2 produce the same decent code, since the
and/shift has been merged. On riscv, test1 and test2 also produce the same
decent code: 3 instructions and no branches.


* [Bug tree-optimization/71336] Suboptimal code generated for "(a & 1) ? (CST1 + CST2) : CST1"
From: pinskia at gcc dot gnu.org @ 2023-05-06 21:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71336

--- Comment #10 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So it should be possible to extend the match pattern for what used to be
two_value_replacement to do this too.
The main change is that, instead of the two constants differing by 1, their
difference should be a power of 2.
We could also extend it so that the (a !=/== CST) part handles an a that has
only one non-zero bit, in addition to an a whose range covers two values.

That would be better than what I was proposing in comment #6.
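
To make the idea concrete, here is a hedged C sketch, assuming a is known to be
either lo or lo + 1 (the two-value range). The function names are hypothetical
and this is not GCC's implementation; it only shows how a difference-by-1
rewrite could generalize when the constants differ by 1 << k.

```c
#include <assert.h>

/* Reference: a is assumed to be lo or lo + 1. */
static unsigned two_value_ref(unsigned a, unsigned lo,
                              unsigned c_lo, unsigned c_hi)
{
    return a == lo ? c_lo : c_hi;
}

/* Constants differ by 1 (c_hi == c_lo + 1): a plain offset suffices. */
static unsigned two_value_adjacent(unsigned a, unsigned lo, unsigned c_lo)
{
    return (a - lo) + c_lo;
}

/* Constants differ by 1 << k: scale the 0/1 offset before adding. */
static unsigned two_value_pow2(unsigned a, unsigned lo,
                               unsigned c_lo, unsigned k)
{
    return ((a - lo) << k) + c_lo;
}

/* Compare the rewritten forms against the reference for a few ranges. */
static void self_check(void)
{
    for (unsigned lo = 0; lo < 8; lo++) {
        for (unsigned a = lo; a <= lo + 1; a++) {
            assert(two_value_adjacent(a, lo, 10u)
                   == two_value_ref(a, lo, 10u, 11u));
            assert(two_value_pow2(a, lo, 3u, 2u)
                   == two_value_ref(a, lo, 3u, 7u));
        }
    }
}
```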


* [Bug tree-optimization/71336] Suboptimal code generated for "(a & 1) ? (CST1 + CST2) : CST1"
From: pinskia at gcc dot gnu.org @ 2023-08-23  1:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71336

--- Comment #11 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #10)
> So it should be possible to extend the match pattern for what used to be
> two_value_replacement to do this too.
> The main change is that, instead of the two constants differing by 1, their
> difference should be a power of 2.
> We could also extend it so that the (a !=/== CST) part handles an a that has
> only one non-zero bit, in addition to an a whose range covers two values.
> 
> That would be better than what I was proposing in comment #6.

Actually we already have a pattern for:
(zero_one == 0) ? y : z <op> y
(zero_one != 0) ? z <op> y : y

It is just that in this case y and z are constants.
And handling an a with only one bit set in `a != 0` has been mentioned in
other places too ...
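
For reference, a C sketch of what such a pattern amounts to, on the assumption
that it rewrites the conditional to (zero_one * z) <op> y for <op> in +, |
and ^ (the function names are illustrative only). With the constants y = 3 and
z = 4, the + case gives the 7/3 selection from this bug.

```c
#include <assert.h>

/* zero_one is assumed to be 0 or 1 in all functions below. */

static unsigned cond_plus(unsigned zero_one, unsigned y, unsigned z)
{
    return zero_one != 0 ? z + y : y;
}

static unsigned flat_plus(unsigned zero_one, unsigned y, unsigned z)
{
    return (zero_one * z) + y;
}

static unsigned cond_ior(unsigned zero_one, unsigned y, unsigned z)
{
    return zero_one != 0 ? (z | y) : y;
}

static unsigned flat_ior(unsigned zero_one, unsigned y, unsigned z)
{
    return (zero_one * z) | y;
}

static unsigned cond_xor(unsigned zero_one, unsigned y, unsigned z)
{
    return zero_one != 0 ? (z ^ y) : y;
}

static unsigned flat_xor(unsigned zero_one, unsigned y, unsigned z)
{
    return (zero_one * z) ^ y;
}

/* Compare conditional and flattened forms for small y and z. */
static void self_check(void)
{
    for (unsigned b = 0; b <= 1; b++)
        for (unsigned y = 0; y < 16; y++)
            for (unsigned z = 0; z < 16; z++) {
                assert(cond_plus(b, y, z) == flat_plus(b, y, z));
                assert(cond_ior(b, y, z) == flat_ior(b, y, z));
                assert(cond_xor(b, y, z) == flat_xor(b, y, z));
            }
}
```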
