* [Bug middle-end/68000] Suboptimal ternary operator codegen
From: pinskia at gcc dot gnu.org @ 2015-10-17 6:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68000
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
          Component|c                           |middle-end
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note I think the trunk already has improved code generation.
* [Bug middle-end/68000] Suboptimal ternary operator codegen
From: glisse at gcc dot gnu.org @ 2015-10-17 8:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68000
--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> ---
Independently of hoisting, in
    mov eax, edx
    add edx, 1
    add eax, 1
we apparently fail to CSE the two additions because, at the time of CSE, one
addition is done in mode QI and the other in SI. Only in split2 is the QI one
promoted to SI, which is too late for CSE.
Actually, it is quite hard to notice that foo is equivalent to the other
versions. If you wrote:
    return (uint8_t)(p->x + 1) == p->y ? 0 : p->x + 1;
it would be much easier, and indeed I get better code. For the equivalence, the
compiler has to notice that the only case where the cast matters is when x is
255 and y is 0, and in that case both branches (after sinking the cast to
uint8_t from the return type) are equivalent.
* [Bug middle-end/68000] Suboptimal ternary operator codegen
From: thaines.astro at gmail dot com @ 2015-10-18 20:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68000
--- Comment #3 from Tim Haines <thaines.astro at gmail dot com> ---
(In reply to Andrew Pinski from comment #1)
> Note I think the trunk already has improved code generation.
Here is the codegen from the latest trunk build using the same options as
before.
foo_manual_hoist:
        movzx   eax, BYTE PTR [rdi]
        mov     edx, 0
        inc     eax
        cmp     al, BYTE PTR [rdi+1]
        cmove   eax, edx
        ret
foo:
        movzx   edx, BYTE PTR [rdi]
        movzx   ecx, BYTE PTR [rdi+1]
        mov     eax, edx
        inc     edx
        cmp     edx, ecx
        je      .L6
        inc     eax
        ret
.L6:
        xor     eax, eax
        ret
foo_if:
        movzx   eax, BYTE PTR [rdi]
        mov     edx, 0
        inc     eax
        cmp     al, BYTE PTR [rdi+1]
        cmove   eax, edx
        ret
Changing from a cmove to a cmp/je doesn't change the instruction latency
(although I couldn't find the latency for je on Intel; I assume it's 1, like on
AMD), but now the branch predictor will be invoked, bringing possible pipeline
hazards. I don't mean to be overly critical, but I wouldn't consider this to be
an improvement over the previous code, especially since the other two versions
of the function use cmove. As Marc noted, we are still missing CSE here.
NB: The structure offsets are different here because the assembly I originally
posted was poorly anonymized by me. Mea culpa!
* [Bug middle-end/68000] Suboptimal ternary operator codegen
From: pinskia at gcc dot gnu.org @ 2015-10-19 5:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68000
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note I was looking at the code generation for AArch64, which does not have
addition in QImode and does those adds in SImode, so RTL CSE (or RTL GCSE, I
did not look into which one) could remove the extra addition.
* [Bug middle-end/68000] Suboptimal ternary operator codegen
From: pinskia at gcc dot gnu.org @ 2021-06-03 4:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68000
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So take:
    p->x + 1 == p->y
If p->x == 255 and p->y == 0, the above will be false due to C integer
promotion rules: p->x + 1 is evaluated as int, giving 256.
In the other two testcases, the comparison is true when p->x == 255 and
p->y == 0. Basically foo is not the same as the other two.