public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/94837] New: Failure to optimize out spurious movbe into bswap
From: gabravier at gmail dot com @ 2020-04-29 1:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94837
Bug ID: 94837
Summary: Failure to optimize out spurious movbe into bswap
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---
#include <stdint.h>

/* Reverse the byte order of a float by punning it through a union. */
float swapFloat(float x)
{
    union
    {
        float f;
        uint32_t u32;
    } swapper;
    swapper.f = x;
    swapper.u32 = __builtin_bswap32(swapper.u32);
    return swapper.f;
}
For this function, on x86-64 with `-O3 -mmovbe`, LLVM outputs this:
swapFloat(float): # @swapFloat(float)
        movd eax, xmm0
        bswap eax
        movd xmm0, eax
        ret
GCC instead outputs this:
swapFloat(float):
        movd DWORD PTR [rsp-4], xmm0
        movbe eax, DWORD PTR [rsp-4]
        movd xmm0, eax
        ret
It seems highly likely to me that a spill to memory is much slower than a
direct `bswap`.
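For anyone who wants to compare code generation across formulations, the same byte swap can also be written with `memcpy` instead of a union. This is only a sketch: the name `swapFloatMemcpy` is hypothetical and does not appear in the report, and the thread does not say what code GCC generates for this form. It assumes `float` is 32 bits wide.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the report): same operation as swapFloat,
   using memcpy for the type pun. Assumes sizeof(float) == sizeof(uint32_t). */
float swapFloatMemcpy(float x)
{
    uint32_t u32;
    memcpy(&u32, &x, sizeof u32);   /* copy the float's bits into an integer */
    u32 = __builtin_bswap32(u32);   /* reverse the order of the four bytes */
    memcpy(&x, &u32, sizeof x);     /* copy the swapped bits back into the float */
    return x;
}
```

Compiling both versions with the same flags (for example `-O3 -mmovbe`) and comparing the assembly would show whether the store to the stack also appears here.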
* [Bug rtl-optimization/94837] Failure to optimize out spurious movbe into bswap
From: pinskia at gcc dot gnu.org @ 2020-04-29 1:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94837
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|UNCONFIRMED                 |RESOLVED
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This is on purpose.
Use -mtune=intel to get the result you want.
See PR 54593 for the reason why.
*** This bug has been marked as a duplicate of bug 54593 ***
* [Bug rtl-optimization/94837] Failure to optimize out spurious movbe into bswap
From: gabravier at gmail dot com @ 2020-04-29 8:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94837
--- Comment #2 from Gabriel Ravier <gabravier at gmail dot com> ---
This is what I get with `-O3 -mmovbe -mtune=intel`:
swapFloat(float):
        movd DWORD PTR [rsp-4], xmm0
        movbe eax, DWORD PTR [rsp-4]
        movd xmm0, eax
        ret
This seems erroneous.
* [Bug rtl-optimization/94837] Failure to optimize out spurious movbe into bswap
From: gabravier at gmail dot com @ 2020-04-29 8:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94837
--- Comment #3 from Gabriel Ravier <gabravier at gmail dot com> ---
Also, I've tested the code from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593 and the optimization in
question is no longer applied with `-mtune=generic`, only with specific
architectures such as `-mtune=k8`.
* [Bug rtl-optimization/94837] Failure to optimize out spurious movbe into bswap
From: ubizjak at gmail dot com @ 2020-04-29 9:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94837
Uroš Bizjak <ubizjak at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|missed-optimization         |ra
                 CC|                            |vmakarov at gcc dot gnu.org
   Last reconfirmed|                            |2020-04-29
         Resolution|DUPLICATE                   |---
             Status|RESOLVED                    |NEW
     Ever confirmed|0                           |1
--- Comment #4 from Uroš Bizjak <ubizjak at gmail dot com> ---
Looks like a register allocation (RA) problem, possibly a tuning issue.
We enter reload (-O2 -mmovbe -mtune=intel) with:
(insn 14 4 2 2 (set (reg:SF 87)
        (reg:SF 20 xmm0 [ x ])) "pr94837.c":2:1 112 {*movsf_internal}
     (expr_list:REG_DEAD (reg:SF 20 xmm0 [ x ])
        (nil)))
(insn 7 6 11 2 (set (subreg:SI (reg:SF 84 [ <retval> ]) 0)
        (bswap:SI (subreg:SI (reg:SF 87) 0))) "pr94837.c":11:19 869 {*bswapsi2_movbe}
     (expr_list:REG_DEAD (reg:SF 87)
        (nil)))
(insn 11 7 12 2 (set (reg/i:SF 20 xmm0)
        (reg:SF 84 [ <retval> ])) "pr94837.c":12:1 112 {*movsf_internal}
     (expr_list:REG_DEAD (reg:SF 84 [ <retval> ])
        (nil)))
and this sequence gets reloaded to:
(insn 17 6 7 2 (set (mem/c:SI (plus:DI (reg/f:DI 7 sp)
                (const_int -4 [0xfffffffffffffffc])) [1 %sfp+-4 S4 A32])
        (reg:SI 20 xmm0 [87])) "pr94837.c":11:19 67 {*movsi_internal}
     (nil))
(insn 7 17 16 2 (set (reg:SI 0 ax [88])
        (bswap:SI (mem/c:SI (plus:DI (reg/f:DI 7 sp)
                    (const_int -4 [0xfffffffffffffffc])) [1 %sfp+-4 S4 A32])))
        "pr94837.c":11:19 869 {*bswapsi2_movbe}
     (nil))
(insn 16 7 12 2 (set (reg:SI 20 xmm0 [orig:84 <retval> ] [84])
        (reg:SI 0 ax [88])) "pr94837.c":11:19 67 {*movsi_internal}
     (nil))
One would expect the register allocator to choose alternative 0 (the
register-only `bswap` form) from:
(define_insn "*bswap<mode>2_movbe"
  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=r,r,m")
        (bswap:SWI48 (match_operand:SWI48 1 "nonimmediate_operand" "0,m,r")))]
  "TARGET_MOVBE
   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
  "@
    bswap\t%0
    movbe{<imodesuffix>}\t{%1, %0|%0, %1}
    movbe{<imodesuffix>}\t{%1, %0|%0, %1}"
but for some reason this is not the case.
* [Bug rtl-optimization/94837] Failure to optimize out spurious movbe into bswap
From: ubizjak at gmail dot com @ 2020-04-29 9:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94837
--- Comment #5 from Uroš Bizjak <ubizjak at gmail dot com> ---
This is probably a secondary effect of subregs on register allocation:
changing "float" to "int" in the original testcase gets us the expected
alternative and optimal code using BSWAP.
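As a rough illustration of the variant described above, here is the testcase with "float" replaced by "int". The function name `swapInt` and the member name `i` are hypothetical; the thread does not show the exact code that was compiled, so treat this as a sketch of the experiment rather than the testcase itself.

```c
#include <stdint.h>

/* Hypothetical integer variant of the testcase (names are assumptions).
   With no float involved, the value never has to live in an SSE register. */
int swapInt(int x)
{
    union
    {
        int i;
        uint32_t u32;
    } swapper;
    swapper.i = x;
    swapper.u32 = __builtin_bswap32(swapper.u32);  /* reverse the four bytes */
    return swapper.i;
}
```

Since both the argument and the return value are integers here, the operand of the byte swap does not need the SF-to-SI subregs seen in the RTL dump for the float version.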