* unneeded spills of SF values on xtensa with FPU
@ 2024-03-25 22:31 Max Filippov
From: Max Filippov @ 2024-03-25 22:31 UTC (permalink / raw)
To: Takayuki 'January June' Suwa; +Cc: gcc
Hi Suwa-san,
I've noticed that in xtensa configurations with a hardware FPU,
function arguments of type float are spilled to the stack although
there's no need for that. E.g. the following function:
int f(float a, float b)
{
    return a < b;
}
translates to the following with -O2:
f:
        entry   sp, 48
        wfr     f0, a2
        wfr     f1, a3
        s32i.n  a2, sp, 0
        olt.s   b0, f0, f1
        movi.n  a8, 0
        movi.n  a2, 1
        s32i.n  a3, sp, 4
        movf    a2, a8, b0
        retw.n
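For reference, dropping the two s32i.n stores from the listing above would give the sequence below. This is a hand-edited sketch derived from the output above, not something the compiler produced; the frame size in the entry instruction is simply left as it was.

```
f:
        entry   sp, 48
        wfr     f0, a2
        wfr     f1, a3
        olt.s   b0, f0, f1
        movi.n  a8, 0
        movi.n  a2, 1
        movf    a2, a8, b0
        retw.n
```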
The relevant RTL looks like this at the end of IRA:
(insn 18 4 19 2 (set (reg:SF 51)
(reg:SF 2 a2 [ a ])) "test2.c":2:1 61 {movsf_internal}
(expr_list:REG_DEAD (reg:SF 2 a2 [ a ])
(nil)))
(insn 19 18 7 2 (set (reg:SF 52)
(reg:SF 3 a3 [ b ])) "test2.c":2:1 61 {movsf_internal}
(expr_list:REG_DEAD (reg:SF 3 a3 [ b ])
(nil)))
(insn 7 19 21 2 (set (reg:CC 18 b0)
(lt:CC (reg:SF 51)
(reg:SF 52))) "test2.c":3:11 100 {slt_sf}
(expr_list:REG_DEAD (reg:SF 52)
(expr_list:REG_DEAD (reg:SF 51)
(nil))))
and it is transformed to the following by the end of LRA:
(insn 18 4 19 2 (set (mem/c:SF (reg/f:SI 1 sp) [1 %sfp+0 S4 A32])
(reg:SF 2 a2 [ a ])) "test2.c":2:1 61 {movsf_internal}
(nil))
(insn 19 18 24 2 (set (mem/c:SF (plus:SI (reg/f:SI 1 sp)
(const_int 4 [0x4])) [1 %sfp+4 S4 A32])
(reg:SF 3 a3 [ b ])) "test2.c":2:1 61 {movsf_internal}
(nil))
(insn 24 19 25 2 (set (reg:SF 19 f0 [51])
(mem/c:SF (reg/f:SI 1 sp) [1 %sfp+0 S4 A32])) "test2.c":3:11 61 {movsf_internal}
(nil))
(insn 25 24 7 2 (set (reg:SF 20 f1 [52])
(mem/c:SF (plus:SI (reg/f:SI 1 sp)
(const_int 4 [0x4])) [1 %sfp+4 S4 A32])) "test2.c":3:11 61 {movsf_internal}
(nil))
(insn 7 25 21 2 (set (reg:CC 18 b0)
(lt:CC (reg:SF 19 f0 [51])
(reg:SF 20 f1 [52]))) "test2.c":3:11 100 {slt_sf}
(nil))
LRA stops checking alternatives for insns 18 and 19 at the s32i.n
alternative, but even if I move the wfr alternative to the head of the
movsf_internal alternatives list, it still loses to s32i.n.
The postreload pass replaces the lsi instructions (insns 24 and 25)
with wfr from a2 and a3, but doesn't remove the spills.
I wonder what can be done about that?
--
Thanks.
-- Max