public inbox for gcc-bugs@sourceware.org
* [Bug c/109463] New: suboptimal sequence for converting 64-bit unsigned int to float
From: elronnd at elronnd dot net @ 2023-04-10 10:45 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109463
Bug ID: 109463
Summary: suboptimal sequence for converting 64-bit unsigned int to float
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: elronnd at elronnd dot net
Target Milestone: ---
double f(uint64_t x) { return x; } gives:
test rdi,rdi
js 10 <f+0x10>
pxor xmm0,xmm0
cvtsi2sd xmm0,rdi
ret
nop
10:
mov rax,rdi
and edi,0x1
pxor xmm0,xmm0
shr rax,1
or rax,rdi
cvtsi2sd xmm0,rax
addsd xmm0,xmm0
ret
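For readers who prefer C to assembly, the listing above can be read roughly as the following sketch. The function name is hypothetical, and the sketch assumes x86-64 with SSE2 and the default round-to-nearest mode; it is an illustration of the sequence, not GCC's actual expander.

```c
#include <stdint.h>

/* Sketch of the branchy sequence (hypothetical name, not GCC source). */
double u64_to_double_branchy(uint64_t x)
{
    /* test rdi,rdi / js: bit 63 clear, so a signed conversion is exact enough. */
    if ((x >> 63) == 0)
        return (double)(int64_t)x;              /* cvtsi2sd xmm0,rdi */

    /* Cold path: halve x, folding the dropped bit 0 back in as a sticky bit
       so that the single rounding in cvtsi2sd still sees it. */
    uint64_t half = (x >> 1) | (x & 1);         /* shr / and / or */
    double d = (double)(int64_t)half;           /* cvtsi2sd xmm0,rax */
    return d + d;                               /* addsd xmm0,xmm0 (exact) */
}
```

The doubling at the end is exact, so the only rounding happens in the conversion of the halved value.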
In particular, the sequence:
mov rax,rdi
and edi,0x1
shr rax,1
or rax,rdi
cvtsi2sd xmm0,rax
can be replaced with:
movzx eax,dil
shr rdi,1
or rdi,rax
cvtsi2sd xmm0,rdi
Since the 9 low bits of the shifted value all lie below the rounding bit (that
is, they are all sticky bits), ORing any of the low bits of rdi into them
suffices to round correctly.
Alternatively, in order to avoid clobbering rdi, use the following sequence:
mov rax,rdi
shr rax,1
or al,dil
cvtsi2sd xmm0,rax
(The penalty for partial register access appears to be very small or
nonexistent on recent uarchs.)
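The equivalence claimed above can be checked with a small C sketch. The helper names are hypothetical, and the sketch assumes the input has bit 63 set (the cold path) and round-to-nearest mode. With bit 63 set, the halved value has its leading bit at position 62, so the 53-bit significand covers bits 62..10, bit 9 is the rounding bit, and bits 8..0 are sticky. ORing the low byte of x only touches bits 7..0.

```c
#include <stdint.h>

/* Current cold path: fold only bit 0 of x into the halved value. */
uint64_t halve_sticky_bit0(uint64_t x)
{
    return (x >> 1) | (x & 1);
}

/* Proposed cold path: fold the whole low byte of x
   (movzx eax,dil / or rdi,rax, or equivalently or al,dil). */
uint64_t halve_sticky_byte(uint64_t x)
{
    return (x >> 1) | (x & 0xff);
}

/* Shared tail: signed conversion of the halved value, then doubling. */
double convert_cold(uint64_t halved)
{
    double d = (double)(int64_t)halved;   /* cvtsi2sd */
    return d + d;                         /* addsd */
}
```

Both helpers leave the rounding bit and everything above it untouched, and both produce a nonzero sticky region exactly when the low 10 bits of x are not all zero, so the converted results should agree.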
* [Bug target/109463] suboptimal sequence for converting 64-bit unsigned int to float
From: pinskia at gcc dot gnu.org @ 2023-04-10 10:54 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109463
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
Keywords| |missed-optimization
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
clang/LLVM produces:
movq %rdi, %xmm1
punpckldq .LCPI1_0(%rip), %xmm1 # xmm1 = xmm1[0],mem[0],xmm1[1],mem[1]
subpd .LCPI1_1(%rip), %xmm1
movapd %xmm1, %xmm0
unpckhpd %xmm1, %xmm0 # xmm0 = xmm0[1],xmm1[1]
addsd %xmm1, %xmm0
retq
LCPI1_1 being:
.LCPI1_1:
.quad 0x4330000000000000 # double 4503599627370496
.quad 0x4530000000000000 # double 1.9342813113834067E+25
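The branch-free sequence above can be sketched in C as follows. The function and helper names are hypothetical, and the sketch assumes that .LCPI1_0 (not shown in the comment) supplies the high words 0x43300000 and 0x45300000 that punpckldq interleaves with the two 32-bit halves of x; it also assumes IEEE 754 doubles and round-to-nearest.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 64-bit pattern as a double without aliasing problems. */
static double bits_to_double(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Sketch of the magic-constant conversion (hypothetical name). */
double u64_to_double_magic(uint64_t x)
{
    /* punpckldq: place each 32-bit half of x in the low significand bits
       of a double with a fixed exponent. */
    double lo = bits_to_double(0x4330000000000000ULL | (x & 0xffffffffULL)); /* 2^52 + lo32 */
    double hi = bits_to_double(0x4530000000000000ULL | (x >> 32));           /* 2^84 + hi32 * 2^32 */

    /* subpd: remove the biases; both subtractions are exact. */
    lo -= 0x1p52;
    hi -= 0x1p84;

    /* addsd: the only step that rounds. */
    return hi + lo;
}
```

Because the two partial values are exact and the final addition rounds once, the result should match a direct cast for every input, with no branch on bit 63.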
Note that clang produces this sequence even if you tell it the top bit is not
set, via:
double f(unsigned long x) { if (x >> 63) __builtin_unreachable(); return x; }
* [Bug target/109463] suboptimal sequence for converting 64-bit unsigned int to float
From: pinskia at gcc dot gnu.org @ 2023-04-10 10:56 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109463
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
It might be the case that a 64-bit unsigned integer with the top bit set is
not common enough to optimize for ...
* [Bug target/109463] suboptimal sequence for converting 64-bit unsigned int to float
From: elronnd at elronnd dot net @ 2023-04-10 10:57 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109463
--- Comment #3 from elronnd at elronnd dot net ---
Yes, I think the gcc approach of branching is definitely better. But it's
still a good idea to optimise for size in the cold path.