public inbox for gcc-bugs@sourceware.org
* [Bug c/108255] New: Repeated address-of (lea) not optimized for size.
@ 2022-12-30 22:30 witold.baryluk+gcc at gmail dot com
2022-12-30 23:10 ` [Bug target/108255] " pinskia at gcc dot gnu.org
From: witold.baryluk+gcc at gmail dot com @ 2022-12-30 22:30 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108255
Bug ID: 108255
Summary: Repeated address-of (lea) not optimized for size.
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: witold.baryluk+gcc at gmail dot com
Target Milestone: ---
https://godbolt.org/z/q5sx9e49j
void f(int *);
int g(int of) {
    int x = 13;
    f(&x);
    f(&x);
    f(&x);
    f(&x);
    f(&x);
    f(&x);
    f(&x);
    f(&x);
    return 0;
}
Got:
g(int):
        sub     rsp, 24
        lea     rdi, [rsp+12]
        mov     DWORD PTR [rsp+12], 13
        call    f(int*)
        lea     rdi, [rsp+12]               # compute, 5 bytes
        call    f(int*)
        lea     rdi, [rsp+12]               # recompute, 5 bytes
        call    f(int*)
        lea     rdi, [rsp+12]               # recompute, 5 bytes
        call    f(int*)
        lea     rdi, [rsp+12]
        call    f(int*)
        lea     rdi, [rsp+12]
        call    f(int*)
        lea     rdi, [rsp+12]
        call    f(int*)
        lea     rdi, [rsp+12]
        call    f(int*)
        xor     eax, eax
        add     rsp, 24
        ret
Note, however, that each lea is 5 bytes.
Expected (generated by clang 3.0 - 15.0):
g(int):                                     # @g(int)
        push    rbx                         # extra, but just 1 byte
        sub     rsp, 16
        mov     dword ptr [rsp + 12], 13    # CSE temp
        lea     rbx, [rsp + 12]
        mov     rdi, rbx                    # use
        call    f(int*)@PLT
        mov     rdi, rbx                    # reuse, 3 bytes
        call    f(int*)@PLT
        mov     rdi, rbx                    # reuse, 3 bytes
        call    f(int*)@PLT
        mov     rdi, rbx
        call    f(int*)@PLT
        mov     rdi, rbx
        call    f(int*)@PLT
        mov     rdi, rbx
        call    f(int*)@PLT
        mov     rdi, rbx
        call    f(int*)@PLT
        mov     rdi, rbx
        call    f(int*)@PLT
        xor     eax, eax
        add     rsp, 16
        pop     rbx                         # extra, but just 1 byte
        ret
Technically this is more instructions, but `mov rdi, rbx` is 3 bytes, which is
shorter than the 5 bytes of lea. This comes at the minor expense of needing to
save and restore rbx.
PS. The same happens when using a temporary `int *const y = &x;`.
The same also happens when optimizing for size (`-Os`).
It looks like gcc 4.8.5 produced the expected code, but gcc 4.9.0 does not.
It is possible that the code produced by gcc 4.9.0 is faster, but it is also
likely that it contributes quite a bit to binary size.
clang uses CSE even if there are just two uses of `&x` in the above
example. It is likely that a slightly higher threshold (3 or 4) is actually
optimal (this can be calculated knowing the encoding sizes).
Oddly, though, gcc -m32 does this:
g():
        push    ebp
        mov     ebp, esp
        push    ebx
        lea     ebx, [ebp-12]
        sub     esp, 32
        mov     DWORD PTR [ebp-12], 13
        push    ebx
        call    f(int*)
        mov     DWORD PTR [esp], ebx
        call    f(int*)
        mov     DWORD PTR [esp], ebx
        call    f(int*)
        mov     ebx, DWORD PTR [ebp-4]
        xor     eax, eax
        leave
        ret
Here it does compute the address once and stores it in a temporary, but it
does so on the stack instead of in a register (my guess is that there is no
free register to hold it, so it is spilled). In fact, lea would likely be
faster here (mov DWORD PTR [esp], ebx requires a memory/cache access, whereas
lea is 5 bytes but does not require a memory access).
* [Bug target/108255] Repeated address-of (lea) not optimized for size.
2022-12-30 22:30 [Bug c/108255] New: Repeated address-of (lea) not optimized for size witold.baryluk+gcc at gmail dot com
@ 2022-12-30 23:10 ` pinskia at gcc dot gnu.org
From: pinskia at gcc dot gnu.org @ 2022-12-30 23:10 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108255
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I suspect r0-127773-g3e7291458b96 changed the behavior for GCC 4.9+
I have not figured out what changed the behavior for GCC 4.8 yet though.
I suspect it was simply a mistake, and the GCC 4.8 cost model was incorrect.
LLVM might not be tuning correctly anyway ...
Also note that ICC (not ICX) does the same as GCC ...
So I think this is an LLVM issue rather than a GCC issue.
Someone who knows more about x86 processor behavior can explain more.