public inbox for gcc-bugs@sourceware.org
* [Bug target/115204] New: unnecessary stack usage and copies (of temporaries)
@ 2024-05-23 12:19 mkretz at gcc dot gnu.org
2024-05-23 12:21 ` [Bug target/115204] " mkretz at gcc dot gnu.org
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: mkretz at gcc dot gnu.org @ 2024-05-23 12:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115204
Bug ID: 115204
Summary: unnecessary stack usage and copies (of temporaries)
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: mkretz at gcc dot gnu.org
Target Milestone: ---
Target: x86_64-*-*, i?86-*-*
Test case (https://compiler-explorer.com/z/P7s75EhMr):
struct A {
  int data[8];
};
struct A gen();
void g(struct A);
void f()
{
  g(gen());
}
GCC places the A object returned from 'gen()' on the stack, copies it to a
second stack location, and then calls 'g'. Why? So instead of
f:
        sub rsp, 40
        xor eax, eax
        mov rdi, rsp
        call gen
        sub rsp, 32
        movdqa xmm0, XMMWORD PTR [rsp+32]
        movups XMMWORD PTR [rsp], xmm0
        movdqa xmm0, XMMWORD PTR [rsp+48]
        movups XMMWORD PTR [rsp+16], xmm0
        call g
        add rsp, 72
        ret
can GCC just elide the copy? Like this:
f:
        sub rsp, 40
        xor eax, eax
        mov rdi, rsp
        call gen
        call g
        add rsp, 40
        ret
I understand that this optimization requires that the caller never reads from
the object again. So a second call to 'g' with the same object returned from
'gen' (as in https://compiler-explorer.com/z/6rMYdnb34) requires that the
first call to 'g' gets a copy. But the second call does not require a copy.
I.e.
int f()
{
  struct A a = gen();
  g(a);
  g(a);
  return 1;
}
compiles to
f:
        sub rsp, 40
        xor eax, eax
        mov rdi, rsp
        call gen
        sub rsp, 32
        movdqa xmm0, XMMWORD PTR [rsp+32]
        movups XMMWORD PTR [rsp], xmm0
        movdqa xmm0, XMMWORD PTR [rsp+48]
        movups XMMWORD PTR [rsp+16], xmm0
        call g
        movdqa xmm0, XMMWORD PTR [rsp+32]
        movups XMMWORD PTR [rsp], xmm0
        movdqa xmm0, XMMWORD PTR [rsp+48]
        movups XMMWORD PTR [rsp+16], xmm0
        call g
        mov eax, 1
        add rsp, 72
        ret
but could be
f:
        sub rsp, 40
        xor eax, eax
        mov rdi, rsp
        call gen
        sub rsp, 32
        movdqa xmm0, XMMWORD PTR [rsp+32]
        movups XMMWORD PTR [rsp], xmm0
        movdqa xmm0, XMMWORD PTR [rsp+48]
        movups XMMWORD PTR [rsp+16], xmm0
        call g
        add rsp, 32
        call g
        mov eax, 1
        add rsp, 40
        ret
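The reason the first call needs its own copy can be sketched at the source
level. The following is only an illustration, not part of the original test
case: the bodies of 'gen' and 'g' are made up (the report only declares them),
and 'call_twice' is a hypothetical name. Because 'g' takes its argument by
value, it may overwrite the bytes it was handed; the caller must still see the
original 'a' when it makes the second call, so the first call cannot simply be
given the caller's own storage.

```c
#include <assert.h>

struct A {
  int data[8];
};

/* Hypothetical definition: the report only declares gen(). */
struct A gen(void)
{
  struct A a = {{1, 2, 3, 4, 5, 6, 7, 8}};
  return a;
}

/* Hypothetical definition: sums its argument, then clobbers it.
   The callee owns its by-value parameter, so this is legal. */
int g(struct A a)
{
  int sum = 0;
  for (int i = 0; i < 8; ++i) {
    sum += a.data[i];
    a.data[i] = 0; /* modifies only the callee's copy */
  }
  return sum;
}

/* Mirrors the f() with two calls: both calls must observe the same
   values, which is why the first call has to receive a copy. */
int call_twice(void)
{
  struct A a = gen();
  int first = g(a);
  int second = g(a); /* last use of a: a copy is not strictly needed */
  assert(first == second);
  return first + second;
}
```

After the second call nothing reads 'a' again, which is the situation where
the copy could in principle be dropped.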
IIUC, the second change would be significantly harder to implement because it
needs to shrink the stack. However, I don't believe this second case is as
important. The first one should be sufficiently common because temporaries are
often passed directly as function arguments. So the following variation
void f()
{
  g(gen(), gen());
}
is something I see often, leading to many unnecessary stack copies. Instead of
f:
        sub rsp, 72
        xor eax, eax
        mov rdi, rsp
        call gen
        lea rdi, [rsp+32]
        xor eax, eax
        call gen
        sub rsp, 64
        movdqa xmm0, XMMWORD PTR [rsp+64]
        movups XMMWORD PTR [rsp+32], xmm0
        movdqa xmm0, XMMWORD PTR [rsp+80]
        movups XMMWORD PTR [rsp+48], xmm0
        movdqa xmm0, XMMWORD PTR [rsp+96]
        movups XMMWORD PTR [rsp], xmm0
        movdqa xmm0, XMMWORD PTR [rsp+112]
        movups XMMWORD PTR [rsp+16], xmm0
        call g
        add rsp, 136
        ret
I think it should be:
f:
        sub rsp, 72
        xor eax, eax
        mov rdi, rsp
        call gen
        lea rdi, [rsp+32]
        xor eax, eax
        call gen
        call g
        add rsp, 72
        ret
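As a side note on where the numbers in the listings come from: with a 4-byte
'int' (an assumption that holds on the targets named in this report),
'struct A' is 32 bytes. As far as the x86-64 System V psABI is concerned, an
aggregate of that size is too large for registers, so it is returned through
a hidden pointer (the 'mov rdi, rsp' before each 'call gen') and passed to
'g' in memory on the stack. The small sketch below just recomputes the sizes
seen above; 'arg_area_bytes' and 'vector_moves' are hypothetical helper names
introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct A {
  int data[8];
};

/* Bytes of outgoing stack argument space needed to pass n objects of
   type struct A by value (hypothetical helper, ignores padding that a
   real ABI may add for alignment). */
size_t arg_area_bytes(size_t n)
{
  return n * sizeof(struct A);
}

/* Number of 16-byte load/store pairs needed to copy one struct A,
   matching the movdqa/movups pairs in the listings. */
size_t vector_moves(void)
{
  return sizeof(struct A) / 16;
}
```

Under these assumptions one temporary needs 32 bytes ('sub rsp, 32') and two
pairs of moves, and two temporaries need 64 bytes ('sub rsp, 64') and four
pairs, which is exactly the extra stack space and copying that the proposed
code avoids.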
IIUC, this depends on the psABI and I don't know how target-dependent such an
optimization is. That's why I
* [Bug target/115204] unnecessary stack usage and copies (of temporaries)
From: mkretz at gcc dot gnu.org @ 2024-05-23 12:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115204
--- Comment #1 from Matthias Kretz (Vir) <mkretz at gcc dot gnu.org> ---
That's why I tagged it as 'target'. I'd be happy to learn that it can be
resolved target-independently.
* [Bug target/115204] unnecessary stack usage and copies (of temporaries)
From: pinskia at gcc dot gnu.org @ 2024-05-23 12:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115204
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I am 99% sure there is a dup of this bug already.
* [Bug target/115204] unnecessary stack usage and copies (of temporaries)
From: pinskia at gcc dot gnu.org @ 2024-05-23 12:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115204
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Dup.
*** This bug has been marked as a duplicate of bug 28831 ***