public inbox for gcc-bugs@sourceware.org
* [Bug target/47949] New: Missed optimization for -Os using xchg instead of mov.
@ 2011-03-02 5:30 svfuerst at gmail dot com
2011-03-02 10:23 ` [Bug target/47949] " rguenth at gcc dot gnu.org
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: svfuerst at gmail dot com @ 2011-03-02 5:30 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47949
Summary: Missed optimization for -Os using xchg instead of mov.
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: svfuerst@gmail.com
Target: x86 / amd64
xchg %eax, reg is a one-byte instruction. If reg is dead, this instruction
could replace the two-byte mov reg, %eax for a one-byte saving.
For example:
    int foo(int x)
    {
        return x;
    }
currently compiles to
    mov %edi,%eax
    retq
with -Os, whereas the following may be better:
    xchg %eax, %edi
    retq
(Similar cases exist with mov reg, %rax; mov reg, %ax; and mov reg, %al)
Note that xchg is slower than mov, so this is only an optimization when size is
more important than speed.
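To make the size comparison concrete, the following is a minimal C sketch that builds the two encodings byte by byte. It is an illustration only, not compiler code: the function names `encode_mov_to_eax` and `encode_xchg_eax` and the `reg32` enum are hypothetical, and the sketch covers only the 32-bit forms with the low eight registers (no REX prefix, no 16-bit or 64-bit variants).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register numbers as they appear in x86 instruction encodings
   (low eight registers only; r8d-r15d would need a REX prefix). */
enum reg32 { EAX = 0, ECX = 1, EDX = 2, EBX = 3,
             ESP = 4, EBP = 5, ESI = 6, EDI = 7 };

/* Encode `mov %src, %eax` as opcode 0x89 followed by a register-direct
   ModRM byte (mod = 11, reg = src, rm = eax). Returns the byte count. */
size_t encode_mov_to_eax(enum reg32 src, uint8_t out[2])
{
    out[0] = 0x89;
    out[1] = (uint8_t)(0xC0 | ((unsigned)src << 3) | EAX);
    return 2;
}

/* Encode `xchg %eax, %reg` using the short form 0x90 + reg.
   Returns the byte count. */
size_t encode_xchg_eax(enum reg32 reg, uint8_t out[1])
{
    /* 0x90 itself is NOP, so exchanging %eax with itself is excluded. */
    assert(reg != EAX);
    out[0] = (uint8_t)(0x90 + (unsigned)reg);
    return 1;
}
```

For `%edi` this yields the bytes `89 f8` for the mov and the single byte `97` for the xchg, which is the one-byte difference described above.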
* [Bug target/47949] Missed optimization for -Os using xchg instead of mov.
From: rguenth at gcc dot gnu.org @ 2011-03-02 10:23 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47949
Richard Guenther <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|x86 / amd64                 |x86_64-*-*, i?86-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2011.03.02 10:23:29
     Ever Confirmed|0                           |1
--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-03-02 10:23:29 UTC ---
Interesting idea.
* [Bug target/47949] Missed optimization for -Os using xchg instead of mov.
From: jakub at gcc dot gnu.org @ 2011-03-02 10:53 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47949
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-03-02 10:53:28 UTC ---
I'm afraid this will upset Linux kernel people and others who use -Os for
performance reasons, to decrease the cache footprint of their code. If their
code slows down too much, they won't be happy.
* [Bug target/47949] Missed optimization for -Os using xchg instead of mov.
From: svfuerst at gmail dot com @ 2011-03-02 21:51 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47949
--- Comment #3 from Steven Fuerst <svfuerst at gmail dot com> 2011-03-02 21:51:12 UTC ---
Having a quick look at generated code... it appears that this pattern doesn't
come up all that often. However, there is one case where it does: the epilogue
of a function. i.e. gcc tends to generate code looking like:
    movl %ebp, %eax
    movq 8(%rsp), %rbx
    movq 16(%rsp), %rbp
    movq 24(%rsp), %r12
    movq 32(%rsp), %r13
    addq $40, %rsp
    ret
Replacing the move to %eax with an exchange with %ebp is a win in this
particular case. The extra cycle or two of latency that xchg takes doesn't
matter as the other moves and ret instruction overlap in execution with it.
Benchmarking on an Opteron in 64-bit mode confirms this hypothesis even in the
degenerate case where no other moves exist:
foo1:
    mov %edi, %eax
    retq
foo2:
    xchg %eax, %edi
    retq
foo1 and foo2 take the same time to execute.
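A comparison along these lines could be set up roughly as follows. This is a sketch under stated assumptions, not the benchmark actually used in this comment: it relies on GCC-style extended inline asm on x86 (with a plain C fallback elsewhere), and the helper name `seconds_for` is hypothetical. Because each call goes through a function pointer, call overhead will dominate the measurement, so treat any timing it produces as a rough indication only.

```c
#include <assert.h>
#include <time.h>

/* Baseline: move the argument into %eax with mov. */
int foo1(int x)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int result;
    __asm__ volatile ("movl %1, %0" : "=a"(result) : "r"(x));
    return result;
#else
    return x; /* fallback on non-x86 targets */
#endif
}

/* Variant: exchange %eax with the register holding x. The old contents of
   %eax end up in x's register and are discarded, since x is dead here. */
int foo2(int x)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int result;
    __asm__ volatile ("xchgl %1, %0" : "=a"(result), "+r"(x));
    return result;
#else
    return x; /* fallback on non-x86 targets */
#endif
}

/* Call fn in a dependent chain (each call consumes the previous result)
   and return the elapsed CPU time in seconds. */
double seconds_for(int (*fn)(int), long iterations, int *final_value)
{
    assert(fn && final_value);
    int v = 1;
    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        v = fn(v);
    clock_t stop = clock();
    *final_value = v;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling `seconds_for(foo1, n, &v)` and `seconds_for(foo2, n, &v)` with a large `n` gives two figures to compare; both functions return their argument unchanged, so `v` should come back as 1 in either case.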
* [Bug target/47949] Missed optimization for -Os using xchg instead of mov.
From: pinskia at gcc dot gnu.org @ 2021-06-08 9:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47949
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This might have been fixed already via the commit referenced in PR92549.
* [Bug target/47949] Missed optimization for -Os using xchg instead of mov.
From: cvs-commit at gcc dot gnu.org @ 2022-08-03 8:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47949
--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:
https://gcc.gnu.org/g:fc6ef90173478521982e9df3831a06ea85b4f41e
commit r13-1945-gfc6ef90173478521982e9df3831a06ea85b4f41e
Author: Roger Sayle <roger@nextmovesoftware.com>
Date: Wed Aug 3 09:07:36 2022 +0100
PR target/47949: Use xchg to move from/to AX_REG with -Oz on x86.
This patch adds a peephole2 to i386.md to implement the suggestion in
PR target/47949, of using xchg instead of mov for moving values to/from
the %rax/%eax register, controlled by -Oz, as the xchg instruction is
one byte shorter than the move it is replacing.
The new test case is taken from the PR:
int foo(int x) { return x; }
where previously we'd generate:
foo: mov %edi,%eax // 2 bytes
ret
but with this patch, using -Oz, we generate:
foo: xchg %eax,%edi // 1 byte
ret
On the CSiBE benchmark, this saves a total of 10238 bytes (reducing
the -Oz total from 3661796 bytes to 3651558 bytes, a 0.28% saving).
Interestingly, some modern architectures (such as Zen 3) implement
xchg using zero latency register renaming (just like mov), so in theory
this transformation could be enabled when optimizing for speed, if
benchmarking shows the improved code density produces consistently
better performance. However, this is architecture dependent, and
there may be interactions using xchg (instead of a single_set) in the
late RTL passes (such as cprop_hardreg), so for now I've restricted
this to -Oz.
2022-08-03 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR target/47949
* config/i386/i386.md (peephole2): New peephole2 to convert
SWI48 moves to/from %rax/%eax where the src is dead to xchg,
when optimizing for minimal size with -Oz.
gcc/testsuite/ChangeLog
PR target/47949
* gcc.target/i386/pr47949.c: New test case.
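The conditions stated in the commit message (a move to or from the AX register, a source that is dead afterwards, and -Oz only) can be modelled as a small decision function. The sketch below is a simplified illustration in C and is not the peephole2 pattern from i386.md; the names `opt_mode`, `insn_choice`, and `choose_reg_move` are hypothetical, and registers are represented as plain integers with 0 standing for the AX register.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical optimization modes for this model. */
enum opt_mode { OPT_SPEED, OPT_SIZE_OS, OPT_SIZE_OZ };

/* Which instruction the model would pick for a register-to-register move. */
enum insn_choice { USE_MOV, USE_XCHG };

/* Register number standing for %rax/%eax in this model. */
#define AX_REG 0

/* Decide between mov and xchg for `dst = src`. xchg clobbers src with the
   old value of dst, so it is only safe when src is dead after the move. */
enum insn_choice choose_reg_move(int dst, int src, bool src_dead,
                                 enum opt_mode mode)
{
    assert(dst >= 0 && dst < 16 && src >= 0 && src < 16);

    if (mode != OPT_SIZE_OZ)
        return USE_MOV;   /* restricted to -Oz, per the commit message */
    if (dst == src)
        return USE_MOV;   /* nothing to exchange */
    if (dst != AX_REG && src != AX_REG)
        return USE_MOV;   /* short xchg form needs the AX register */
    if (!src_dead)
        return USE_MOV;   /* xchg would destroy a live value */
    return USE_XCHG;
}
```

For the test case in the PR, `choose_reg_move(AX_REG, 7, true, OPT_SIZE_OZ)` corresponds to moving the dead %edi into %eax and selects the xchg form, while the same call with `OPT_SIZE_OS` keeps the mov.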
* [Bug target/47949] Missed optimization for -Os using xchg instead of mov.
From: roger at nextmovesoftware dot com @ 2022-08-04 18:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47949
Roger Sayle <roger at nextmovesoftware dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |roger at nextmovesoftware dot com
             Status|NEW                         |RESOLVED
   Target Milestone|---                         |13.0
         Resolution|---                         |FIXED
--- Comment #6 from Roger Sayle <roger at nextmovesoftware dot com> ---
This suggestion has now been implemented on mainline (when using -Oz).