public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/44194] New: struct returned by value generates useless stores
@ 2010-05-19 5:06 jhaberman at gmail dot com
2010-05-19 9:39 ` [Bug rtl-optimization/44194] " rguenth at gcc dot gnu dot org
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: jhaberman at gmail dot com @ 2010-05-19 5:06 UTC (permalink / raw)
To: gcc-bugs
Test case:
--
#include <stdint.h>
struct twoints { uint64_t a, b; } foo();
void bar(uint64_t a, uint64_t b);
void func() {
    struct twoints s = foo();
    bar(s.a, s.b);
}
--
$ gcc -save-temps -Wall -c -o testbad.o -msse2 -O3 -fomit-frame-pointer testbad.c
$ objdump -d -r -M intel testbad.o
testbad.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <func>:
0: 48 83 ec 28 sub rsp,0x28
4: 31 c0 xor eax,eax
6: e8 00 00 00 00 call b <func+0xb>
7: R_X86_64_PC32 foo-0x4
b: 48 89 04 24 mov QWORD PTR [rsp],rax
f: 48 89 54 24 08 mov QWORD PTR [rsp+0x8],rdx
14: 48 89 d6 mov rsi,rdx
17: 48 89 44 24 10 mov QWORD PTR [rsp+0x10],rax
1c: 48 89 54 24 18 mov QWORD PTR [rsp+0x18],rdx
21: 48 89 c7 mov rdi,rax
24: 48 83 c4 28 add rsp,0x28
28: e9 00 00 00 00 jmp 2d <func+0x2d>
29: R_X86_64_PC32 bar-0x4
--
As you can see above, rax and rdx are stored to the stack twice, but these
stores are unnecessary.
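A minimal sketch for running the test case end to end. The bodies of foo() and bar() and the seen_a/seen_b globals are stand-ins that are not part of the report. In the report both functions are external, and with the definitions visible in the same translation unit the compiler may inline them and hide the stores, so keep them in a separate file when reproducing the disassembly. Under the x86-64 System V calling convention a 16-byte struct of two integers like this one comes back in rax:rdx, and bar() takes its arguments in rdi and rsi, so ideally func() needs only two register moves.

```c
#include <stdint.h>

struct twoints { uint64_t a, b; };

/* Hypothetical stand-ins: record what bar() receives so the call can be checked. */
static uint64_t seen_a, seen_b;

/* Stand-in for the external foo() of the report; the values are arbitrary. */
struct twoints foo(void)
{
    struct twoints r = { 1, 2 };
    return r;
}

/* Stand-in for the external bar() of the report. */
void bar(uint64_t a, uint64_t b)
{
    seen_a = a;
    seen_b = b;
}

/* Same shape as the reported function: the struct is only used field by field. */
void func(void)
{
    struct twoints s = foo();
    bar(s.a, s.b);
}
```

Calling func() should leave seen_a and seen_b holding the two values returned by foo().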
$ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.4.3-4ubuntu5'
--with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared
--enable-multiarch --enable-linker-build-id --with-system-zlib
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls
--enable-clocale=gnu --enable-libstdcxx-debug --enable-plugin --enable-objc-gc
--disable-werror --with-arch-32=i486 --with-tune=generic
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
--
Summary: struct returned by value generates useless stores
Product: gcc
Version: 4.4.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jhaberman at gmail dot com
GCC build triplet: x86_64-linux-gnu
GCC host triplet: x86_64-linux-gnu
GCC target triplet: x86_64-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44194
* [Bug rtl-optimization/44194] struct returned by value generates useless stores
From: rguenth at gcc dot gnu dot org @ 2010-05-19 9:39 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from rguenth at gcc dot gnu dot org 2010-05-19 09:38 -------
Confirmed.
We already expand it that way:
;; s = foo ();
(insn 5 4 6 t.c:7 (set (reg:QI 0 ax)
(const_int 0 [0x0])) -1 (nil))
(call_insn 6 5 7 t.c:7 (set (parallel:BLK [
(expr_list:REG_DEP_TRUE (reg:DI 0 ax)
(const_int 0 [0x0]))
(expr_list:REG_DEP_TRUE (reg:DI 1 dx)
(const_int 8 [0x8]))
])
(call (mem:QI (symbol_ref:DI ("foo") [flags 0x41] <function_decl
0x7ffff5aed500 foo>) [0 S1 A8])
(const_int 0 [0x0]))) -1 (nil)
(expr_list:REG_DEP_TRUE (use (reg:QI 0 ax))
(nil)))
(insn 7 6 8 t.c:7 (set (reg:DI 60)
(reg:DI 0 ax)) -1 (nil))
(insn 8 7 9 t.c:7 (set (reg:DI 61)
(reg:DI 1 dx)) -1 (nil))
(insn 9 8 10 t.c:7 (set (mem/s/c:DI (plus:DI (reg/f:DI 54 virtual-stack-vars)
(const_int -32 [0xffffffffffffffe0])) [2 S8 A128])
(reg:DI 60)) -1 (nil))
(insn 10 9 11 t.c:7 (set (mem/s/c:DI (plus:DI (reg/f:DI 54 virtual-stack-vars)
(const_int -24 [0xffffffffffffffe8])) [2 S8 A64])
(reg:DI 61)) -1 (nil))
(insn 11 10 12 t.c:7 (set (reg:DI 62)
(mem/s/c:DI (plus:DI (reg/f:DI 54 virtual-stack-vars)
(const_int -32 [0xffffffffffffffe0])) [2 S8 A128])) -1 (nil))
(insn 12 11 13 t.c:7 (set (mem/s/c:DI (plus:DI (reg/f:DI 54 virtual-stack-vars)
(const_int -16 [0xfffffffffffffff0])) [2 s+0 S8 A128])
(reg:DI 62)) -1 (nil))
(insn 13 12 14 t.c:7 (set (reg:DI 63)
(mem/s/c:DI (plus:DI (reg/f:DI 54 virtual-stack-vars)
(const_int -24 [0xffffffffffffffe8])) [2 S8 A64])) -1 (nil))
(insn 14 13 0 t.c:7 (set (mem/s/c:DI (plus:DI (reg/f:DI 54 virtual-stack-vars)
(const_int -8 [0xfffffffffffffff8])) [2 s+8 S8 A64])
(reg:DI 63)) -1 (nil))
So we create a stack representation only to copy it to the stack temporary
(both of which are unneeded). We should see that we can avoid the
temporary altogether, as there is no aggregate use of the struct left.
At the very least we should avoid the second temporary.
I'm very sure there is a duplicate for this bug somewhere.
Also I wonder why RTL DSE cannot remove all the stores to the stack.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |matz at gcc dot gnu dot org
Severity|normal |enhancement
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Keywords| |missed-optimization
Last reconfirmed|0000-00-00 00:00:00 |2010-05-19 09:38:49
date| |
* [Bug rtl-optimization/44194] struct returned by value generates useless stores
From: jakub at gcc dot gnu dot org @ 2010-05-19 10:14 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from jakub at gcc dot gnu dot org 2010-05-19 10:13 -------
RTL DSE doesn't handle this because the call to bar, which isn't a const
function, is considered a wild read and thus makes all stores necessary in the
global as well as local algorithm.
RTL DSE doesn't consider whether a frame-based slot could possibly have had its
address taken, and thus whether a call might read it or not.
For tail calls before reload, perhaps we could handle tail calls similarly to
const calls (be a read of all argument stores) with the addition that it would
act as a read for all constant address stores (basically wild read for anything
but frame based stores for the global algorithm, given that the stack is
unwound before the tail call).
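A toy model of the point above, as a sketch only: it is not GCC's DSE, and every name in it (op_kind, struct op, find_dead_stores, the tail_call_ignores_frame switch) is hypothetical. It scans a straight-line block backward, treats frame slots as dead at function exit, and shows how a call modeled as a wild read keeps every earlier store alive, while a tail call that is assumed not to read frame slots lets them all be deleted. Argument stores and constant-address stores, which the proposal still treats as read, are not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

enum op_kind { OP_STORE, OP_LOAD, OP_CALL, OP_TAIL_CALL };

struct op {
    enum op_kind kind;
    int slot;   /* frame slot index for OP_STORE/OP_LOAD, ignored for calls */
};

#define MAX_SLOTS 8   /* assumption: slot indices are in [0, MAX_SLOTS) */

/* Backward scan over a block that ends the function.
   live[s] is true when some later instruction may read frame slot s.
   dead[i] is set for every store whose value is never read; the return
   value is the number of such stores. */
size_t find_dead_stores(const struct op *ops, size_t n,
                        bool tail_call_ignores_frame, bool *dead)
{
    bool live[MAX_SLOTS] = { false };   /* nothing reads the frame after exit */
    size_t count = 0;

    for (size_t i = n; i-- > 0; ) {
        dead[i] = false;
        switch (ops[i].kind) {
        case OP_LOAD:
            live[ops[i].slot] = true;
            break;
        case OP_STORE:
            if (!live[ops[i].slot]) {
                dead[i] = true;
                count++;
            }
            live[ops[i].slot] = false;  /* earlier stores to this slot are overwritten */
            break;
        case OP_CALL:
        case OP_TAIL_CALL:
            if (ops[i].kind == OP_TAIL_CALL && tail_call_ignores_frame)
                break;                  /* proposed: frame is gone before the jump */
            for (int s = 0; s < MAX_SLOTS; s++)
                live[s] = true;         /* wild read: assume every slot is read */
            break;
        }
    }
    return count;
}
```

Feeding it four stores followed by a tail call, the shape of the reported func(), gives no dead stores when the call is a wild read and four dead stores when the tail call is assumed not to read the frame.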
--
* [Bug rtl-optimization/44194] struct returned by value generates useless stores
From: rguenth at gcc dot gnu dot org @ 2010-05-19 10:22 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from rguenth at gcc dot gnu dot org 2010-05-19 10:22 -------
At least the stores to s have alias info:
(insn 12 10 14 2 t.c:7 (set (mem/s/c:DI (plus:DI (reg/f:DI 20 frame)
(const_int -16 [0xfffffffffffffff0])) [2 s+0 S8 A128])
(reg:DI 60)) 89 {*movdi_1_rex64} (nil))
(insn 14 12 17 2 t.c:7 (set (mem/s/c:DI (plus:DI (reg/f:DI 20 frame)
(const_int -8 [0xfffffffffffffff8])) [2 s+8 S8 A64])
(reg:DI 61)) 89 {*movdi_1_rex64} (nil))
so RTL DSE could check whether the stack slot is aliased at all.
The other memory temporary should be avoided at expansion time already.
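To illustrate why it matters whether the stack slot is aliased, here is a hedged sketch contrasting the reported function with a variant whose struct escapes. The names func_private, func_escaped and sink, the function bodies, and the seen_a/seen_b globals are all made up for this example. In func_escaped the address of s is handed to sink(), so the stores to s and the reloads after the call are genuinely required; in func_private nothing can observe the slot.

```c
#include <stdint.h>

struct twoints { uint64_t a, b; };

/* Hypothetical stand-ins so the example links and can be checked. */
static uint64_t seen_a, seen_b;

struct twoints foo(void)
{
    struct twoints r = { 1, 2 };
    return r;
}

void bar(uint64_t a, uint64_t b)
{
    seen_a = a;
    seen_b = b;
}

/* Reads and modifies the caller's struct through the pointer. */
void sink(struct twoints *p)
{
    p->a += 10;
}

/* Address of s is never taken: no call can read or write its stack slot. */
void func_private(void)
{
    struct twoints s = foo();
    bar(s.a, s.b);
}

/* Address of s escapes: s must be in memory across the call to sink(). */
void func_escaped(void)
{
    struct twoints s = foo();
    sink(&s);
    bar(s.a, s.b);
}
```

With these stand-ins, func_private() passes (1, 2) to bar() and func_escaped() passes (11, 2), which is only possible because the latter keeps s in memory.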
--
* [Bug rtl-optimization/44194] struct returned by value generates useless stores
From: jhaberman at gmail dot com @ 2010-07-10 1:38 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from jhaberman at gmail dot com 2010-07-10 01:38 -------
This seems to happen even with POD return types:
int foo();
void bar(int a);
void func() {
    bar(foo());
}
In 32-bit mode it spills the return value to the stack for no reason. It also
seems to overallocate the stack (28 bytes allocated, only 4 used):
00000000 <func>:
0: 83 ec 1c sub esp,0x1c
3: e8 fc ff ff ff call 4 <func+0x4>
4: R_386_PC32 foo
8: 89 04 24 mov DWORD PTR [esp],eax
b: e8 fc ff ff ff call c <func+0xc>
c: R_386_PC32 bar
10: 83 c4 1c add esp,0x1c
13: c3 ret
In 64-bit mode there is no store, but it *does* allocate 8 bytes of stack that
it never uses:
0000000000000000 <func>:
0: 48 83 ec 08 sub rsp,0x8
4: 31 c0 xor eax,eax
6: e8 00 00 00 00 call b <func+0xb>
7: R_X86_64_PC32 foo+0xfffffffffffffffc
b: 48 83 c4 08 add rsp,0x8
f: 89 c7 mov edi,eax
11: e9 00 00 00 00 jmp 16 <func+0x16>
12: R_X86_64_PC32 bar+0xfffffffffffffffc
Any idea how hard this bug is to fix?
--
* [Bug rtl-optimization/44194] struct returned by value generates useless stores
From: pinskia at gcc dot gnu dot org @ 2010-07-10 1:40 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from pinskia at gcc dot gnu dot org 2010-07-10 01:40 -------
>In 32-bit mode it spills the return value to the stack for no reason.
Huh? arguments are passed via the stack in 32bit mode.
--
* [Bug rtl-optimization/44194] struct returned by value generates useless stores
From: pinskia at gcc dot gnu dot org @ 2010-07-10 1:42 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from pinskia at gcc dot gnu dot org 2010-07-10 01:42 -------
>In 64-bit mode there is no store, but it *does* allocate 8 bytes of stack that
>it never uses:
Oh no that is called aligning the stack to be 16 byte aligned.
>It also
>seems to overallocate the stack (28 bytes allocated, only 4 used):
No, it is not really overallocating the stack; it does align it to be 16-byte
aligned, though.
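The arithmetic behind both answers can be checked with a small sketch. It assumes that the stack was 16-byte aligned just before the call into func() pushed the return address, and that it must be 16-byte aligned again at each call func() makes. The helper name call_site_aligned is made up for this example. It only verifies that a given adjustment restores alignment; it does not claim the adjustment is the smallest possible.

```c
#include <stdbool.h>
#include <stddef.h>

/* True when the return address pushed by the caller plus the frame
   adjustment made in the prologue add up to a multiple of the required
   alignment, i.e. the stack pointer is aligned again at the next call. */
bool call_site_aligned(size_t frame_adjust, size_t return_addr_size,
                       size_t alignment)
{
    return (frame_adjust + return_addr_size) % alignment == 0;
}
```

For the 32-bit listing, 0x1c bytes plus a 4-byte return address is 32; for the 64-bit listing, 0x8 bytes plus an 8-byte return address is 16. Both are multiples of 16, whereas reserving only the 4 bytes needed for the outgoing argument in 32-bit mode would leave the stack at 8 and misaligned.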
--
* [Bug rtl-optimization/44194] struct returned by value generates useless stores
From: jhaberman at gmail dot com @ 2010-07-10 1:48 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from jhaberman at gmail dot com 2010-07-10 01:48 -------
I must have been on crack when I wrote that last comment. Sorry for the noise.
Though I do wonder how difficult the original bug is to fix. This seems to
make it more expensive to return structures by value.
--