public inbox for gcc-bugs@sourceware.org
* [Bug c/65413] New: inefficient code returning aggregates on powepc64le
From: msebor at gcc dot gnu.org @ 2015-03-12 22:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65413
Bug ID: 65413
Summary: inefficient code returning aggregates on powepc64le
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: msebor at gcc dot gnu.org
When returning aggregates that don't fit into a single register on powerpc64le,
gcc emits the assembly below, which ultimately leaves the r3 and r4 registers
holding the same values they started out with. All the shift and rotate
instructions are unnecessary (in fact, foo could be as simple as a blr).
$ cat ~/tmp/x.c && gcc -O2 -S -Wall -o/dev/tty ~/tmp/x.c
typedef struct { int a[3]; } A;
A __attribute__ ((noinline)) foo (A a) {
  return a;
}
A bar (A a) {
  return foo (a);
}
.file "x.c"
.machine power8
.abiversion 2
.section ".toc","aw"
.section ".text"
.align 2
.p2align 4,,15
.globl foo
.type foo, @function
foo:
mr 9,3
li 3,0
rldicl 10,9,0,32
srdi 9,9,32
rldimi 3,10,0,32
rldicl 4,4,0,32
rldimi 3,9,32,0
blr
.long 0
.byte 0,0,0,0,0,0,0,0
.size foo,.-foo
.align 2
.p2align 4,,15
.globl bar
.type bar, @function
bar:
0: addis 2,12,.TOC.-0b@ha
addi 2,2,.TOC.-0b@l
.localentry bar,.-bar
mflr 0
std 0,16(1)
stdu 1,-64(1)
bl foo
addi 1,1,64
ld 0,16(1)
mr 9,3
li 3,0
rldicl 10,9,0,32
srdi 9,9,32
rldimi 3,10,0,32
rldicl 4,4,0,32
mtlr 0
rldimi 3,9,32,0
blr
.long 0
.byte 0,0,0,1,128,0,0,0
.size bar,.-bar
.ident "GCC: (GNU) 5.0.0 20150303 (experimental)"
.section .note.GNU-stack,"",@progbits
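The claim that foo leaves r3 and r4 effectively unchanged can be checked with a small C model of the instructions involved. This is only a sketch: the helper names (rotl64, mask64, foo_r3, foo_r4) are made up for illustration, and the rldicl/rldimi semantics are modeled from their usual Power ISA description (bit 0 is the most significant bit). It also assumes the upper 32 bits of r4 are padding, since the struct occupies only 12 bytes.

```c
#include <stdint.h>

/* Rotate left by n bits (0 <= n < 64). */
static uint64_t rotl64(uint64_t x, unsigned n) {
    return n == 0 ? x : (x << n) | (x >> (64 - n));
}

/* Mask with IBM-numbered bits mb..me set (bit 0 is the MSB); needs mb <= me. */
static uint64_t mask64(unsigned mb, unsigned me) {
    uint64_t from_mb = ~UINT64_C(0) >> mb;
    uint64_t up_to_me = ~UINT64_C(0) << (63 - me);
    return from_mb & up_to_me;
}

/* rldicl RA,RS,SH,MB: rotate left, then clear the MB most significant bits. */
static uint64_t rldicl(uint64_t rs, unsigned sh, unsigned mb) {
    return rotl64(rs, sh) & mask64(mb, 63);
}

/* rldimi RA,RS,SH,MB: rotate left, then insert bits mb..63-sh into RA. */
static uint64_t rldimi(uint64_t ra, uint64_t rs, unsigned sh, unsigned mb) {
    uint64_t m = mask64(mb, 63 - sh);
    return (rotl64(rs, sh) & m) | (ra & ~m);
}

/* Hypothetical model of what foo does to r3. */
static uint64_t foo_r3(uint64_t r3) {
    uint64_t r9 = r3;               /* mr 9,3           */
    uint64_t r10;
    r3 = 0;                         /* li 3,0           */
    r10 = rldicl(r9, 0, 32);        /* rldicl 10,9,0,32 */
    r9 >>= 32;                      /* srdi 9,9,32      */
    r3 = rldimi(r3, r10, 0, 32);    /* rldimi 3,10,0,32 */
    r3 = rldimi(r3, r9, 32, 0);     /* rldimi 3,9,32,0  */
    return r3;
}

/* Hypothetical model of what foo does to r4. */
static uint64_t foo_r4(uint64_t r4) {
    return rldicl(r4, 0, 32);       /* rldicl 4,4,0,32  */
}
```

Under this model, foo_r3 returns its argument unchanged for any input, and foo_r4 only clears the upper 32 bits, which hold no part of the 12-byte struct.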
* [Bug target/65413] inefficient code returning aggregates on powepc64le
From: msebor at gcc dot gnu.org @ 2015-03-12 23:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65413
--- Comment #1 from Martin Sebor <msebor at gcc dot gnu.org> ---
Actually, similarly inefficient code is generated even for aggregates that do
fit into a register. The trigger appears to be that the aggregate does not take
up a whole multiple of a register. For example, returning a "struct { char a[7];
}" results in the code below. Another example is "struct { short a[3]; }".
foo:
mr 10,3
rlwinm 8,3,0,0xff
li 9,0
rldicl 7,10,48,56
rldimi 9,8,0,56
rldicl 8,10,56,56
rldimi 9,8,8,48
rldicl 10,10,40,56
srdi 8,3,32
rldimi 9,7,16,40
rldimi 9,10,24,32
rlwinm 10,8,0,0xff
rldimi 9,10,32,24
rldicl 8,8,56,56
rldicl 3,3,16,56
rldimi 9,8,40,16
rldimi 9,3,48,8
mr 3,9
blr
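The three aggregates mentioned so far have one thing in common: none of them fills a whole number of 8-byte registers. A minimal sketch of that check follows. The names B, C, GPR_BYTES and leaves_partial_register are hypothetical, and the sizes in the comments assume the usual 1/2/4-byte char/short/int.

```c
#include <stddef.h>

typedef struct { int a[3]; } A;     /* from the original report: 12 bytes */
typedef struct { char a[7]; } B;    /* from this comment: 7 bytes */
typedef struct { short a[3]; } C;   /* from this comment: 6 bytes */

enum { GPR_BYTES = 8 };  /* assumption: 64-bit general-purpose registers */

/* Nonzero when an object of `size` bytes leaves part of its last
   register unused, which appears to be the trigger described above. */
static int leaves_partial_register(size_t size) {
    return size % GPR_BYTES != 0;
}
```

All three of sizeof (A), sizeof (B) and sizeof (C) satisfy this predicate, whereas a 16-byte aggregate such as "struct { int a[4]; }" would not.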
* [Bug target/65413] inefficient code returning aggregates on powerpc64le
From: segher at gcc dot gnu.org @ 2015-03-14 20:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65413
Segher Boessenkool <segher at gcc dot gnu.org> changed:
               What|Removed                      |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                  |NEW
   Last reconfirmed|                             |2015-03-14
                 CC|                             |segher at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org|segher at gcc dot gnu.org
   Target Milestone|---                          |6.0
            Summary|inefficient code returning   |inefficient code returning
                   |aggregates on powepc64le     |aggregates on powerpc64le
     Ever confirmed|0                            |1
--- Comment #2 from Segher Boessenkool <segher at gcc dot gnu.org> ---
This is first expanded to a copy of the arg to a struct on the stack; then
CSE etc. keep everything in regs, and DSE removes the stores completely.
We are left with copying 0 to a reg, and then copying parts of the arg
to a zero_extract of that, which combine will not handle (see
combinable_i3pat, the "inner_dest != dest" branch).
This will get better when we no longer write rl*imi as a zero_extract;
hopefully it will even be solved completely. We'll see. Mine.
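The data flow described here (zero a register, then insert pieces of the argument into it) can be modeled in plain C to show why the result is equivalent to a single move or mask. This is a rough sketch, not the actual RTL: insert_field, copy_by_fields and copy_direct are hypothetical names, and bit positions are counted from the least significant bit purely for convenience.

```c
#include <stdint.h>

/* Hypothetical stand-in for a store to a zero_extract: write the low `len`
   bits of `src` into `dest` at bit offset `pos` (counted from the LSB).
   Requires 0 < len < 64 and pos + len <= 64. */
static uint64_t insert_field(uint64_t dest, uint64_t src,
                             unsigned pos, unsigned len) {
    uint64_t m = ((UINT64_C(1) << len) - 1) << pos;
    return (dest & ~m) | ((src << pos) & m);
}

/* Piecewise copy of an aggregate of `nbytes` bytes (1 <= nbytes <= 8)
   held in one register, one byte-sized field at a time. */
static uint64_t copy_by_fields(uint64_t arg, unsigned nbytes) {
    uint64_t tmp = 0;                          /* "copying 0 to a reg" */
    for (unsigned i = 0; i < nbytes; i++) {
        uint64_t byte = (arg >> (8 * i)) & 0xff;
        tmp = insert_field(tmp, byte, 8 * i, 8);
    }
    return tmp;
}

/* What the sequence amounts to: a move, or a single mask of the padding. */
static uint64_t copy_direct(uint64_t arg, unsigned nbytes) {
    return nbytes == 8 ? arg : arg & ((UINT64_C(1) << (8 * nbytes)) - 1);
}
```

For the 7-byte struct from comment #1, copy_by_fields(x, 7) and copy_direct(x, 7) agree for every x, which is the simplification one would like the compiler to find.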