public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/31704] New: x86_64 poor floating point register allocation across function call
@ 2007-04-25 15:09 ian at airs dot com
2007-04-26 9:23 ` [Bug rtl-optimization/31704] " rguenth at gcc dot gnu dot org
2008-02-07 16:37 ` hubicka at gcc dot gnu dot org
0 siblings, 2 replies; 3+ messages in thread
From: ian at airs dot com @ 2007-04-25 15:09 UTC
To: gcc-bugs
When I compile this test case with -O2 for x86_64:
extern void g (void);

float
f (float sum, float mult, int *pi)
{
  int i, j;

  for (i = 0; i < 10; ++i)
    {
      g ();
      for (j = 0; j < 1000; ++j)
        sum += *pi++ * mult;
    }
  return sum;
}
I get this result:
f:
.LFB2:
	pushq	%rbp
.LCFI0:
	movaps	%xmm0, %xmm2
	xorl	%ebp, %ebp
	pushq	%rbx
.LCFI1:
	movq	%rdi, %rbx
	subq	$40, %rsp
.LCFI2:
	movss	%xmm1, 28(%rsp)
.L2:
	movss	%xmm2, (%rsp)
	call	g
	cvtsi2ss	(%rbx), %xmm0
	leaq	4(%rbx), %rax
	movl	$1, %edx
	movss	(%rsp), %xmm2
	mulss	28(%rsp), %xmm0
	addss	%xmm0, %xmm2
	.p2align 4,,7
.L3:
	cvtsi2ss	(%rax), %xmm1
	addl	$1, %edx
	addq	$4, %rax
	cmpl	$1000, %edx
	mulss	28(%rsp), %xmm1
	addss	%xmm1, %xmm2
	jne	.L3
	addl	$1, %ebp
	addq	$4000, %rbx
	cmpl	$10, %ebp
	jne	.L2
	addq	$40, %rsp
	movaps	%xmm2, %xmm0
	popq	%rbx
	popq	%rbp
	ret
In the original code, the inner loop is performance critical. Note that this
compiles into a mulss loading a value from memory. It would be more efficient
to have the value in a register during the inner loop. In fact the value was
in a register, but we stored it in the stack because it crossed the function
call, and we load it from the stack once for each inner loop iteration rather
than once for each outer loop iteration.
I don't see a simple approach to fixing this. Some sort of live range
splitting might work.
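For illustration only, here is a source-level sketch of what splitting the live range of mult would amount to. It is not a fix and not something from the report: the copy m and the name f_split are hypothetical, g is replaced by a local stub so the example is self-contained, and the compiler is free to coalesce m back into mult, so this may well generate the same code as before. The idea is that on x86_64 every %xmm register is call-clobbered, so mult has to live in memory across the call to g, but a copy that is live only inside the inner loop does not cross any call and could stay in a register.

```c
/* Hypothetical stand-in for the external g() of the test case;
   it only counts how often it is called.  */
static int g_calls = 0;

static void
g (void)
{
  ++g_calls;
}

/* Same computation as f() in the test case, with the live range of
   mult split by hand (illustrative assumption, not a guaranteed
   optimization): m is created after the call and dies before the
   next one, so it never has to survive a call.  */
float
f_split (float sum, float mult, const int *pi)
{
  int i, j;

  for (i = 0; i < 10; ++i)
    {
      g ();
      {
        float m = mult;   /* one load per outer iteration */

        for (j = 0; j < 1000; ++j)
          sum += *pi++ * m;
      }
    }
  return sum;
}
```

With a split like this, the desired inner loop would multiply by a register operand, and the reload of the spilled value would happen once per outer iteration instead of once per inner iteration.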
--
Summary: x86_64 poor floating point register allocation across
function call
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: ian at airs dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31704
* [Bug rtl-optimization/31704] x86_64 poor floating point register allocation across function call
From: rguenth at gcc dot gnu dot org @ 2007-04-26 9:23 UTC
To: gcc-bugs
--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot
                   |                            |org
           Severity|normal                      |enhancement
           Keywords|                            |missed-optimization, ra
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31704
* [Bug rtl-optimization/31704] x86_64 poor floating point register allocation across function call
From: hubicka at gcc dot gnu dot org @ 2008-02-07 16:37 UTC
To: gcc-bugs
------- Comment #1 from hubicka at gcc dot gnu dot org 2008-02-07 16:36 -------
This is fixed by the call frequency patch on mainline.
.L2:
	cvtsi2ss	(%ebx,%eax,4), %xmm0
	addl	$1, %eax
	cmpl	$1000, %eax
	mulss	%xmm2, %xmm0
	addss	%xmm0, %xmm1
	jne	.L2
(This is on i386, but x86-64 behaves the same way.)
Honza
--
hubicka at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31704