public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/47010] New: Missed optimization: x86-64 prologue not deleted
From: schnetter at gmail dot com @ 2010-12-19 2:43 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47010
Summary: Missed optimization: x86-64 prologue not deleted
Product: gcc
Version: 4.5.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: schnetter@gmail.com
Created attachment 22818
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22818
pre-processed bzipped source code
The following code is generated by g++ 4.5.1 on x86-64 (Mac OS X 10.6). It is
a static function, so g++ may even have modified its argument list (the symbol
carries a ".clone.1" suffix). I believe the three instructions "pushq %rbp",
"movq %rsp,%rbp", and "leave" are unnecessary: they set up and tear down a
frame pointer, but the function body never references %rbp or the stack. This
routine is called in a compute-intensive inner loop that already has trouble
fitting into the level 1 instruction cache.
The disassembled routine is:
__ZL20PDstandardNth11_implPKdll.clone.1:
0000000000000140    pushq   %rbp
0000000000000141    movupd  0x10(%rdi),%xmm3
0000000000000146    movupd  0xf0(%rdi),%xmm0
000000000000014b    movupd  0x08(%rdi),%xmm2
0000000000000150    addpd   %xmm3,%xmm0
0000000000000154    movupd  0xf8(%rdi),%xmm1
0000000000000159    movq    %rsp,%rbp
000000000000015c    addpd   %xmm2,%xmm1
0000000000000160    mulpd   0x000a0578(%rip),%xmm1
0000000000000168    addpd   %xmm0,%xmm1
000000000000016c    movupd  (%rdi),%xmm0
0000000000000170    mulpd   0x000a0578(%rip),%xmm0
0000000000000178    leave
0000000000000179    addpd   %xmm1,%xmm0
000000000000017d    ret
The original function is defined as:
static CCTK_REAL_VEC PDstandardNth11_impl(CCTK_REAL const* restrict const u,
    ptrdiff_t const dj, ptrdiff_t const dk)
  __attribute__((pure)) __attribute__((noinline)) __attribute__((unused));

static CCTK_REAL_VEC PDstandardNth11_impl(CCTK_REAL const* restrict const u,
    ptrdiff_t const dj, ptrdiff_t const dk)
{
  return
    kmadd(ToReal(30),
          vec_loadu_maybe3(0,0,0,(u)[(0)+dj*(0)+dk*(0)]),
          kmadd(ToReal(-16),
                kadd(vec_loadu_maybe3(-1,0,0,(u)[(-1)+dj*(0)+dk*(0)]),
                     vec_loadu_maybe3(1,0,0,(u)[(1)+dj*(0)+dk*(0)])),
                kadd(vec_loadu_maybe3(-2,0,0,(u)[(-2)+dj*(0)+dk*(0)]),
                     vec_loadu_maybe3(2,0,0,(u)[(2)+dj*(0)+dk*(0)]))));
}
where CCTK_REAL is double, and CCTK_REAL_VEC is __m128d, the SSE2 vector of
doubles. The function body contains macros that translate directly to Intel
SSE2 vector instructions.
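For readers without the attachment, here is a minimal sketch of what the
function body might reduce to. The macro definitions are not shown in this
report, so the helpers below (ToReal, vec_loadu, kadd, kmadd, plus the
illustrative lane and stencil_scalar) are assumed stand-ins written with plain
SSE2 intrinsics; the real ones may differ. Since dj and dk are multiplied by 0
in every index, only the offsets -2..2 along u remain.

```cpp
// Hypothetical stand-ins for the report's macros; the real definitions are in
// the attached pre-processed source and may differ.
#include <cassert>
#include <emmintrin.h>  // SSE2 intrinsics

typedef double  CCTK_REAL;      // as stated in the report
typedef __m128d CCTK_REAL_VEC;  // as stated in the report

// Assumed: broadcast one scalar into both vector lanes.
static inline CCTK_REAL_VEC ToReal(CCTK_REAL a) { return _mm_set1_pd(a); }

// Assumed: unaligned load of two consecutive doubles starting at p (movupd).
static inline CCTK_REAL_VEC vec_loadu(const CCTK_REAL& p) { return _mm_loadu_pd(&p); }

// Assumed: lane-wise add, and multiply-add computed as x*y + z.
static inline CCTK_REAL_VEC kadd(CCTK_REAL_VEC x, CCTK_REAL_VEC y) { return _mm_add_pd(x, y); }
static inline CCTK_REAL_VEC kmadd(CCTK_REAL_VEC x, CCTK_REAL_VEC y, CCTK_REAL_VEC z) {
  return _mm_add_pd(_mm_mul_pd(x, y), z);
}

// Same expression shape as PDstandardNth11_impl, with the zero dj/dk terms dropped.
// u must have at least 2 valid doubles before it and 3 after it.
static inline CCTK_REAL_VEC stencil_sketch(const CCTK_REAL* u) {
  return kmadd(ToReal(30), vec_loadu(u[0]),
               kmadd(ToReal(-16), kadd(vec_loadu(u[-1]), vec_loadu(u[1])),
                     kadd(vec_loadu(u[-2]), vec_loadu(u[2]))));
}

// Scalar reference for a single point, for comparison.
static inline CCTK_REAL stencil_scalar(const CCTK_REAL* u) {
  return 30 * u[0] + (-16 * (u[-1] + u[1]) + (u[-2] + u[2]));
}

// Illustrative helper: extract lane i (0 or 1) from a vector.
static inline CCTK_REAL lane(CCTK_REAL_VEC v, int i) {
  assert(i == 0 || i == 1);
  CCTK_REAL out[2];
  _mm_storeu_pd(out, v);
  return out[i];
}
```

Under these assumptions, lane 0 of stencil_sketch(u) matches stencil_scalar(u)
and lane 1 matches stencil_scalar(u + 1). The body needs only loads, adds, and
multiplies on xmm registers, which is consistent with the disassembly never
touching the stack.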
The code was compiled with gcc 4.5.1 using the following options:
g++-mp-4.5 -g3 -m128bit-long-double -march=native -std=gnu++0x -O3
-funsafe-loop-optimizations -fsee -ftree-loop-linear -ftree-loop-im -fivopts
-fvect-cost-model -funroll-loops -funroll-all-loops
-fvariable-expansion-in-unroller -fprefetch-loop-arrays -ffast-math
-fassociative-math -freciprocal-math -fno-trapping-math -fexcess-precision=fast
-fopenmp -Wall -Wshadow -Wpointer-arith -Wcast-qual -Wcast-align
-Woverloaded-virtual
I attach the complete pre-processed and bzipped source code. The source code
itself is auto-generated.
* [Bug rtl-optimization/47010] Missed optimization: x86-64 prologue not deleted
From: rguenth at gcc dot gnu.org @ 2010-12-28 14:51 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47010
--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2010-12-28 14:50:51 UTC ---
I think it sets up a frame to have possible spills of xmm registers land in
aligned stack slots.
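As a rough illustration of why alignment would matter for such spills: the
aligned SSE2 moves (movapd, exposed as _mm_store_pd and _mm_load_pd) require a
16-byte-aligned address. The sketch below is not GCC's spill code; it imitates
a spill by hand, and is_aligned16 and spill_and_reload are hypothetical names
chosen for this example.

```cpp
// Hand-written imitation of an xmm spill; not what GCC emits internally.
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper: true if p sits on a 16-byte boundary.
static inline bool is_aligned16(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Store a vector to a stack slot and load it back using aligned moves.
static inline __m128d spill_and_reload(__m128d v) {
  alignas(16) double slot[2];  // explicitly request a 16-byte-aligned slot
  assert(is_aligned16(slot));
  _mm_store_pd(slot, v);       // aligned store: requires 16-byte alignment
  return _mm_load_pd(slot);    // aligned load: same requirement
}
```

The function in this report never spills, so no such slot is actually used.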