public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/53340] New: [4.8 Regression] rnflow.f90 is ~20% slower after revision 187092
From: dominiq at lps dot ens.fr @ 2012-05-14 9:39 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53340
Bug #: 53340
Summary: [4.8 Regression] rnflow.f90 is ~20% slower after
revision 187092
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: dominiq@lps.ens.fr
CC: rguenth@gcc.gnu.org, ubizjak@gmail.com
On x86_64-apple-darwin10, rnflow.f90 is ~20% slower after revision 187092:
[macbook] test/dbg_rnflow% /opt/gcc/gcc4.8p-187091/bin/gfortran -O3 -ffast-math
-funroll-loops rnflow.f90
[macbook] test/dbg_rnflow% time a.out > /dev/null
22.038u 0.352s 0:22.52 99.3% 0+0k 2+0io 0pf+0w
[macbook] test/dbg_rnflow% /opt/gcc/gcc4.8p-187092/bin/gfortran -O3 -ffast-math
-funroll-loops rnflow.f90
[macbook] test/dbg_rnflow% time a.out > /dev/null
27.480u 0.349s 0:27.83 99.9% 0+0k 0+0io 0pf+0w
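The size of the regression can be checked from the two user times above. The helper below is only a sketch; the name slowdown_percent is hypothetical and not part of any tool used in this report.

```c
#include <assert.h>

/* Percentage increase in run time going from `before` to `after` seconds.
 * Hypothetical helper, used here only to redo the arithmetic on the
 * user times reported above. */
static double slowdown_percent(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

With the reported user times, slowdown_percent(22.038, 27.480) is roughly 24.7. Measured against the slower run instead, the same 5.442 s difference is about 19.8%, which is in line with the ~20% in the summary.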
The slowdown comes from the optimization of cptrf2:
[macbook] test/dbg_rnflow% /opt/gcc/gcc4.8p-187092/bin/gfortran -c -O3
-ffast-math -funroll-loops timctr.f90 cmpcpt.f90 cptrf2.f90 dger.f90 dgetri.f90
dswap.f90 dtrsm.f90 evlrnf.f90 idamax.f90 main.f90 mattrs.f90 cmpmat.f90
dgemm.f90 dgetf2.f90 dlaswp.f90 dtrmm.f90 dtrti2.f90 extpic.f90 ilaenv.f90
matcnt.f90 reaseq.f90 xerbla.f90 cptrf1.f90 dgemv.f90 dgetrf.f90 dscal.f90
dtrmv.f90 dtrtri.f90 gentrs.f90 lsame.f90 matsim.f90
[macbook] test/dbg_rnflow% makeo ; time a.out > /dev/null
27.567u 0.349s 0:27.92 99.9% 0+0k 0+0io 0pf+0w
[macbook] test/dbg_rnflow% /opt/gcc/gcc4.8p-187091/bin/gfortran -c -O3 -ffast-math -funroll-loops
cptrf2.f90
[macbook] test/dbg_rnflow% makeo ; time a.out > /dev/null
22.136u 0.345s 0:22.48 99.9% 0+0k 0+0io 0pf+0w
[macbook] test/dbg_rnflow% /opt/gcc/gcc4.8p-187091/bin/gfortran -c -O2
cptrf2.f90
[macbook] test/dbg_rnflow% makeo ; time a.out > /dev/null
21.453u 0.348s 0:21.80 99.9% 0+0k 0+0io 0pf+0w
* [Bug tree-optimization/53340] [4.8 Regression] rnflow.f90 is ~20% slower after revision 187092
From: dominiq at lps dot ens.fr @ 2012-05-14 9:50 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53340
--- Comment #1 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2012-05-14 09:44:33 UTC ---
Created attachment 27399
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27399
source cptrf2.f90 extracted from rnflow.f90
* [Bug tree-optimization/53340] [4.8 Regression] rnflow.f90 is ~20% slower after revision 187092
From: dominiq at lps dot ens.fr @ 2012-05-14 10:02 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53340
--- Comment #2 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2012-05-14 09:49:22 UTC ---
If I understand the profiling correctly, the slowdown comes from the first
inlined function minlst. The fast assembly is
L45:
movss (%r10), %xmm10
leal -1(%rsi), %edi
movss -4(%r10), %xmm11
comiss %xmm10, %xmm6
movss -8(%r10), %xmm12
minss %xmm10, %xmm6
movss -12(%r10), %xmm13
cmova %esi, %edx
comiss %xmm11, %xmm6
minss %xmm11, %xmm6
cmova %edi, %edx
comiss %xmm12, %xmm6
minss %xmm12, %xmm6
leal -2(%rsi), %edi
cmova %edi, %edx
comiss %xmm13, %xmm6
leal -3(%rsi), %edi
minss %xmm13, %xmm6
cmova %edi, %edx
subl $4, %esi
subq $16, %r10
cmpl %r8d, %esi
jne L45
while the slow one is
L39:
movslq %edx, %r9
movss -4(%rdi,%r9,4), %xmm9
leal -1(%r8), %r9d
comiss (%rbx), %xmm9
cmova %r8d, %edx
movslq %edx, %r14
movss -4(%rdi,%r14,4), %xmm10
comiss -4(%rbx), %xmm10
cmova %r9d, %edx
leal -2(%r8), %r9d
movslq %edx, %r14
movss -4(%rdi,%r14,4), %xmm11
comiss -8(%rbx), %xmm11
cmova %r9d, %edx
leal -3(%r8), %r9d
movslq %edx, %r14
movss -4(%rdi,%r14,4), %xmm12
comiss -12(%rbx), %xmm12
cmova %r9d, %edx
subl $4, %r8d
subq $16, %rbx
cmpl %r10d, %r8d
jne L39
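The difference between the two loops can be sketched in C. This is a hypothetical reconstruction from the two listings above, not the minlst source, which is not quoted in this thread; the function names, the 0-based indexing and the choice of starting element are all assumptions. The first form keeps the running minimum in a local, matching the comiss/minss/cmova sequence of the fast loop; the second re-reads the current minimum through the index on every step, matching the movslq/movss reload that precedes each comiss in the slow loop.

```c
#include <assert.h>

/* Fast shape (assumed): the running minimum lives in a local, so each
 * step only compares the next element against a register value. */
static int minloc_cached(const float *a, int n)
{
    assert(n > 0);
    int idx = n - 1;
    float best = a[idx];
    for (int i = n - 2; i >= 0; --i) {   /* scans downward, like the asm */
        if (a[i] < best) {
            best = a[i];
            idx = i;
        }
    }
    return idx;
}

/* Slow shape (assumed): the minimum is reloaded as a[idx] on every step,
 * so each comparison waits on a load addressed by the previous result. */
static int minloc_reload(const float *a, int n)
{
    assert(n > 0);
    int idx = n - 1;
    for (int i = n - 2; i >= 0; --i) {
        if (a[i] < a[idx])
            idx = i;
    }
    return idx;
}
```

Both functions return the same index. In the second form each comparison depends on a load whose address comes from the previous conditional move, which lengthens the loop-carried dependency chain; that would be consistent with the slower timing, although the thread itself does not spell this out.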
* [Bug tree-optimization/53340] [4.8 Regression] rnflow.f90 is ~20% slower after revision 187092
From: rguenth at gcc dot gnu.org @ 2012-05-14 10:52 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53340
Richard Guenther <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |ASSIGNED
Last reconfirmed| |2012-05-14
AssignedTo|unassigned at gcc dot |rguenth at gcc dot gnu.org
|gnu.org |
Target Milestone|--- |4.8.0
Ever Confirmed|0 |1
--- Comment #3 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-14 10:38:51 UTC ---
Ouch. Mine.
* [Bug tree-optimization/53340] [4.8 Regression] rnflow.f90 is ~20% slower after revision 187092
From: rguenth at gcc dot gnu.org @ 2012-05-14 11:38 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53340
--- Comment #4 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-14 11:37:02 UTC ---
Author: rguenth
Date: Mon May 14 11:36:58 2012
New Revision: 187457
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187457
Log:
2012-05-14 Richard Guenther <rguenther@suse.de>
PR tree-optimization/53340
* tree-ssa-pre.c (op_valid_in_sets): Fix error in last commit.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-pre.c
* [Bug tree-optimization/53340] [4.8 Regression] rnflow.f90 is ~20% slower after revision 187092
From: rguenth at gcc dot gnu.org @ 2012-05-14 11:40 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53340
Richard Guenther <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
--- Comment #5 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-14 11:37:18 UTC ---
Fixed.