* [Bug tree-optimization/53340] New: [4.8 Regression] rnflow.f90 is ~20% slower after revision 187092
@ 2012-05-14  9:39 dominiq at lps dot ens.fr
  2012-05-14  9:50 ` [Bug tree-optimization/53340] " dominiq at lps dot ens.fr
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: dominiq at lps dot ens.fr @ 2012-05-14  9:39 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53340

             Bug #: 53340
           Summary: [4.8 Regression] rnflow.f90 is ~20% slower after
                    revision 187092
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: dominiq@lps.ens.fr
                CC: rguenth@gcc.gnu.org, ubizjak@gmail.com


On x86_64-apple-darwin10, rnflow.f90 is ~20% slower after revision 187092:

[macbook] test/dbg_rnflow% /opt/gcc/gcc4.8p-187091/bin/gfortran -O3 -ffast-math
-funroll-loops rnflow.f90
[macbook] test/dbg_rnflow% time a.out > /dev/null
22.038u 0.352s 0:22.52 99.3%    0+0k 2+0io 0pf+0w
[macbook] test/dbg_rnflow% /opt/gcc/gcc4.8p-187092/bin/gfortran -O3 -ffast-math
-funroll-loops rnflow.f90
[macbook] test/dbg_rnflow% time a.out > /dev/null
27.480u 0.349s 0:27.83 99.9%    0+0k 0+0io 0pf+0w
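
As a quick check on the headline figure, the user times above (22.038 s with r187091, 27.480 s with r187092) can be turned into a percentage in two ways. The sketch below is illustrative only; the function names `slowdown_pct` and `saving_pct` are made up for this note and appear nowhere in rnflow.f90 or GCC.

```c
#include <assert.h>

/* Increase in run time, in percent, measured against the fast build. */
double slowdown_pct(double fast_s, double slow_s)
{
    assert(fast_s > 0.0);
    return 100.0 * (slow_s - fast_s) / fast_s;
}

/* Time saved by the fast build, in percent, measured against the slow build. */
double saving_pct(double fast_s, double slow_s)
{
    assert(slow_s > 0.0);
    return 100.0 * (slow_s - fast_s) / slow_s;
}
```

With the two user times as inputs, the increase relative to the r187091 build is about 24.7%, while the saving relative to the r187092 build is about 19.8%; the latter is the figure closest to the ~20% quoted in the summary.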

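The slowdown comes from the optimization of cptrf2: with every file built by
r187092 the run is slow, and recompiling only cptrf2.f90 with r187091 restores
the fast timing.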

[macbook] test/dbg_rnflow% /opt/gcc/gcc4.8p-187092/bin/gfortran -c -O3
-ffast-math -funroll-loops timctr.f90 cmpcpt.f90 cptrf2.f90 dger.f90 dgetri.f90
dswap.f90 dtrsm.f90 evlrnf.f90 idamax.f90 main.f90 mattrs.f90 cmpmat.f90
dgemm.f90 dgetf2.f90 dlaswp.f90 dtrmm.f90 dtrti2.f90 extpic.f90 ilaenv.f90
matcnt.f90 reaseq.f90 xerbla.f90 cptrf1.f90 dgemv.f90 dgetrf.f90 dscal.f90
dtrmv.f90 dtrtri.f90 gentrs.f90 lsame.f90 matsim.f90
[macbook] test/dbg_rnflow% makeo ; time a.out > /dev/null
27.567u 0.349s 0:27.92 99.9%    0+0k 0+0io 0pf+0w
[macbook] test/dbg_rnflow% /opt/gcc/gcc4.8p-187091/bin/gfortran -c -O3
-ffast-math -funroll-loops cptrf2.f90
[macbook] test/dbg_rnflow% makeo ; time a.out > /dev/null
22.136u 0.345s 0:22.48 99.9%    0+0k 0+0io 0pf+0w
[macbook] test/dbg_rnflow% /opt/gcc/gcc4.8p-187091/bin/gfortran -c -O2
cptrf2.f90
[macbook] test/dbg_rnflow% makeo ; time a.out > /dev/null
21.453u 0.348s 0:21.80 99.9%    0+0k 0+0io 0pf+0w


* [Bug tree-optimization/53340] [4.8 Regression] rnflow.f90 is ~20% slower after revision 187092
From: dominiq at lps dot ens.fr @ 2012-05-14  9:50 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53340

--- Comment #1 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2012-05-14 09:44:33 UTC ---
Created attachment 27399
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27399
source cptrf2.f90 extracted from rnflow.f90


* [Bug tree-optimization/53340] [4.8 Regression] rnflow.f90 is ~20% slower after revision 187092
From: dominiq at lps dot ens.fr @ 2012-05-14 10:02 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53340

--- Comment #2 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2012-05-14 09:49:22 UTC ---
If I understand the profiling correctly, the slowdown comes from the first
inlined function, minlst. The fast assembly is

L45:
    movss   (%r10), %xmm10
    leal    -1(%rsi), %edi
    movss   -4(%r10), %xmm11
    comiss  %xmm10, %xmm6
    movss   -8(%r10), %xmm12
    minss   %xmm10, %xmm6
    movss   -12(%r10), %xmm13
    cmova   %esi, %edx
    comiss  %xmm11, %xmm6
    minss   %xmm11, %xmm6
    cmova   %edi, %edx
    comiss  %xmm12, %xmm6
    minss   %xmm12, %xmm6
    leal    -2(%rsi), %edi
    cmova   %edi, %edx
    comiss  %xmm13, %xmm6
    leal    -3(%rsi), %edi
    minss   %xmm13, %xmm6
    cmova   %edi, %edx
    subl    $4, %esi
    subq    $16, %r10
    cmpl    %r8d, %esi
    jne     L45

while the slow one is

L39:
    movslq  %edx, %r9
    movss   -4(%rdi,%r9,4), %xmm9
    leal    -1(%r8), %r9d
    comiss  (%rbx), %xmm9
    cmova   %r8d, %edx
    movslq  %edx, %r14
    movss   -4(%rdi,%r14,4), %xmm10
    comiss  -4(%rbx), %xmm10
    cmova   %r9d, %edx
    leal    -2(%r8), %r9d
    movslq  %edx, %r14
    movss   -4(%rdi,%r14,4), %xmm11
    comiss  -8(%rbx), %xmm11
    cmova   %r9d, %edx
    leal    -3(%r8), %r9d
    movslq  %edx, %r14
    movss   -4(%rdi,%r14,4), %xmm12
    comiss  -12(%rbx), %xmm12
    cmova   %r9d, %edx
    subl    $4, %r8d
    subq    $16, %rbx
    cmpl    %r10d, %r8d
    jne     L39
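
Read side by side, the two listings appear to differ in where the current minimum lives: the fast loop keeps it in %xmm6 (updated with minss), while the slow loop reloads the element at the current minimum index from memory before every comparison (the movslq/movss pairs). The C sketch below is a hypothetical reconstruction of those two loop shapes from the assembly alone; it is not the minlst source from rnflow.f90, and the names `minloc_cached` and `minloc_reload` are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Both functions scan x[n-1] down to x[0] and return the 0-based index of
   the smallest element. The comparison is strict, so on ties the index
   visited first (the highest one) is kept. */

/* Shape of the fast loop: the running minimum is held in a local variable,
   so each step needs one load and one compare. */
size_t minloc_cached(const float *x, size_t n)
{
    assert(n > 0);
    size_t idx = n - 1;
    float min = x[idx];
    for (size_t i = n - 1; i-- > 0; ) {
        if (min > x[i]) {
            min = x[i];
            idx = i;
        }
    }
    return idx;
}

/* Shape of the slow loop: the current minimum is re-read through idx on
   every step, so each load depends on the previous conditional move. */
size_t minloc_reload(const float *x, size_t n)
{
    assert(n > 0);
    size_t idx = n - 1;
    for (size_t i = n - 1; i-- > 0; ) {
        if (x[idx] > x[i])
            idx = i;
    }
    return idx;
}
```

Both functions return the same index for any non-empty input; only the data dependencies differ. An optimizing C compiler may well turn the second form into the first, so this sketch shows the difference in shape and is not expected to reproduce the timing gap.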


* [Bug tree-optimization/53340] [4.8 Regression] rnflow.f90 is ~20% slower after revision 187092
From: rguenth at gcc dot gnu.org @ 2012-05-14 10:52 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53340

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2012-05-14
         AssignedTo|unassigned at gcc dot       |rguenth at gcc dot gnu.org
                   |gnu.org                     |
   Target Milestone|---                         |4.8.0
     Ever Confirmed|0                           |1

--- Comment #3 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-14 10:38:51 UTC ---
Ouch.  Mine.


* [Bug tree-optimization/53340] [4.8 Regression] rnflow.f90 is ~20% slower after revision 187092
From: rguenth at gcc dot gnu.org @ 2012-05-14 11:38 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53340

--- Comment #4 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-14 11:37:02 UTC ---
Author: rguenth
Date: Mon May 14 11:36:58 2012
New Revision: 187457

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187457
Log:
2012-05-14  Richard Guenther  <rguenther@suse.de>

    PR tree-optimization/53340
    * tree-ssa-pre.c (op_valid_in_sets): Fix error in last commit.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-ssa-pre.c


* [Bug tree-optimization/53340] [4.8 Regression] rnflow.f90 is ~20% slower after revision 187092
From: rguenth at gcc dot gnu.org @ 2012-05-14 11:40 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53340

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #5 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-14 11:37:18 UTC ---
Fixed.

