[Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel

public inbox for gcc-bugs@sourceware.org

* [Bug rtl-optimization/31021]  New: gfortran 20% slower than ifort on CP2K computational kernel
@ 2007-03-02  8:38 jv244 at cam dot ac dot uk
  2007-03-02  8:39 ` [Bug rtl-optimization/31021] " jv244 at cam dot ac dot uk
                   ` (12 more replies)
  0 siblings, 13 replies; 17+ messages in thread
From: jv244 at cam dot ac dot uk @ 2007-03-02  8:38 UTC (permalink / raw)
  To: gcc-bugs

I've extracted the computational kernel of CP2K (see PR 29975) for easier
benchmarking. Together with required utility routines to turn it into a
self-contained program and data to test it, I have made it available here:

http://www.pci.unizh.ch/vandevondele/tmp/extracted_collocate.tgz

The summary is that gfortran (yesterday's trunk) is about 20% slower than ifort
(ifort (IFORT) 9.1 20060707) on my machine. To reproduce, untar the above
archive and use (after specifying the relevant FC in the Makefile)
make
make run

a run takes a few seconds, and yields 
gfortran '-O3 -march=native -ffast-math -ffree-form -ftree-vectorize':
 # of primitives       154502
 # computational kernel timings            5
 Kernel time   4.612288
 Kernel time   4.616289
 [...]
ifort  -xP -O3 -free
 # of primitives       154502
 # computational kernel timings            5
 Kernel time   3.796237
 Kernel time   3.800237
[...]

which is in this case 21.5% slower. I haven't found any options that made
gfortran much faster (in fact, timings are very insensitive to the options
used), and it is unrelated to any IPO (I actually notice now that ifort is
slightly faster at -O2). Since this might be relevant, timings are on:

vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz
stepping        : 6
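
For reference, the 21.5% figure quoted above can be reproduced from the first "Kernel time" line of each run. The following is only a minimal sketch; the helper name `percent_slower` is made up for illustration and is not part of the testcase.

```python
def percent_slower(slow_seconds: float, fast_seconds: float) -> float:
    """Return how much longer slow_seconds is than fast_seconds, in percent."""
    return (slow_seconds / fast_seconds - 1.0) * 100.0


# First "Kernel time" line reported for each compiler in the runs above.
gfortran_time = 4.612288
ifort_time = 3.796237

slowdown = percent_slower(gfortran_time, ifort_time)
print(f"gfortran is {slowdown:.1f}% slower than ifort")
```

Note that this measures extra run time relative to ifort; expressed as time saved relative to gfortran, the same pair of numbers gives a smaller percentage.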

The computational time is ~80% due to a single routine (collocate_core in
grid_fast.F), which in turn is dominated by the inner loops in the select case
statement, and of those, the one over ig is (should be) dominant. For example,
the loop starting at line 216 of grid_fast.F. Looking at the asm for this
loop (my best guess at which asm corresponds to it; I have little experience
here), my main observation is that it contains 36 mov* instructions with ifort
and 51 mov* instructions with gfortran (and the same number of mulsd and
addsd), which could explain the slowdown. I'll attach the respective asm.
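
A count like the one above can be scripted rather than done by eye. The sketch below assumes AT&T-style assembly text (as emitted by gfortran -S) with one instruction per line, `#` comments, and labels on their own lines. The `sample` string is a made-up fragment, not the attached asm, and the helper names are hypothetical.

```python
from collections import Counter


def count_mnemonics(asm_text: str) -> Counter:
    """Count instruction mnemonics, skipping labels, directives and comments."""
    counts = Counter()
    for line in asm_text.splitlines():
        code = line.split("#", 1)[0].strip()  # drop trailing comments
        if not code or code.startswith(".") or code.endswith(":"):
            continue  # blank line, assembler directive, or label
        counts[code.split()[0]] += 1
    return counts


def count_prefix(counts: Counter, prefix: str) -> int:
    """Total count of all mnemonics starting with prefix (e.g. 'mov')."""
    return sum(n for mnemonic, n in counts.items() if mnemonic.startswith(prefix))


# Hypothetical loop body, for illustration only.
sample = """
.L5:
        movsd   (%rax), %xmm0
        movapd  %xmm0, %xmm1
        mulsd   %xmm2, %xmm1
        addsd   %xmm1, %xmm3
        movsd   %xmm3, (%rdx)
        jne     .L5
"""

counts = count_mnemonics(sample)
print(count_prefix(counts, "mov"), counts["mulsd"], counts["addsd"])
```

To compare the two compilers, the same function would be run over just the lines of the loop in question from each listing; picking out those lines is still a manual step.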

I'm of course happy to try other compile flags for gfortran, and hints on how
to rewrite the kernels to get better performance with gfortran would also be
much appreciated.

-- 
           Summary: gfortran 20% slower than ifort on CP2K computational
                    kernel
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: jv244 at cam dot ac dot uk

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021


* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
  2007-03-02  8:38 [Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel jv244 at cam dot ac dot uk
@ 2007-03-02  8:39 ` jv244 at cam dot ac dot uk
  2007-03-02  8:40 ` jv244 at cam dot ac dot uk
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: jv244 at cam dot ac dot uk @ 2007-03-02  8:39 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from jv244 at cam dot ac dot uk  2007-03-02 08:39 -------
Created an attachment (id=13131)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13131&action=view)
gfortran kernel asm 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
  2007-03-02  8:38 [Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel jv244 at cam dot ac dot uk
  2007-03-02  8:39 ` [Bug rtl-optimization/31021] " jv244 at cam dot ac dot uk
@ 2007-03-02  8:40 ` jv244 at cam dot ac dot uk
  2007-03-02  9:39 ` burnus at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: jv244 at cam dot ac dot uk @ 2007-03-02  8:40 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from jv244 at cam dot ac dot uk  2007-03-02 08:39 -------
Created an attachment (id=13132)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13132&action=view)
ifort kernel asm


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
  2007-03-02  8:38 [Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel jv244 at cam dot ac dot uk
  2007-03-02  8:39 ` [Bug rtl-optimization/31021] " jv244 at cam dot ac dot uk
  2007-03-02  8:40 ` jv244 at cam dot ac dot uk
@ 2007-03-02  9:39 ` burnus at gcc dot gnu dot org
  2007-03-02  9:55 ` jv244 at cam dot ac dot uk
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: burnus at gcc dot gnu dot org @ 2007-03-02  9:39 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from burnus at gcc dot gnu dot org  2007-03-02 09:38 -------
On my "AMD Athlon(tm) 64 X2 Dual Core Processor 4800+", gfortran is in x86_64
mode only 13% slower:
gfortran: Kernel time 5.872366, real 0m33.121s; user 0m32.898s; sys 0m0.088s.
Ifort:    Kernel time 5.244328, real 0m28.893s, user 0m28.758s, sys 0m0.076s.
Options: "ifort -xP -O3 -xW -free" and "gfortran -O3 -march=native -ffast-math
-ffree-form -ftree-vectorize -funroll-loops".

For grid_fast.F, one difference is which loops are vectorized; ifort vectorizes
the loops in line 44, 469, 483 and 496, gfortran only vectorizes the loops in
line 496 and 469; for the other ones:

grid_fast.F:44: note: not vectorized: complicated access pattern.
          DO lz=1,lz_max(lxy)
             lxyz=lxyz+1
             pyx(1,lxy)=pyx(1,lxy)+pzyx(lxyz)*polz(lxyz,kg)
             pyx(2,lxy)=pyx(2,lxy)+pzyx(lxyz)*polz(lxyz,kg2)
          ENDDO

grid_fast.F:483: note: not vectorized: can't determine dependence between
(*coef_447)[D.1967_2320] and (*coef_447)[D.1967_2320]
              DO icoef=1,coef_max
                 coef(icoef,1)=coef(icoef,1)+alpha(icoef,lx)*g1
                 coef(icoef,2)=coef(icoef,2)+alpha(icoef,lx)*g2
                 coef(icoef,3)=coef(icoef,3)+alpha(icoef,lx)*g1k
                 coef(icoef,4)=coef(icoef,4)+alpha(icoef,lx)*g2k
              ENDDO


-- 

burnus at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu dot
                   |                            |org
           Keywords|                            |missed-optimization


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
  2007-03-02  8:38 [Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel jv244 at cam dot ac dot uk
                   ` (2 preceding siblings ...)
  2007-03-02  9:39 ` burnus at gcc dot gnu dot org
@ 2007-03-02  9:55 ` jv244 at cam dot ac dot uk
  2007-03-02 18:15 ` jv244 at cam dot ac dot uk
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: jv244 at cam dot ac dot uk @ 2007-03-02  9:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from jv244 at cam dot ac dot uk  2007-03-02 09:55 -------
(In reply to comment #3)
> On my "AMD Athlon(tm) 64 X2 Dual Core Processor 4800+", gfortran is in x86_64
> mode only 13% slower:
> gfortran: Kernel time 5.872366, real 0m33.121s; user 0m32.898s; sys 0m0.088s.
> Ifort:    Kernel time 5.244328, real 0m28.893s, user 0m28.758s, sys 0m0.076s.
> Options: "ifort -xP -O3 -xW -free" and "gfortran -O3 -march=native -ffast-math
> -ffree-form -ftree-vectorize -funroll-loops".
> 
> For grid_fast.F, one difference is which loops are vectorized; ifort vectorizes
> the loops in line 44, 469, 483 and 496, gfortran only vectorizes the loops in
> line 496 and 469; for the other ones:
> 
> grid_fast.F:44: note: not vectorized: complicated access pattern.
>           DO lz=1,lz_max(lxy)
>              lxyz=lxyz+1
>              pyx(1,lxy)=pyx(1,lxy)+pzyx(lxyz)*polz(lxyz,kg)
>              pyx(2,lxy)=pyx(2,lxy)+pzyx(lxyz)*polz(lxyz,kg2)
>           ENDDO

This might matter a bit, but this loop is not an inner loop, so I don't think
it accounts for much of the time. Having it vectorized would be good of course.

> 
> grid_fast.F:483: note: not vectorized: can't determine dependence between
> (*coef_447)[D.1967_2320] and (*coef_447)[D.1967_2320]
>               DO icoef=1,coef_max
>                  coef(icoef,1)=coef(icoef,1)+alpha(icoef,lx)*g1
>                  coef(icoef,2)=coef(icoef,2)+alpha(icoef,lx)*g2
>                  coef(icoef,3)=coef(icoef,3)+alpha(icoef,lx)*g1k
>                  coef(icoef,4)=coef(icoef,4)+alpha(icoef,lx)*g2k
>               ENDDO
> 

This part, which is in the 'case default' branch of the select case statement,
should only be executed in rare cases. I doubt it matters much in the overall
timings. Also, this loop has very short trips (i.e. coef_max should, for the
provided input, be at most 5).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
  2007-03-02  8:38 [Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel jv244 at cam dot ac dot uk
                   ` (3 preceding siblings ...)
  2007-03-02  9:55 ` jv244 at cam dot ac dot uk
@ 2007-03-02 18:15 ` jv244 at cam dot ac dot uk
  2008-05-10 12:17 ` fxcoudert at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: jv244 at cam dot ac dot uk @ 2007-03-02 18:15 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from jv244 at cam dot ac dot uk  2007-03-02 18:15 -------
> > 
> > grid_fast.F:483: note: not vectorized: can't determine dependence between
> > (*coef_447)[D.1967_2320] and (*coef_447)[D.1967_2320]
> >               DO icoef=1,coef_max
> >                  coef(icoef,1)=coef(icoef,1)+alpha(icoef,lx)*g1
> >                  coef(icoef,2)=coef(icoef,2)+alpha(icoef,lx)*g2
> >                  coef(icoef,3)=coef(icoef,3)+alpha(icoef,lx)*g1k
> >                  coef(icoef,4)=coef(icoef,4)+alpha(icoef,lx)*g2k
> >               ENDDO
> > 
> 
> This part, which is in the 'case default' branch of the select case statement,
> should only be executed in rare cases. I doubt it matters much in the overall
> timings. Also, this loop has very short trips (i.e. coef_max should, for the
> provided input, be at most 5).

I verified that the default branch is indeed not called frequently enough for
this to matter. However, by deleting all other cases (equivalent, but
specialized code), I can time that case, and find:
gfortran: 6.636415
ifort: 5.252329
which means ifort is about 26% faster for the 'case default' branch.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
  2007-03-02  8:38 [Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel jv244 at cam dot ac dot uk
                   ` (4 preceding siblings ...)
  2007-03-02 18:15 ` jv244 at cam dot ac dot uk
@ 2008-05-10 12:17 ` fxcoudert at gcc dot gnu dot org
  2008-05-10 12:30 ` jv244 at cam dot ac dot uk
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: fxcoudert at gcc dot gnu dot org @ 2008-05-10 12:17 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from fxcoudert at gcc dot gnu dot org  2008-05-10 12:16 -------
With current trunk, I see current mainline gfortran being 5% faster than Intel
10.0 on a Dual-Core AMD Opteron(tm) Processor 2212 at 2GHz. Joost, on your
particular setup, does this still run too slow?


-- 

fxcoudert at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fxcoudert at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
  2007-03-02  8:38 [Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel jv244 at cam dot ac dot uk
                   ` (5 preceding siblings ...)
  2008-05-10 12:17 ` fxcoudert at gcc dot gnu dot org
@ 2008-05-10 12:30 ` jv244 at cam dot ac dot uk
  2008-05-10 13:44 ` jv244 at cam dot ac dot uk
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-05-10 12:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from jv244 at cam dot ac dot uk  2008-05-10 12:30 -------
(In reply to comment #6)
> With current trunk, I see current mainline gfortran being 5% faster than Intel
> 10.0 on a Dual-Core AMD Opteron(tm) Processor 2212 at 2GHz. Joost, on your
> particular setup, does this still run too slow?

Right now, the testcase in comment 1 is still about 20% slower with gfortran
than with ifort. This is, however, with gfortran 4.3.0. Furthermore, it matters
on which CPU you run this (in particular Intel vs. AMD).

To summarize:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz
stepping        : 6

ifort (IFORT) 9.1 20060707
Kernel time   3.812238

gcc version 4.3.0 (GCC)
Kernel time   4.5482836

I'll try to build trunk on this machine and test again, but it might not be
today.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
  2007-03-02  8:38 [Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel jv244 at cam dot ac dot uk
                   ` (6 preceding siblings ...)
  2008-05-10 12:30 ` jv244 at cam dot ac dot uk
@ 2008-05-10 13:44 ` jv244 at cam dot ac dot uk
  2009-02-06 21:34 ` steven at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-05-10 13:44 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from jv244 at cam dot ac dot uk  2008-05-10 13:43 -------
(In reply to comment #7)
> This is, however, with gfortran 4.3.0. 

Trunk is marginally faster than 4.3.0, but still about 20% slower than ifort:
Kernel time   4.5042820


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
  2007-03-02  8:38 [Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel jv244 at cam dot ac dot uk
                   ` (7 preceding siblings ...)
  2008-05-10 13:44 ` jv244 at cam dot ac dot uk
@ 2009-02-06 21:34 ` steven at gcc dot gnu dot org
  2009-02-07  7:50 ` jv244 at cam dot ac dot uk
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-02-06 21:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from steven at gcc dot gnu dot org  2009-02-06 21:34 -------
Confirmed with gcc 4.3.  Where do we stand today?


-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-02-06 21:34:18
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
  2007-03-02  8:38 [Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel jv244 at cam dot ac dot uk
                   ` (8 preceding siblings ...)
  2009-02-06 21:34 ` steven at gcc dot gnu dot org
@ 2009-02-07  7:50 ` jv244 at cam dot ac dot uk
  2009-06-20  9:59 ` jv244 at cam dot ac dot uk
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: jv244 at cam dot ac dot uk @ 2009-02-07  7:50 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from jv244 at cam dot ac dot uk  2009-02-07 07:50 -------
(In reply to comment #9)
> Confirmed with gcc 4.3.  Where do we stand today?

same place:

gfortran -O3 -march=native -ffast-math -ffree-form -ftree-vectorize
gcc version 4.4.0 20090207 (experimental) (GCC)
> ./a.out
 # of primitives       154502
 # computational kernel timings            5
 Kernel time   4.4882798
 Kernel time   4.4922795
 Kernel time   4.4882793

ifort -v
Version 9.1
./a.out
 # of primitives       154502
 # computational kernel timings            5
 Kernel time   3.800237
 Kernel time   3.792237
 Kernel time   3.796237
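
For anyone repeating this comparison, the "Kernel time" lines can be pulled out of the program output and averaged instead of compared by eye. This is only a sketch: the regular expression assumes the output format shown above, and `kernel_times` is a hypothetical helper, not part of the testcase.

```python
import re
from statistics import mean

# Matches lines such as " Kernel time   4.4882798" in the program output.
KERNEL_TIME = re.compile(r"Kernel time\s+([0-9.]+)")


def kernel_times(output: str) -> list[float]:
    """Extract every reported kernel time (in seconds) from a run's output."""
    return [float(m.group(1)) for m in KERNEL_TIME.finditer(output)]


gfortran_output = """
 # of primitives       154502
 # computational kernel timings            5
 Kernel time   4.4882798
 Kernel time   4.4922795
 Kernel time   4.4882793
"""

ifort_output = """
 # of primitives       154502
 # computational kernel timings            5
 Kernel time   3.800237
 Kernel time   3.792237
 Kernel time   3.796237
"""

gfortran_mean = mean(kernel_times(gfortran_output))
ifort_mean = mean(kernel_times(ifort_output))
slowdown = (gfortran_mean / ifort_mean - 1.0) * 100.0
print(f"gfortran {gfortran_mean:.4f}s, ifort {ifort_mean:.4f}s, "
      f"gfortran slower by {slowdown:.1f}%")
```

Averaging the three timings shown for each compiler puts gfortran roughly 18% behind ifort, consistent with "same place".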


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
  2007-03-02  8:38 [Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel jv244 at cam dot ac dot uk
                   ` (9 preceding siblings ...)
  2009-02-07  7:50 ` jv244 at cam dot ac dot uk
@ 2009-06-20  9:59 ` jv244 at cam dot ac dot uk
  2009-06-20 10:46 ` rguenth at gcc dot gnu dot org
  2009-06-20 11:37 ` jv244 at cam dot ac dot uk
  12 siblings, 0 replies; 17+ messages in thread
From: jv244 at cam dot ac dot uk @ 2009-06-20  9:59 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from jv244 at cam dot ac dot uk  2009-06-20 09:59 -------
some more progress with 4.5.0, but not quite there yet:

./a.out
 # of primitives       154502
 # computational kernel timings            5
 Kernel time   4.3522720
 Kernel time   4.3562722
 Kernel time   4.3522720
 Kernel time   4.3522720
 Kernel time   4.3562717


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
  2007-03-02  8:38 [Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel jv244 at cam dot ac dot uk
                   ` (10 preceding siblings ...)
  2009-06-20  9:59 ` jv244 at cam dot ac dot uk
@ 2009-06-20 10:46 ` rguenth at gcc dot gnu dot org
  2009-06-20 11:37 ` jv244 at cam dot ac dot uk
  12 siblings, 0 replies; 17+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-06-20 10:46 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from rguenth at gcc dot gnu dot org  2009-06-20 10:46 -------
Usual things to try are: -fno-tree-pre, -fno-ivopts, -fschedule-insns (on top
of the usual -O3 -ffast-math -funroll-loops setting, of course).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
  2007-03-02  8:38 [Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel jv244 at cam dot ac dot uk
                   ` (11 preceding siblings ...)
  2009-06-20 10:46 ` rguenth at gcc dot gnu dot org
@ 2009-06-20 11:37 ` jv244 at cam dot ac dot uk
  12 siblings, 0 replies; 17+ messages in thread
From: jv244 at cam dot ac dot uk @ 2009-06-20 11:37 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from jv244 at cam dot ac dot uk  2009-06-20 11:37 -------
(In reply to comment #12)
> Usual things to try are: -fno-tree-pre, -fno-ivopts, -fschedule-insns (on top
> of the usual -O3 -ffast-math -funroll-loops setting, of course).

-O3 -march=native -ffast-math -ffree-form -ftree-vectorize: 4.3482709

added on top of the above independently:
-funroll-loops: 4.2682667
-fschedule-insns: 4.3962746
-fno-tree-pre: 4.4682798
-fno-ivopts: 4.8963070
-funroll-loops -fno-ivopts: 4.7722988
-funroll-loops -fschedule-insns: 4.4242764

so best so far is:

-O3 -march=native -ffast-math -ffree-form -ftree-vectorize -funroll-loops:
4.2682667
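
The flag comparison above can be tabulated relative to the baseline. This is a small sketch using only the timings listed in this comment; the variable names are illustrative.

```python
BASE_FLAGS = "-O3 -march=native -ffast-math -ffree-form -ftree-vectorize"
baseline = 4.3482709  # kernel time in seconds with BASE_FLAGS only

# Kernel times with extra flags added on top of BASE_FLAGS.
extra = {
    "-funroll-loops": 4.2682667,
    "-fschedule-insns": 4.3962746,
    "-fno-tree-pre": 4.4682798,
    "-fno-ivopts": 4.8963070,
    "-funroll-loops -fno-ivopts": 4.7722988,
    "-funroll-loops -fschedule-insns": 4.4242764,
}

# Percentage change in kernel time versus the baseline (negative = faster).
relative_change = {
    flags: (seconds / baseline - 1.0) * 100.0 for flags, seconds in extra.items()
}

for flags, seconds in sorted(extra.items(), key=lambda item: item[1]):
    print(f"{flags:32s} {seconds:.7f}  {relative_change[flags]:+.1f}%")

best_flags = min(extra, key=extra.get)
print("best addition:", best_flags)
```

Of the combinations tried, only -funroll-loops improves on the baseline (by a little under 2%), while -fno-ivopts costs roughly 12 to 13%.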


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
       [not found] <bug-31021-4@http.gcc.gnu.org/bugzilla/>
  2013-03-27 11:34 ` rguenth at gcc dot gnu.org
  2013-03-27 11:47 ` Joost.VandeVondele at mat dot ethz.ch
@ 2013-03-29  8:15 ` Joost.VandeVondele at mat dot ethz.ch
  2 siblings, 0 replies; 17+ messages in thread
From: Joost.VandeVondele at mat dot ethz.ch @ 2013-03-29  8:15 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021

Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |37150

--- Comment #16 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2013-03-29 08:15:30 UTC ---
I believe this is actually testing the same kernel (maybe a slightly older
variant) as in PR37150. I would rather revisit this once PR37150 has been
fixed.



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
       [not found] <bug-31021-4@http.gcc.gnu.org/bugzilla/>
  2013-03-27 11:34 ` rguenth at gcc dot gnu.org
@ 2013-03-27 11:47 ` Joost.VandeVondele at mat dot ethz.ch
  2013-03-29  8:15 ` Joost.VandeVondele at mat dot ethz.ch
  2 siblings, 0 replies; 17+ messages in thread
From: Joost.VandeVondele at mat dot ethz.ch @ 2013-03-27 11:47 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021

Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joost.VandeVondele at mat
                   |                            |dot ethz.ch

--- Comment #15 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2013-03-27 11:47:12 UTC ---
New URL:

https://www.dropbox.com/s/g28kdvatrgeu6hm/extracted_collocate.tgz

(contains nearly 2 MB of data needed to run the testcase).

The difference between trunk and ifort has become smaller. I'm now seeing only
a 5% difference (on a different CPU):

3.50946712 vs. 3.354490

In the Makefile, I adjusted the ifort option to use -xHost.



* [Bug rtl-optimization/31021] gfortran 20% slower than ifort on CP2K computational kernel
       [not found] <bug-31021-4@http.gcc.gnu.org/bugzilla/>
@ 2013-03-27 11:34 ` rguenth at gcc dot gnu.org
  2013-03-27 11:47 ` Joost.VandeVondele at mat dot ethz.ch
  2013-03-29  8:15 ` Joost.VandeVondele at mat dot ethz.ch
  2 siblings, 0 replies; 17+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-03-27 11:34 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> 2013-03-27 11:34:28 UTC ---
The testcase is lost; the URL no longer works.  Can you please attach it here?



