* [Bug rtl-optimization/31021] New: gfortran 20% slower than ifort on CP2K computational kernel
@ 2007-03-02 8:38 jv244 at cam dot ac dot uk
2007-03-02 8:39 ` [Bug rtl-optimization/31021] " jv244 at cam dot ac dot uk
` (12 more replies)
0 siblings, 13 replies; 17+ messages in thread
From: jv244 at cam dot ac dot uk @ 2007-03-02 8:38 UTC (permalink / raw)
To: gcc-bugs
I've extracted the computational kernel of CP2K (see PR 29975) for easier
benchmarking. Together with required utility routines to turn it into a
self-contained program and data to test it, I have made it available here:
http://www.pci.unizh.ch/vandevondele/tmp/extracted_collocate.tgz
The summary is that gfortran (yesterday's trunk) is about 20% slower than ifort
(ifort (IFORT) 9.1 20060707) on my machine. To reproduce, untar the archive above
and run (after specifying the relevant FC in the Makefile)
make
make run
A run takes a few seconds and yields
gfortran '-O3 -march=native -ffast-math -ffree-form -ftree-vectorize':
# of primitives 154502
# computational kernel timings 5
Kernel time 4.612288
Kernel time 4.616289
[...]
ifort -xP -O3 -free
# of primitives 154502
# computational kernel timings 5
Kernel time 3.796237
Kernel time 3.800237
[...]
which makes gfortran 21.5% slower in this case. I haven't found any options that
made gfortran much faster (in fact, timings are very insensitive to the options
used), and it is unrelated to any IPO (I actually notice now that ifort is
slightly faster at -O2). Since this might be relevant, timings are on:
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
stepping : 6
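As a quick check of the percentage quoted above, the relative slowdown can be computed from the first pair of kernel timings. This is only an illustrative sketch: the function name `slowdown_percent` is made up for this example, and it assumes "X% slower" means the extra run time relative to the faster compiler.

```python
def slowdown_percent(t_slow: float, t_fast: float) -> float:
    """Return how much longer t_slow is than t_fast, as a percentage of t_fast."""
    return (t_slow / t_fast - 1.0) * 100.0


# First kernel timings quoted in the report (seconds).
gfortran_time = 4.612288
ifort_time = 3.796237

print(f"gfortran is {slowdown_percent(gfortran_time, ifort_time):.1f}% slower than ifort")
```

With these inputs the ratio is about 1.215, which matches the 21.5% stated in the report.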
The computational time is ~80% due to a single routine (collocate_core in
grid_fast.F), which in turn is dominated by the inner loops in the select case
statement, and of those, the one over ig is (should be) dominant. For example,
the loop starting at line 216 of grid_fast.F. If I look at the asm for this
loop (with my best guess of what that loop might be, I have little experience),
my main observation is that it contains 36 mov* instructions with ifort and 51
mov* instructions with gfortran (and the same number of mulsd and addsd), which
could explain the slowdown. I'll attach the respective asm.
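A count like the one above can be reproduced mechanically instead of by eye. The sketch below tallies mnemonics starting with "mov" in an assembly listing. It assumes AT&T-style output with `#` comments, and the sample fragment is hypothetical, written only for illustration; it is not taken from the attached asm files.

```python
from collections import Counter


def count_mnemonics(asm_text: str, prefix: str = "mov") -> Counter:
    """Count instruction mnemonics that start with `prefix` in assembly text."""
    counts = Counter()
    for line in asm_text.splitlines():
        code = line.split("#", 1)[0].strip()  # drop trailing comments
        if not code or code.startswith(".") or code.endswith(":"):
            continue  # skip blank lines, assembler directives, and labels
        mnemonic = code.split()[0]
        if mnemonic.startswith(prefix):
            counts[mnemonic] += 1
    return counts


# Hypothetical loop body, for illustration only.
sample = """
.L5:
        movsd   (%rax), %xmm0
        mulsd   %xmm1, %xmm0
        addsd   %xmm0, %xmm2
        movapd  %xmm2, %xmm3
        movsd   %xmm3, (%rdx)
        jne     .L5
"""

counts = count_mnemonics(sample)
print(sum(counts.values()), dict(counts))
```

Running the same function over the two inner-loop listings would give directly comparable totals, provided the loop boundaries are identified the same way in both files.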
I'm of course happy to try other compile flags for gfortran, and also hints on
how to rewrite the kernels in order to get better performance with gfortran
would be much appreciated.
--
Summary: gfortran 20% slower than ifort on CP2K computational
kernel
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jv244 at cam dot ac dot uk
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
From: jv244 at cam dot ac dot uk @ 2007-03-02 8:39 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from jv244 at cam dot ac dot uk 2007-03-02 08:39 -------
Created an attachment (id=13131)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13131&action=view)
gfortran kernel asm
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
From: jv244 at cam dot ac dot uk @ 2007-03-02 8:40 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from jv244 at cam dot ac dot uk 2007-03-02 08:39 -------
Created an attachment (id=13132)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13132&action=view)
ifort kernel asm
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
From: burnus at gcc dot gnu dot org @ 2007-03-02 9:39 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from burnus at gcc dot gnu dot org 2007-03-02 09:38 -------
On my "AMD Athlon(tm) 64 X2 Dual Core Processor 4800+", gfortran is in x86_64
mode only 13% slower:
gfortran: Kernel time 5.872366, real 0m33.121s; user 0m32.898s; sys 0m0.088s.
Ifort: Kernel time 5.244328, real 0m28.893s, user 0m28.758s, sys 0m0.076s.
Options: "ifort -xP -O3 -xW -free" and "gfortran -O3 -march=native -ffast-math
-ffree-form -ftree-vectorize -funroll-loops".
For grid_fast.F, one difference is which loops are vectorized; ifort vectorizes
the loops in line 44, 469, 483 and 496, gfortran only vectorizes the loops in
line 496 and 469; for the other ones:
grid_fast.F:44: note: not vectorized: complicated access pattern.
DO lz=1,lz_max(lxy)
lxyz=lxyz+1
pyx(1,lxy)=pyx(1,lxy)+pzyx(lxyz)*polz(lxyz,kg)
pyx(2,lxy)=pyx(2,lxy)+pzyx(lxyz)*polz(lxyz,kg2)
ENDDO
grid_fast.F:483: note: not vectorized: can't determine dependence between
(*coef_447)[D.1967_2320] and (*coef_447)[D.1967_2320]
DO icoef=1,coef_max
coef(icoef,1)=coef(icoef,1)+alpha(icoef,lx)*g1
coef(icoef,2)=coef(icoef,2)+alpha(icoef,lx)*g2
coef(icoef,3)=coef(icoef,3)+alpha(icoef,lx)*g1k
coef(icoef,4)=coef(icoef,4)+alpha(icoef,lx)*g2k
ENDDO
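To keep track of which loops the vectorizer skipped, the "not vectorized" notes can be collected into a small table. The following is a minimal sketch that assumes each note sits on a single line in the format quoted above (the second note is shown unwrapped); the names `NOTE_RE` and `missed_loops` are invented for this example.

```python
import re

# Matches lines such as "grid_fast.F:44: note: not vectorized: <reason>".
NOTE_RE = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+): note: not vectorized: (?P<reason>.*)$"
)


def missed_loops(log_text: str) -> dict:
    """Map source line number to the reason given in each 'not vectorized' note."""
    missed = {}
    for raw in log_text.splitlines():
        match = NOTE_RE.match(raw.strip())
        if match:
            missed[int(match.group("line"))] = match.group("reason")
    return missed


log = """\
grid_fast.F:44: note: not vectorized: complicated access pattern.
grid_fast.F:483: note: not vectorized: can't determine dependence between (*coef_447)[D.1967_2320] and (*coef_447)[D.1967_2320]
"""

for line_no, reason in sorted(missed_loops(log).items()):
    print(line_no, reason)
```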
--
burnus at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |burnus at gcc dot gnu dot
| |org
Keywords| |missed-optimization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
From: jv244 at cam dot ac dot uk @ 2007-03-02 9:55 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from jv244 at cam dot ac dot uk 2007-03-02 09:55 -------
(In reply to comment #3)
> On my "AMD Athlon(tm) 64 X2 Dual Core Processor 4800+", gfortran is in x86_64
> mode only 13% slower:
> gfortran: Kernel time 5.872366, real 0m33.121s; user 0m32.898s; sys 0m0.088s.
> Ifort: Kernel time 5.244328, real 0m28.893s, user 0m28.758s, sys 0m0.076s.
> Options: "ifort -xP -O3 -xW -free" and "gfortran -O3 -march=native -ffast-math
> -ffree-form -ftree-vectorize -funroll-loops".
>
> For grid_fast.F, one difference is which loops are vectorized; ifort vectorizes
> the loops in line 44, 469, 483 and 496, gfortran only vectorizes the loops in
> line 496 and 469; for the other ones:
>
> grid_fast.F:44: note: not vectorized: complicated access pattern.
> DO lz=1,lz_max(lxy)
> lxyz=lxyz+1
> pyx(1,lxy)=pyx(1,lxy)+pzyx(lxyz)*polz(lxyz,kg)
> pyx(2,lxy)=pyx(2,lxy)+pzyx(lxyz)*polz(lxyz,kg2)
> ENDDO
this might matter a bit, but this is not in an inner loop, so I don't think it
accounts for a lot of time. Having it vectorized would be good of course.
>
> grid_fast.F:483: note: not vectorized: can't determine dependence between
> (*coef_447)[D.1967_2320] and (*coef_447)[D.1967_2320]
> DO icoef=1,coef_max
> coef(icoef,1)=coef(icoef,1)+alpha(icoef,lx)*g1
> coef(icoef,2)=coef(icoef,2)+alpha(icoef,lx)*g2
> coef(icoef,3)=coef(icoef,3)+alpha(icoef,lx)*g1k
> coef(icoef,4)=coef(icoef,4)+alpha(icoef,lx)*g2k
> ENDDO
>
This part, which is in the default part of the switch statement, should only be
executed in rare cases. I doubt it matters much in the overall timings. Also,
this loop has very short trips (i.e. coef_max should, for the provided input,
be at most 5).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
From: jv244 at cam dot ac dot uk @ 2007-03-02 18:15 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from jv244 at cam dot ac dot uk 2007-03-02 18:15 -------
> >
> > grid_fast.F:483: note: not vectorized: can't determine dependence between
> > (*coef_447)[D.1967_2320] and (*coef_447)[D.1967_2320]
> > DO icoef=1,coef_max
> > coef(icoef,1)=coef(icoef,1)+alpha(icoef,lx)*g1
> > coef(icoef,2)=coef(icoef,2)+alpha(icoef,lx)*g2
> > coef(icoef,3)=coef(icoef,3)+alpha(icoef,lx)*g1k
> > coef(icoef,4)=coef(icoef,4)+alpha(icoef,lx)*g2k
> > ENDDO
> >
>
> This part, which is in the default part of the switch statement, should only be
> executed in rare cases. I doubt it matters much in the overall timings. Also,
> this loop has very short trips (i.e. coef_max should, for the provided input,
> be at most 5).
I verified that the default branch is indeed not called frequently enough for
this to matter. However, by deleting all other cases (equivalent, but
specialized code), I can time that case, and find:
gfortran: 6.636415
ifort: 5.252329
which means gfortran is about 26% slower than ifort for the 'case default' branch.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
From: fxcoudert at gcc dot gnu dot org @ 2008-05-10 12:17 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from fxcoudert at gcc dot gnu dot org 2008-05-10 12:16 -------
With current trunk, I see current mainline gfortran being 5% faster than Intel
10.0 on a Dual-Core AMD Opteron(tm) Processor 2212 at 2GHz. Joost, on your
particular setup, does this still run too slow?
--
fxcoudert at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |fxcoudert at gcc dot gnu dot
| |org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
From: jv244 at cam dot ac dot uk @ 2008-05-10 12:30 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from jv244 at cam dot ac dot uk 2008-05-10 12:30 -------
(In reply to comment #6)
> With current trunk, I see current mainline gfortran being 5% faster than Intel
> 10.0 on a Dual-Core AMD Opteron(tm) Processor 2212 at 2GHz. Joost, on your
> particular setup, does this still run too slow?
Right now, for the testcase in comment 1, gfortran is still 20% slower than ifort.
This is, however, with gfortran 4.3.0. Furthermore, it matters on which CPU you
run this (in particular Intel vs. AMD).
To summarize:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
stepping : 6
ifort (IFORT) 9.1 20060707
Kernel time 3.812238
gcc version 4.3.0 (GCC)
Kernel time 4.5482836
I'll try to build trunk on this machine and test again, but it might not be for
today.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
From: jv244 at cam dot ac dot uk @ 2008-05-10 13:44 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from jv244 at cam dot ac dot uk 2008-05-10 13:43 -------
(In reply to comment #7)
> This is, however, with gfortran 4.3.0.
Trunk is marginally faster than 4.3.0, but still about 20% slower than ifort:
Kernel time 4.5042820
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
From: steven at gcc dot gnu dot org @ 2009-02-06 21:34 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from steven at gcc dot gnu dot org 2009-02-06 21:34 -------
Confirmed with gcc 4.3. Where do we stand today?
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Last reconfirmed|0000-00-00 00:00:00 |2009-02-06 21:34:18
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
From: jv244 at cam dot ac dot uk @ 2009-02-07 7:50 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from jv244 at cam dot ac dot uk 2009-02-07 07:50 -------
(In reply to comment #9)
> Confirmed with gcc 4.3. Where do we stand today?
same place:
gfortran -O3 -march=native -ffast-math -ffree-form -ftree-vectorize
gcc version 4.4.0 20090207 (experimental) (GCC)
> ./a.out
# of primitives 154502
# computational kernel timings 5
Kernel time 4.4882798
Kernel time 4.4922795
Kernel time 4.4882793
ifort -v
Version 9.1
./a.out
# of primitives 154502
# computational kernel timings 5
Kernel time 3.800237
Kernel time 3.792237
Kernel time 3.796237
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
From: jv244 at cam dot ac dot uk @ 2009-06-20 9:59 UTC (permalink / raw)
To: gcc-bugs
------- Comment #11 from jv244 at cam dot ac dot uk 2009-06-20 09:59 -------
some more progress with 4.5.0, but not quite there yet:
./a.out
# of primitives 154502
# computational kernel timings 5
Kernel time 4.3522720
Kernel time 4.3562722
Kernel time 4.3522720
Kernel time 4.3522720
Kernel time 4.3562717
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
From: rguenth at gcc dot gnu dot org @ 2009-06-20 10:46 UTC (permalink / raw)
To: gcc-bugs
------- Comment #12 from rguenth at gcc dot gnu dot org 2009-06-20 10:46 -------
Usual things to try are: -fno-tree-pre, -fno-ivopts, -fschedule-insns (on top
of the usual -O3 -ffast-math -funroll-loops setting, of course).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
From: jv244 at cam dot ac dot uk @ 2009-06-20 11:37 UTC (permalink / raw)
To: gcc-bugs
------- Comment #13 from jv244 at cam dot ac dot uk 2009-06-20 11:37 -------
(In reply to comment #12)
> Usual things to try are: -fno-tree-pre, -fno-ivopts, -fschedule-insns (on top
> of the usual -O3 -ffast-math -funroll-loops setting, of course).
-O3 -march=native -ffast-math -ffree-form -ftree-vectorize: 4.3482709
added on top of the above independently:
-funroll-loops: 4.2682667
-fschedule-insns: 4.3962746
-fno-tree-pre: 4.4682798
-fno-ivopts: 4.8963070
-funroll-loops -fno-ivopts: 4.7722988
-funroll-loops -fschedule-insns: 4.4242764
so best so far is:
-O3 -march=native -ffast-math -ffree-form -ftree-vectorize -funroll-loops:
4.2682667
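The flag comparison above can be ranked against the baseline in a few lines. This is a sketch using only the timings listed in this comment; the helper name `relative_change` is made up for illustration, and negative values mean a faster kernel than the baseline flag set.

```python
BASE_FLAGS = "-O3 -march=native -ffast-math -ffree-form -ftree-vectorize"
BASE_TIME = 4.3482709  # kernel time (seconds) with BASE_FLAGS only

# Kernel times with each extra flag set added on top of BASE_FLAGS.
timings = {
    "-funroll-loops": 4.2682667,
    "-fschedule-insns": 4.3962746,
    "-fno-tree-pre": 4.4682798,
    "-fno-ivopts": 4.8963070,
    "-funroll-loops -fno-ivopts": 4.7722988,
    "-funroll-loops -fschedule-insns": 4.4242764,
}


def relative_change(t: float, base: float = BASE_TIME) -> float:
    """Percentage change in kernel time relative to the baseline (negative = faster)."""
    return (t / base - 1.0) * 100.0


for flags, t in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{flags:<34} {t:.7f}  {relative_change(t):+6.2f}%")

best = min(timings, key=timings.get)
print("best:", BASE_FLAGS, best)
```

By this measure only -funroll-loops improves on the baseline; every other combination listed is slower, with -fno-ivopts costing the most.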
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
From: Joost.VandeVondele at mat dot ethz.ch @ 2013-03-29 8:15 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:
What |Removed |Added
----------------------------------------------------------------------------
Depends on| |37150
--- Comment #16 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2013-03-29 08:15:30 UTC ---
I believe this is actually testing the same kernel (maybe a slightly older
variant) as in PR37150. I would rather revisit this once PR37150 has been
fixed.
From: Joost.VandeVondele at mat dot ethz.ch @ 2013-03-27 11:47 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |Joost.VandeVondele at mat
| |dot ethz.ch
--- Comment #15 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2013-03-27 11:47:12 UTC ---
New URL:
https://www.dropbox.com/s/g28kdvatrgeu6hm/extracted_collocate.tgz
(contains nearly 2 MB of data needed to run the testcase).
The difference between trunk and ifort has become smaller. I'm now seeing only
a 5% difference (on a different CPU):
3.50946712 vs. 3.354490
I adjusted in the Makefile the ifort option to use -xHost.
From: rguenth at gcc dot gnu.org @ 2013-03-27 11:34 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31021
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |WAITING
--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> 2013-03-27 11:34:28 UTC ---
The testcase is lost; the URL no longer works. Can you please attach it here?