public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/38846] New: [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron)
@ 2009-01-14 18:26 burnus at gcc dot gnu dot org
2009-01-14 18:42 ` Sebastian Pop
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: burnus at gcc dot gnu dot org @ 2009-01-14 18:26 UTC (permalink / raw)
To: gcc-bugs
This is with gfortran 4.4.0 20090114 [trunk revision 143364] and the Polyhedron
test suite, http://www.polyhedron.co.uk/MFL6VW74649
on AMD Athlon(tm) 64 X2 Dual Core Processor 4800+ running openSUSE Factory
x86-64.
First the good news: No ICE and no result-checking failures.
The geometric average shows 4% longer runtime for -floop*, which is dominated
by the 70% slower gas_dyn. Other programs are faster such as capacita (by 6%)
or slower such as test_fpu (by 13%). [I picked the extrema; mostly the changes
seem to be only slightly above noise with a slight tendency to better
performance.]
I was using the following options, which yielded the stated run time for
gas_dyn:
gfortran -march=opteron -ffast-math -funroll-loops -ftree-vectorize
-ftree-loop-linear -msse3 -O3
-> Runtime = 11.73 seconds
gfortran -march=opteron -floop-interchange -floop-strip-mine -floop-block
-ffast-math -funroll-loops -ftree-vectorize -ftree-loop-linear -msse3 -O3
-> Runtime = 19.95033 seconds
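The 70% figure in the summary follows directly from these two run times. A minimal sketch of the arithmetic in Python (the function name `slowdown_percent` is only illustrative; the two numbers are the ones quoted above):

```python
def slowdown_percent(baseline_s: float, candidate_s: float) -> float:
    """Return how much longer candidate_s is than baseline_s, in percent."""
    if baseline_s <= 0:
        raise ValueError("baseline run time must be positive")
    return (candidate_s / baseline_s - 1.0) * 100.0


# gas_dyn run times quoted in the report, in seconds.
without_graphite = 11.73
with_loop_flags = 19.95033

print(f"{slowdown_percent(without_graphite, with_loop_flags):.1f}% slower")
```

A positive result means the second build is slower; a negative one means it is faster.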
--
Summary: [Graphite] 70% slower using -floop* than without
graphite (gas_dyn of Polyhedron)
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: burnus at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38846
* [Bug middle-end/38846] [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron)
From: sebpop at gmail dot com @ 2009-01-14 18:42 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from sebpop at gmail dot com 2009-01-14 18:42 -------
Subject: Re: New: [Graphite] 70% slower using -floop* than without graphite
(gas_dyn of Polyhedron)
Hi,
Thanks for this report. Please also test with the code of graphite
branch that contains a
patch that schedules several scalar optimizations that can improve the
quality of the code generated.
Thanks,
Sebastian Pop
--
AMD - GNU Tools
* [Bug middle-end/38846] [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron)
From: burnus at gcc dot gnu dot org @ 2009-01-14 18:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from burnus at gcc dot gnu dot org 2009-01-14 18:45 -------
The culprit is -floop-block (which is already enabled by default in the
graphite branch with -O2). Using only -floop-interchange -floop-strip-mine I
get a run time in between (16.5s, single run). Using only -fgraphite-identity I
also get ~16.5s.
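For readers unfamiliar with the three flags, here is a conceptual sketch of the loop shapes that interchange, strip-mining and blocking (tiling) produce. It is written in Python for brevity, uses an arbitrary strip/tile size of 4, and only illustrates the transformations in general. It is not how Graphite implements them or decides when to apply them.

```python
def reference_sum(a):
    """Original loop nest: i is the outer loop, j the inner loop."""
    total = 0
    for i in range(len(a)):
        for j in range(len(a[0])):
            total += a[i][j]
    return total


def interchanged_sum(a):
    """Loop interchange: the two loops swap places (j outer, i inner)."""
    total = 0
    for j in range(len(a[0])):
        for i in range(len(a)):
            total += a[i][j]
    return total


def strip_mined_sum(a, strip=4):
    """Strip-mining: the j loop is split into strips of `strip` iterations."""
    total = 0
    cols = len(a[0])
    for i in range(len(a)):
        for jj in range(0, cols, strip):
            for j in range(jj, min(jj + strip, cols)):
                total += a[i][j]
    return total


def blocked_sum(a, tile=4):
    """Blocking: both loops are strip-mined and the strip loops moved outward."""
    total = 0
    rows, cols = len(a), len(a[0])
    for ii in range(0, rows, tile):
        for jj in range(0, cols, tile):
            for i in range(ii, min(ii + tile, rows)):
                for j in range(jj, min(jj + tile, cols)):
                    total += a[i][j]
    return total


# Small integer matrix so that all variants must agree exactly.
matrix = [[10 * i + j for j in range(7)] for i in range(5)]
print(reference_sum(matrix), interchanged_sum(matrix),
      strip_mined_sum(matrix), blocked_sum(matrix))
```

All four functions visit the same elements and differ only in iteration order, which is why such transformations can change cache behaviour and run time without changing the result.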
* [Bug middle-end/38846] [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron)
From: burnus at gcc dot gnu dot org @ 2009-01-14 21:16 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from burnus at gcc dot gnu dot org 2009-01-14 21:16 -------
> Thanks for this report. Please also test with the code of graphite
> branch that contains a patch that schedules several scalar optimizations
> that can improve the quality of the code generated.
For the geometric mean I see a 1% improvement over trunk+graphite. gas_dyn is
now 18.70s. However, capacita is slower on the branch (only 4% instead of 6%
faster than the trunk w/o graphite); channel is 2% slower on the branch than
trunk's graphite. Except for these two, the branch is faster. (Though the load
from other programs might also have been a bit lower than during the trunk run
[w/ and w/o -floop*].)
Thanks for your Graphite work!
* [Bug middle-end/38846] [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron)
From: dominiq at lps dot ens dot fr @ 2009-01-17 18:12 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from dominiq at lps dot ens dot fr 2009-01-17 18:12 -------
I see results similar to comment #0 on i686-apple-darwin9 (Core2), trunk
revision 143468:
================================================================================
Date & Time : 17 Jan 2009 17:41:32
Test Name : pbharness
Compile Command : gfc %n.f90 -m64 -O3 -ffast-math -funroll-loops -fgraphite
-fgraphite-identity -floop-block -floop-strip-mine -floop-interchange
-ftree-loop-linear -fomit-frame-pointer -finline-limit=600 --param
min-vect-loop-bound=2 -o %n
Benchmarks : ac aermod air capacita channel doduc fatigue gas_dyn induct
linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times : 300.0
Target Error % : 0.200
Minimum Repeats : 2
Maximum Repeats : 5
Benchmark  Compile  Executable  Ave Run   Number   Estim
Name        (secs)     (bytes)   (secs)  Repeats   Err %
---------  -------  ----------  -------  -------  ------
ac            4.86       42560    12.31        5  0.2625
aermod       87.72     1270544    30.36        5  0.2338
air           5.73       77336     8.38        5  0.0536
capacita      4.13       72760    45.60        2  0.0055
channel       1.69       30456     2.71        2  0.0368
doduc        11.71      200024    42.88        2  0.0501
fatigue       4.26       76736    12.91        2  0.0852
gas_dyn       5.83      692200    22.24        5  0.4693
induct       10.17      177072    34.38        2  0.1440
linpk         1.67       42536    28.21        5  0.3051
mdbx          3.43       73000    14.79        2  0.0068
nf           14.82      112264    32.25        2  0.1612
protein       9.92      114136    45.90        2  0.1961
rnflow       11.24      171464    37.49        2  0.0960
test_fpu      9.49      154224    13.06        2  0.1263
tfft          1.15       26432     2.88        5  0.2609
Geometric Mean Execution Time = 18.18 seconds
================================================================================
...
Finished Testing 16 benchmarks - 16 passed, and 0 failed
compared to
================================================================================
Date & Time : 17 Jan 2009 18:03:59
Test Name : pbharness
Compile Command : gfc %n.f90 -m64 -O3 -ffast-math -funroll-loops
-ftree-loop-linear -fomit-frame-pointer -finline-limit=600 --param
min-vect-loop-bound=2 -o %n
Benchmarks : ac aermod air capacita channel doduc fatigue gas_dyn induct
linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times : 300.0
Target Error % : 0.200
Minimum Repeats : 2
Maximum Repeats : 5
Benchmark  Compile  Executable  Ave Run   Number   Estim
Name        (secs)     (bytes)   (secs)  Repeats   Err %
---------  -------  ----------  -------  -------  ------
ac            2.38       42560    12.33        5  0.3327
aermod       86.86     1270544    29.86        2  0.0151
air           5.53       77336     8.39        5  0.2713
capacita      3.40       72760    55.49        5  0.5426
channel       1.98       38648     2.27        2  0.0000
doduc        11.42      200024    42.93        2  0.1456
fatigue       4.94       89024    10.83        5  0.2533
gas_dyn       6.61      708584    10.38        2  0.1541
induct        9.95      181168    34.41        2  0.0727
linpk         1.50       42536    27.98        2  0.0804
mdbx          3.30       73000    14.81        2  0.0911
nf           24.30      161416    32.06        4  0.1922
protein      10.54      126424    46.18        2  0.1646
rnflow       10.93      179616    36.00        2  0.0014
test_fpu     10.26      166512    12.45        2  0.0723
tfft          1.10       26432     2.86        5  0.2793
Geometric Mean Execution Time = 17.05 seconds
================================================================================
The 70% for gas_dyn turns out to be more than a factor of 2 here, and capacita
is faster by almost 20% with -floop-block.
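The numbers in that last sentence, and the two geometric means, can be re-derived from the "Ave Run" columns of the two tables. A minimal sketch in Python; the dictionary and function names are illustrative, and only the run times are taken from the tables above:

```python
import math

# "Ave Run (secs)" with the Graphite flags enabled (first table).
graphite = {
    "ac": 12.31, "aermod": 30.36, "air": 8.38, "capacita": 45.60,
    "channel": 2.71, "doduc": 42.88, "fatigue": 12.91, "gas_dyn": 22.24,
    "induct": 34.38, "linpk": 28.21, "mdbx": 14.79, "nf": 32.25,
    "protein": 45.90, "rnflow": 37.49, "test_fpu": 13.06, "tfft": 2.88,
}

# "Ave Run (secs)" without the Graphite flags (second table).
baseline = {
    "ac": 12.33, "aermod": 29.86, "air": 8.39, "capacita": 55.49,
    "channel": 2.27, "doduc": 42.93, "fatigue": 10.83, "gas_dyn": 10.38,
    "induct": 34.41, "linpk": 27.98, "mdbx": 14.81, "nf": 32.06,
    "protein": 46.18, "rnflow": 36.00, "test_fpu": 12.45, "tfft": 2.86,
}


def geometric_mean(values):
    """Geometric mean computed via logarithms to avoid a huge product."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


gas_dyn_ratio = graphite["gas_dyn"] / baseline["gas_dyn"]
capacita_gain = (1.0 - graphite["capacita"] / baseline["capacita"]) * 100.0

print(f"geometric mean, graphite flags: {geometric_mean(graphite.values()):.2f} s")
print(f"geometric mean, baseline:       {geometric_mean(baseline.values()):.2f} s")
print(f"gas_dyn slowdown factor:        {gas_dyn_ratio:.2f}")
print(f"capacita improvement:           {capacita_gain:.1f}%")
```

Because the geometric mean averages ratios rather than absolute times, one large outlier such as gas_dyn moves it far less than it would move an arithmetic mean.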
* [Bug middle-end/38846] [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron)
From: burnus at gcc dot gnu dot org @ 2009-04-24 21:38 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from burnus at gcc dot gnu dot org 2009-04-24 21:38 -------
Current gas_dyn runtime:
16.161 - 4.5, graphite (options, see comment 0)
13.122 - 4.5, no graphite
-> Now 20% slower
16.399 - 4.4, graphite
(Why the run time is longer than in January, I don't understand; the computer
is the same.)
* [Bug middle-end/38846] [Graphite] 35% slower using -floop* than without graphite (gas_dyn of Polyhedron)
From: burnus at gcc dot gnu dot org @ 2010-02-23 15:29 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from burnus at gcc dot gnu dot org 2010-02-23 15:29 -------
Result with current trunk (4.5.0 2010-02-23 Rev. 156999)
In a nutshell: gas_dyn is still slower - now 35% instead of 70%.
-fgraphite-identity has normal speed (or a tiny bit faster?!?), but all other
options (-floop-interchange, -floop-strip-mine, and -floop-block) cause the
slowdown. nf is 6% slower and the rest looks fine.
* * *
System: AMD Athlon 64 X2 Dual Core Processor 4800+ @ 2.4 GHz
Base options: gfortran -march=opteron -ffast-math -funroll-loops
-ftree-vectorize -ftree-loop-linear -msse3 -O3
LTO uses additionally the options "-flto -fwhole-program -fno-protect-parens"
Used graphite options: "-floop-interchange -floop-strip-mine -floop-block"
          LTO                              No LTO
ac        1% faster  (13.16s vs. 13.31s)   =          (13.29s vs. 13.35s)
aermod    =          (30.69s vs. 30.87s)   =          (34.32s vs. 34.58s)
air       =          (15.64s vs. 15.68s)   2% faster  (15.72s vs. 15.68s)
capacita  6% SLOWER  (86.92s vs. 82.14s)   2% faster  (81.66s vs. 82.92s)
channel   =          (15.36s vs. 15.28s)   3% faster  (15.26s vs. 15.71s)
doduc     5% faster  (40.97s vs. 43.05s)   5% faster  (40.28s vs. 42.51s)
fatigue   2% SLOWER  ( 7.21s vs.  7.08s)   4% SLOWER  ( 9.99s vs.  9.57s)
gas_dyn   35% SLOWER (15.35s vs. 11.36s)   37% SLOWER (15.36s vs. 11.19s)
induct    14% faster (29.15s vs. 33.90s)   24% faster (28.07s vs. 37.08s)
linpk     1% faster  (30.34s vs. 30.68s)   2% faster  (30.40s vs. 31.03s)
mdbx      2% faster  (20.15s vs. 20.56s)   =          (19.40s vs. 19.48s)
nf        6% SLOWER  (33.49s vs. 31.49s)   6% SLOWER  (33.61s vs. 31.62s)
protein   =          (64.51s vs. 64.24s)   3% SLOWER  (65.65s vs. 63.84s)
rnflow    2% faster  (36.07s vs. 36.82s)   =          (35.10s vs. 34.96s)
test_fpu  =          (21.93s vs. 21.85s)   2% faster  (20.76s vs. 21.28s)
tfft      2% faster  ( 8.25s vs.  8.43s)   1% faster  ( 8.22s vs.  8.33s)
Geo.Mean  1% SLOWER  (23.66s vs. 23.42s)   =          (24.00s vs. 23.99s)
* * *
gas_dyn.f90 only results:
A) w/o Graphite
real 0m11.281s user 0m11.013s sys 0m0.044s
B) w/ -fgraphite-identity
real 0m10.622s user 0m10.533s sys 0m0.080s
C) w/ -floop-interchange
real 0m15.077s user 0m14.785s sys 0m0.068s
D) w/ -floop-strip-mine
real 0m15.818s user 0m15.205s sys 0m0.052s
E) w/ -floop-block
real 0m15.349s user 0m15.249s sys 0m0.080s
F) w/ -floop-interchange -floop-strip-mine
real 0m15.740s user 0m15.589s sys 0m0.044s
G) w/ -floop-interchange -floop-strip-mine -floop-block
real 0m15.658s user 0m15.333s sys 0m0.040s
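The relative cost of each flag can be read off the `real` times above. A small sketch in Python, assuming the `XmY.YYYs` format shown in the listing; the helper name `parse_time` and the labels are hypothetical, and only the timings come from the listing:

```python
import re


def parse_time(text: str) -> float:
    """Convert a value such as '0m11.281s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# "real" times for gas_dyn.f90, keyed by the case letters used above.
real_times = {
    "A (no Graphite)": "0m11.281s",
    "B (-fgraphite-identity)": "0m10.622s",
    "C (-floop-interchange)": "0m15.077s",
    "D (-floop-strip-mine)": "0m15.818s",
    "E (-floop-block)": "0m15.349s",
    "F (interchange + strip-mine)": "0m15.740s",
    "G (all three)": "0m15.658s",
}

baseline = parse_time(real_times["A (no Graphite)"])
for label, raw in real_times.items():
    # Positive values are slower than case A, negative values are faster.
    change = (parse_time(raw) / baseline - 1.0) * 100.0
    print(f"{label:30s} {change:+6.1f}%")
```

These are single runs, so differences of a few percent between cases C to G should be treated as noise.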
--
burnus at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[Graphite] 70% slower using |[Graphite] 35% slower using
|-floop* than without |-floop* than without
|graphite (gas_dyn of |graphite (gas_dyn of
|Polyhedron) |Polyhedron)