public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/38846]  New: [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron)
@ 2009-01-14 18:26 burnus at gcc dot gnu dot org
  2009-01-14 18:42 ` Sebastian Pop
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: burnus at gcc dot gnu dot org @ 2009-01-14 18:26 UTC (permalink / raw)
  To: gcc-bugs

This is with gfortran 4.4.0 20090114 [trunk revision 143364] and the Polyhedron
test suite, http://www.polyhedron.co.uk/MFL6VW74649
on AMD Athlon(tm) 64 X2 Dual Core Processor 4800+  running openSUSE Factory
x86-64.

First the good news: No ICE and no result-checking failures.

The geometric average shows a 4% longer runtime for -floop*, which is dominated
by the 70% slower gas_dyn. Other programs are faster, such as capacita (by 6%),
or slower, such as test_fpu (by 13%). [I picked the extrema; most changes
seem to be only slightly above noise, with a slight tendency towards better
performance.]

I was using the following options, which yielded the stated run time for
gas_dyn:

gfortran -march=opteron -ffast-math -funroll-loops -ftree-vectorize
-ftree-loop-linear -msse3 -O3
-> Runtime = 11.73 seconds

gfortran -march=opteron  -floop-interchange -floop-strip-mine -floop-block
-ffast-math -funroll-loops -ftree-vectorize -ftree-loop-linear -msse3 -O3
-> Runtime = 19.95033 seconds
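
As a quick check of the "70% slower" figure, the two run times above can be turned into a relative slowdown. This is only an illustrative sketch; the helper name `percent_slower` is made up for this example and is not part of any benchmark tooling.

```python
def percent_slower(with_flags: float, baseline: float) -> float:
    """Relative increase in run time, in percent (positive means slower)."""
    return (with_flags / baseline - 1.0) * 100.0


# Run times in seconds, copied from the two gfortran invocations above.
baseline_s = 11.73      # without the -floop* options
graphite_s = 19.95033   # with -floop-interchange -floop-strip-mine -floop-block

slowdown = percent_slower(graphite_s, baseline_s)
print(f"gas_dyn: {slowdown:.1f}% slower with -floop*")
```

The same helper yields a negative value when the flagged build is faster, which is how a case like capacita would show up.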


-- 
           Summary: [Graphite] 70% slower using -floop* than without
                    graphite (gas_dyn of Polyhedron)
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: burnus at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38846



* [Bug middle-end/38846] [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron)
  2009-01-14 18:26 [Bug middle-end/38846] New: [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron) burnus at gcc dot gnu dot org
  2009-01-14 18:42 ` Sebastian Pop
@ 2009-01-14 18:42 ` sebpop at gmail dot com
  2009-01-14 18:45 ` burnus at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: sebpop at gmail dot com @ 2009-01-14 18:42 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from sebpop at gmail dot com  2009-01-14 18:42 -------
Subject: Re:  New: [Graphite] 70% slower using -floop* than without graphite
(gas_dyn of Polyhedron)

Hi,

Thanks for this report.  Please also test with the code of graphite
branch that contains a patch that schedules several scalar optimizations
that can improve the quality of the code generated.

Thanks,
Sebastian Pop
--
AMD - GNU Tools



* [Bug middle-end/38846] [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron)
  2009-01-14 18:26 [Bug middle-end/38846] New: [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron) burnus at gcc dot gnu dot org
  2009-01-14 18:42 ` Sebastian Pop
  2009-01-14 18:42 ` [Bug middle-end/38846] " sebpop at gmail dot com
@ 2009-01-14 18:45 ` burnus at gcc dot gnu dot org
  2009-01-14 21:16 ` burnus at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: burnus at gcc dot gnu dot org @ 2009-01-14 18:45 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from burnus at gcc dot gnu dot org  2009-01-14 18:45 -------
The culprit is -floop-block (which is already enabled by default in the
graphite branch with -O2). Using only -floop-interchange -floop-strip-mine I
get a run time in between (16.5s, single run). Using only -fgraphite-identity I
also get ~16.5s.
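
Isolating a culprit flag like this amounts to compiling the same source with different subsets of the Graphite options. The sketch below only builds the command lines; it does not run them. The base options are those from comment #0, the source name gas_dyn.f90 is taken from later in this thread, and the output name and the `build_command` helper are assumptions made for illustration.

```python
import shlex

# Base options as quoted in comment #0.
BASE = ("-march=opteron -ffast-math -funroll-loops -ftree-vectorize "
        "-ftree-loop-linear -msse3 -O3")

# Flag subsets discussed in this comment; the labels are arbitrary.
VARIANTS = {
    "baseline": [],
    "identity": ["-fgraphite-identity"],
    "interchange+strip-mine": ["-floop-interchange", "-floop-strip-mine"],
    "all": ["-floop-interchange", "-floop-strip-mine", "-floop-block"],
}


def build_command(extra_flags, source="gas_dyn.f90", output="gas_dyn"):
    """Return the gfortran argv for one variant (output name is hypothetical)."""
    return ["gfortran", *shlex.split(BASE), *extra_flags, source, "-o", output]


for name, flags in VARIANTS.items():
    print(f"{name:24s}", " ".join(build_command(flags)))
```

Each printed line can be run and timed by hand; keeping the base options in one place makes sure that only the Graphite flags differ between variants.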



* [Bug middle-end/38846] [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron)
  2009-01-14 18:26 [Bug middle-end/38846] New: [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron) burnus at gcc dot gnu dot org
                   ` (2 preceding siblings ...)
  2009-01-14 18:45 ` burnus at gcc dot gnu dot org
@ 2009-01-14 21:16 ` burnus at gcc dot gnu dot org
  2009-01-17 18:12 ` dominiq at lps dot ens dot fr
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: burnus at gcc dot gnu dot org @ 2009-01-14 21:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from burnus at gcc dot gnu dot org  2009-01-14 21:16 -------
> Thanks for this report.  Please also test with the code of graphite
> branch that contains a patch that schedules several scalar optimizations
> that can improve the quality of the code generated.

For the geometric mean I see a 1% improvement over trunk+graphite. gas_dyn is
now at 18.70s. However, capacita is slower on the branch (only 4% instead of 6%
faster than the trunk w/o graphite), and channel is 2% slower on the branch
than with trunk's graphite. Except for these two, the branch is faster. (Though
the machine's load from other programs might also have been a bit lower than
during the trunk runs [w/ and w/o -floop*].)

Thanks for your Graphite work!



* [Bug middle-end/38846] [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron)
  2009-01-14 18:26 [Bug middle-end/38846] New: [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron) burnus at gcc dot gnu dot org
                   ` (3 preceding siblings ...)
  2009-01-14 21:16 ` burnus at gcc dot gnu dot org
@ 2009-01-17 18:12 ` dominiq at lps dot ens dot fr
  2009-04-24 21:38 ` burnus at gcc dot gnu dot org
  2010-02-23 15:29 ` [Bug middle-end/38846] [Graphite] 35% " burnus at gcc dot gnu dot org
  6 siblings, 0 replies; 8+ messages in thread
From: dominiq at lps dot ens dot fr @ 2009-01-17 18:12 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from dominiq at lps dot ens dot fr  2009-01-17 18:12 -------
I see results similar to comment #0 on i686-apple-darwin9 (Core2), trunk
revision 143468:

================================================================================
Date & Time     : 17 Jan 2009 17:41:32
Test Name       : pbharness
Compile Command : gfc %n.f90 -m64 -O3 -ffast-math -funroll-loops -fgraphite
-fgraphite-identity -floop-block -floop-strip-mine -floop-interchange
-ftree-loop-linear -fomit-frame-pointer -finline-limit=600 --param
min-vect-loop-bound=2 -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct
linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   :      300.0
Target Error %  :      0.200
Minimum Repeats :     2
Maximum Repeats :     5

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   ------- -------  ------
          ac      4.86       42560     12.31       5  0.2625
      aermod     87.72     1270544     30.36       5  0.2338
         air      5.73       77336      8.38       5  0.0536
    capacita      4.13       72760     45.60       2  0.0055
     channel      1.69       30456      2.71       2  0.0368
       doduc     11.71      200024     42.88       2  0.0501
     fatigue      4.26       76736     12.91       2  0.0852
     gas_dyn      5.83      692200     22.24       5  0.4693
      induct     10.17      177072     34.38       2  0.1440
       linpk      1.67       42536     28.21       5  0.3051
        mdbx      3.43       73000     14.79       2  0.0068
          nf     14.82      112264     32.25       2  0.1612
     protein      9.92      114136     45.90       2  0.1961
      rnflow     11.24      171464     37.49       2  0.0960
    test_fpu      9.49      154224     13.06       2  0.1263
        tfft      1.15       26432      2.88       5  0.2609

Geometric Mean Execution Time =      18.18 seconds

================================================================================
...
Finished Testing  16 benchmarks -  16 passed, and   0 failed

compared to

================================================================================
Date & Time     : 17 Jan 2009 18:03:59
Test Name       : pbharness
Compile Command : gfc %n.f90 -m64 -O3 -ffast-math -funroll-loops
-ftree-loop-linear -fomit-frame-pointer -finline-limit=600 --param
min-vect-loop-bound=2 -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct
linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   :      300.0
Target Error %  :      0.200
Minimum Repeats :     2
Maximum Repeats :     5

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   ------- -------  ------
          ac      2.38       42560     12.33       5  0.3327
      aermod     86.86     1270544     29.86       2  0.0151
         air      5.53       77336      8.39       5  0.2713
    capacita      3.40       72760     55.49       5  0.5426
     channel      1.98       38648      2.27       2  0.0000
       doduc     11.42      200024     42.93       2  0.1456
     fatigue      4.94       89024     10.83       5  0.2533
     gas_dyn      6.61      708584     10.38       2  0.1541
      induct      9.95      181168     34.41       2  0.0727
       linpk      1.50       42536     27.98       2  0.0804
        mdbx      3.30       73000     14.81       2  0.0911
          nf     24.30      161416     32.06       4  0.1922
     protein     10.54      126424     46.18       2  0.1646
      rnflow     10.93      179616     36.00       2  0.0014
    test_fpu     10.26      166512     12.45       2  0.0723
        tfft      1.10       26432      2.86       5  0.2793

Geometric Mean Execution Time =      17.05 seconds

================================================================================

The 70% for gas_dyn turns out to be more than a factor of 2 here, and capacita
is faster by almost 20% with -floop-block.
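
The geometric means reported by the harness can be recomputed from the "Ave Run" columns of the two tables above. This is a small sketch using Python's standard library, not the harness's own code; the variable names are chosen for this example only.

```python
from statistics import geometric_mean

# "Ave Run (secs)" columns, in benchmark order ac ... tfft.
with_graphite = [12.31, 30.36, 8.38, 45.60, 2.71, 42.88, 12.91, 22.24,
                 34.38, 28.21, 14.79, 32.25, 45.90, 37.49, 13.06, 2.88]
without_graphite = [12.33, 29.86, 8.39, 55.49, 2.27, 42.93, 10.83, 10.38,
                    34.41, 27.98, 14.81, 32.06, 46.18, 36.00, 12.45, 2.86]

gm_graphite = geometric_mean(with_graphite)
gm_baseline = geometric_mean(without_graphite)

print(f"with Graphite flags:    {gm_graphite:.2f} s")
print(f"without Graphite flags: {gm_baseline:.2f} s")
print(f"ratio:                  {gm_graphite / gm_baseline:.3f}")
```

Because the geometric mean multiplies the individual ratios, a single outlier such as gas_dyn shifts the overall figure much less than its own factor of two would suggest.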



* [Bug middle-end/38846] [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron)
  2009-01-14 18:26 [Bug middle-end/38846] New: [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron) burnus at gcc dot gnu dot org
                   ` (4 preceding siblings ...)
  2009-01-17 18:12 ` dominiq at lps dot ens dot fr
@ 2009-04-24 21:38 ` burnus at gcc dot gnu dot org
  2010-02-23 15:29 ` [Bug middle-end/38846] [Graphite] 35% " burnus at gcc dot gnu dot org
  6 siblings, 0 replies; 8+ messages in thread
From: burnus at gcc dot gnu dot org @ 2009-04-24 21:38 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from burnus at gcc dot gnu dot org  2009-04-24 21:38 -------
Current gas_dyn run times in seconds (GCC version, configuration):
  16.161 - 4.5, graphite (options, see comment 0)
  13.122 - 4.5, no graphite
-> Now roughly 20% slower (16.161/13.122 = 1.23)
  16.399 - 4.4, graphite
(I don't understand why the run time is longer than in January; the computer
is the same.)



* [Bug middle-end/38846] [Graphite] 35% slower using -floop* than without graphite (gas_dyn of Polyhedron)
  2009-01-14 18:26 [Bug middle-end/38846] New: [Graphite] 70% slower using -floop* than without graphite (gas_dyn of Polyhedron) burnus at gcc dot gnu dot org
                   ` (5 preceding siblings ...)
  2009-04-24 21:38 ` burnus at gcc dot gnu dot org
@ 2010-02-23 15:29 ` burnus at gcc dot gnu dot org
  6 siblings, 0 replies; 8+ messages in thread
From: burnus at gcc dot gnu dot org @ 2010-02-23 15:29 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from burnus at gcc dot gnu dot org  2010-02-23 15:29 -------
Result with current trunk (4.5.0 2010-02-23 Rev. 156999)

In a nutshell: gas_dyn is still slower, now by 35% instead of 70%.
-fgraphite-identity runs at normal speed (or even a tiny bit faster?!?), but
each of the other options (-floop-interchange, -floop-strip-mine, and
-floop-block) causes the slowdown. nf is 6% slower and the rest looks fine.

 * * *

System: AMD Athlon 64 X2 Dual Core Processor 4800+ @ 2.4 GHz
Base options: gfortran -march=opteron -ffast-math -funroll-loops
-ftree-vectorize -ftree-loop-linear -msse3 -O3

The LTO runs additionally use the options
"-flto -fwhole-program -fno-protect-parens".
Graphite options used: "-floop-interchange -floop-strip-mine -floop-block"
Each pair of times is (with the Graphite options vs. without).

          LTO                             No LTO
ac        1% faster (13.16s vs. 13.31s)   =         (13.29s vs. 13.35s)
aermod    =         (30.69s vs. 30.87s)   =         (34.32s vs. 34.58s)
air       =         (15.64s vs. 15.68s)   2% faster (15.72s vs. 15.68s)
capacita  6% SLOWER (86.92s vs. 82.14s)   2% faster (81.66s vs. 82.92s)
channel   =         (15.36s vs. 15.28s)   3% faster (15.26s vs. 15.71s)
doduc     5% faster (40.97s vs. 43.05s)   5% faster (40.28s vs. 42.51s)
fatigue   2% SLOWER ( 7.21s vs.  7.08s)   4% SLOWER ( 9.99s vs.  9.57s)
gas_dyn  35% SLOWER (15.35s vs. 11.36s)  37% SLOWER (15.36s vs. 11.19s)
induct   14% faster (29.15s vs. 33.90s)  24% faster (28.07s vs. 37.08s)
linpk     1% faster (30.34s vs. 30.68s)   2% faster (30.40s vs. 31.03s)
mdbx      2% faster (20.15s vs. 20.56s)   =         (19.40s vs. 19.48s)
nf        6% SLOWER (33.49s vs. 31.49s)   6% SLOWER (33.61s vs. 31.62s)
protein   =         (64.51s vs. 64.24s)   3% SLOWER (65.65s vs. 63.84s)
rnflow    2% faster (36.07s vs. 36.82s)   =         (35.10s vs. 34.96s)
test_fpu  =         (21.93s vs. 21.85s)   2% faster (20.76s vs. 21.28s)
tfft      2% faster ( 8.25s vs.  8.43s)   1% faster ( 8.22s vs.  8.33s)

Geo.Mean  1% SLOWER (23.66s vs. 23.42s)   =         (24.00s vs. 23.99s)

 * * *

gas_dyn.f90 only results:
A) w/o Graphite
real    0m11.281s  user    0m11.013s  sys     0m0.044s

B) w/ -fgraphite-identity
real    0m10.622s  user    0m10.533s  sys     0m0.080s

C) w/ -floop-interchange
real    0m15.077s  user    0m14.785s  sys     0m0.068s

D) w/ -floop-strip-mine
real    0m15.818s  user    0m15.205s  sys     0m0.052s

E) w/ -floop-block
real    0m15.349s  user    0m15.249s  sys     0m0.080s

F) w/ -floop-interchange -floop-strip-mine
real    0m15.740s  user    0m15.589s  sys     0m0.044s

G) w/ -floop-interchange -floop-strip-mine -floop-block
real    0m15.658s  user    0m15.333s  sys     0m0.040s
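
To compare the variants A) to G) at a glance, the "real" times above can be normalized to the run without Graphite. The following is a minimal sketch; the dictionary labels simply mirror the list above.

```python
# Wall-clock ("real") times in seconds, copied from the list above.
real_seconds = {
    "A) no Graphite": 11.281,
    "B) -fgraphite-identity": 10.622,
    "C) -floop-interchange": 15.077,
    "D) -floop-strip-mine": 15.818,
    "E) -floop-block": 15.349,
    "F) -floop-interchange -floop-strip-mine": 15.740,
    "G) -floop-interchange -floop-strip-mine -floop-block": 15.658,
}

baseline = real_seconds["A) no Graphite"]
relative = {label: t / baseline for label, t in real_seconds.items()}

for label, ratio in relative.items():
    # A ratio above 1.0 means the variant is slower than the baseline.
    print(f"{label:55s} {ratio:5.2f}x  ({(ratio - 1) * 100:+.0f}%)")
```

Every variant that enables a loop transformation lands well above 1.3x, while -fgraphite-identity alone stays slightly below 1.0x, which matches the summary at the top of this comment.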


-- 

burnus at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[Graphite] 70% slower using |[Graphite] 35% slower using
                   |-floop* than without        |-floop* than without
                   |graphite (gas_dyn of        |graphite (gas_dyn of
                   |Polyhedron)                 |Polyhedron)

