public inbox for gcc@gcc.gnu.org
* Re: Strange results on code speed
@ 1997-12-10  5:31 Laurent Bonnaud
  0 siblings, 0 replies; 3+ messages in thread
From: Laurent Bonnaud @ 1997-12-10  5:31 UTC (permalink / raw)
  To: egcs; +Cc: bonnaud

>  You might want to retry your tests with -O4 or so.  At -O2 function
>inlining is not done, I believe, and that might well make a big
>difference.

According to the doc, -O3 is the highest optimization level available,
and -O3 does indeed do inlining.  In fact I used the -O2 option on
purpose, as the code is written with inline directives where they are
needed.  Nevertheless, I tried your suggestion, but the speed
difference is not significant.
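
For readers unfamiliar with the term, "inline directives" in C++ usually
means something like the sketch below.  The Point class and dot() function
are hypothetical and are not taken from either benchmark.  Whether a given
compiler actually inlines them at -O2 depends on the compiler and its
version, so this only shows how the request is written in the source.

```cpp
// Hypothetical 2-D point type, used only to illustrate inline requests.
class Point {
public:
    Point(double x, double y) : x_(x), y_(y) {}
    // Member functions defined inside the class body are implicitly inline.
    double x() const { return x_; }
    double y() const { return y_; }
private:
    double x_, y_;
};

// Explicit inline request on a free function.
inline double dot(const Point& a, const Point& b) {
    return a.x() * b.x() + a.y() * b.y();
}
```

With such annotations in place, the compiler is asked to inline exactly
these functions, which is why -O2 can be a deliberate choice for this kind
of code.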

-- 
Laurent.


* Strange results on code speed
  1997-12-09 16:25 Laurent Bonnaud
@ 1997-12-10  0:16 ` Mark Mitchell
  0 siblings, 0 replies; 3+ messages in thread
From: Mark Mitchell @ 1997-12-10  0:16 UTC (permalink / raw)
  To: egcs

Laurent --

  You might want to retry your tests with -O4 or so.  At -O2 function
inlining is not done, I believe, and that might well make a big
difference.

-- 
Mark Mitchell		mmitchell@usa.net
Stanford University	http://www.stanford.edu



* Strange results on code speed
@ 1997-12-09 16:25 Laurent Bonnaud
  1997-12-10  0:16 ` Mark Mitchell
  0 siblings, 1 reply; 3+ messages in thread
From: Laurent Bonnaud @ 1997-12-09 16:25 UTC (permalink / raw)
  To: egcs

Hi,

Since I read about a new optimization in the egcs-971207 snapshot, I
tried two C++ benchmarks:

ftp://ftp.kai.com/pub/benchmarks/stepanov_v1p2.C

and

ftp://ftp.kai.com/pub/benchmarks/oopack_v1p7.C

which measure the relative speed of the same computations coded in C
and in various C++ styles (ideally it should be 1.0).
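
To make the idea concrete, here is a minimal sketch of "the same
computation in C style and in a C++ style".  It is not code from either
benchmark: the Double wrapper and the names sum_c and sum_cpp are
assumptions made up for illustration.  Timing both functions on the same
data and dividing the C++ time by the C time would give the kind of ratio
reported below.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical wrapper: a plain double hidden behind a class.
struct Double {
    double value;
    Double(double v = 0.0) : value(v) {}
};

inline Double operator+(const Double& a, const Double& b) {
    return Double(a.value + b.value);
}

// "C style": a raw loop over a plain array.
double sum_c(const double* data, std::size_t n) {
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i) s += data[i];
    return s;
}

// "C++ style": a generic algorithm over wrapped values.
double sum_cpp(const std::vector<Double>& data) {
    return std::accumulate(data.begin(), data.end(), Double(0.0)).value;
}
```

Both functions perform the same additions, so with perfect optimization
they should take the same time and the ratio would be 1.0.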

I got strange results.  To sum up, the new snapshot performs far
better than egcs-1.0 on the Stepanov benchmark, but only on
sparc-solaris (and not on i586-linux, which is even worse than g++ 2.7).
On the other benchmark, results vary wildly with the architecture, but
the overall result is a bit disappointing and is better for the i86.
The good news is that this new snapshot fixes all the bugs I reported.
Thanks, everybody, for your work!

 * stepanov_v1p2.C:

   - on an Ultrasparc 170MHz

egcs-971207 -O2  (the very good result)

test      absolute   additions      ratio with
number    time       per second     test0

 0        0.92sec    54.35M         1.00
 1        0.92sec    54.35M         1.00
 2        0.92sec    54.35M         1.00
 3        1.24sec    40.32M         1.35
 4        1.22sec    40.98M         1.33
 5        1.23sec    40.65M         1.34
 6        1.23sec    40.65M         1.34
 7        1.23sec    40.65M         1.34
 8        1.23sec    40.65M         1.34
 9        1.23sec    40.65M         1.34
10        1.23sec    40.65M         1.34
11        1.23sec    40.65M         1.34
12        1.22sec    40.98M         1.33
mean:     1.15sec    43.50M         1.25
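
The derived columns can be reproduced from the times alone.  The sketch
below assumes 50 million additions per test, which is inferred from the
table (0.92 sec corresponds to 54.35M additions per second) and is not
stated in this thread.  The "mean" row appears to be a geometric mean of
the times, since that matches the printed values and an arithmetic mean
does not.  All function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Assumption: 50 million additions per test, inferred from the table.
const double kAdditionsMillions = 50.0;

// "additions per second" column, in millions.
double additions_per_second_millions(double seconds) {
    return kAdditionsMillions / seconds;
}

// "ratio with test0" column: how much slower than the plain C loop.
double ratio_to_test0(double seconds, double test0_seconds) {
    return seconds / test0_seconds;
}

// Geometric mean, which appears to be what the "mean" row reports.
double geometric_mean(const std::vector<double>& xs) {
    double log_sum = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) log_sum += std::log(xs[i]);
    return std::exp(log_sum / xs.size());
}
```

For example, test 3 above gives 50 / 1.24 = 40.32M additions per second
and a ratio of 1.24 / 0.92 = 1.35.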

egcs-1.0 -O2  (still better than g++-2.7.2.1 except for test1)
 0        0.92sec    54.35M         1.00
 1        3.65sec    13.70M         3.97
 2        4.26sec    11.74M         4.63
 3        3.96sec    12.63M         4.30
 4        5.48sec     9.12M         5.96
 5        3.96sec    12.63M         4.30
 6        5.49sec     9.11M         5.97
 7        5.18sec     9.65M         5.63
 8        5.48sec     9.12M         5.96
 9        5.17sec     9.67M         5.62
10        6.06sec     8.25M         6.59
11        3.76sec    13.30M         4.09
12        6.40sec     7.81M         6.96
mean:     4.24sec    11.78M         4.61

   - on a 586 166MHz

egcs-971207 -O2  (not as good as on sparc)
 0        3.63sec    13.77M         1.00
 1        6.67sec     7.50M         1.84
 2        5.72sec     8.74M         1.58
 3        8.01sec     6.24M         2.21
 4        6.09sec     8.21M         1.68
 5        8.20sec     6.10M         2.26
 6        6.10sec     8.20M         1.68
 7        9.76sec     5.12M         2.69
 8        9.52sec     5.25M         2.62
 9        9.72sec     5.14M         2.68
10        9.51sec     5.26M         2.62
11       11.61sec     4.31M         3.20
12        9.72sec     5.14M         2.68
mean:     7.69sec     6.50M         2.12

egcs-1.0 -O2  (slightly worse except for tests 8, 10 and 12)
 0        3.63sec    13.77M         1.00
 1        6.67sec     7.50M         1.84
 2        7.43sec     6.73M         2.05
 3        9.85sec     5.08M         2.71
 4        8.38sec     5.97M         2.31
 5        9.71sec     5.15M         2.67
 6        8.38sec     5.97M         2.31
 7       10.08sec     4.96M         2.78
 8        8.83sec     5.66M         2.43
 9        9.83sec     5.09M         2.71
10        8.76sec     5.71M         2.41
11       12.84sec     3.89M         3.54
12        9.14sec     5.47M         2.52
mean:     8.43sec     5.93M         2.32

g++-2.7.2.3 -O2 (the C speed is much better, but egcs's C++ optimization makes up for it!)
 0        2.29sec    21.83M         1.00
 1        3.81sec    13.12M         1.66
 2        8.76sec     5.71M         3.83
 3       10.08sec     4.96M         4.40
 4       14.27sec     3.50M         6.23
 5       14.11sec     3.54M         6.16
 6       18.27sec     2.74M         7.98
 7       14.83sec     3.37M         6.48
 8       19.78sec     2.53M         8.64
 9       14.84sec     3.37M         6.48
10       19.79sec     2.53M         8.64
11       14.83sec     3.37M         6.48
12       18.88sec     2.65M         8.24
mean:    11.59sec     4.31M         5.06

 * oopack_v1p7.C 

   - on an Ultrasparc 170MHz

egcs-971207 -O2 
                         Seconds       Mflops         
Test       Iterations     C    OOP     C    OOP  Ratio
----       ----------  -----------  -----------  -----
Max              5000    0.3   0.4   16.7  12.2    1.4
Matrix             50    0.5   2.5   27.2   5.1    5.3
Complex          2000    0.3   1.5   45.7  10.7    4.3
Iterator         5000    0.2   0.6   47.6  17.2    2.8

egcs-971207 -O2 -mcpu=v8
Max              5000    0.2   0.4   20.8  12.5    1.7
Matrix             50    0.5   0.5   26.6  26.0    1.0
Complex          2000    0.4   1.1   42.1  14.4    2.9
Iterator         5000    0.2   0.7   41.7  15.2    2.8

The -mcpu=v8 option dramatically improves the Matrix test, but only for
the OOP style (same thing for g++ 2.7)!  For the Complex test, the
switch has opposite effects on the two styles!

   - on a 586 166MHz the result is better:

egcs-971207 -O2 -mcpu=pentium
Max              5000    0.6   0.6    9.1   8.3    1.1
Matrix             50    0.7   0.9   17.4  13.6    1.3
Complex          2000    0.8   1.4   20.5  11.1    1.8
Iterator         5000    0.5   0.6   18.5  16.4    1.1


So what do you think?  If egcs could always perform as well on i86 as
on the sparc and vice-versa, depending on the code, it would really .

-- 
Laurent.
