* EGCS vs GCC performance
@ 1998-01-12 20:31 Dave Steffen
1998-01-14 4:17 ` jbuck
0 siblings, 1 reply; 12+ messages in thread
From: Dave Steffen @ 1998-01-12 20:31 UTC (permalink / raw)
To: egcs
Hi Folks,
If anybody's interested, I just did a (very) informal performance
test between EGCS 1.0.1 and GCC 2.7.2. The test involved compiling
and executing some heavily-templated numerical code on a HP 715
running HP-UX 9.05.
The code in both cases was the same, although some of the library
material was different; I had to hack a few things to work with
EGCS (or, more accurately, _un-hack_ some gcc things... i.e. I had
to update some syntax to reflect the standard).
The result, in a nutshell, is that EGCS outperforms GCC
significantly in both compile-time and run-time. The downside is
that the executable is bigger - but for numerical work that's
rarely important.
____________________ COMPILATION ____________________
I executed "time make" using GCC:
g++ -ansi -frepo -O3 -I/usr/local/lib/TNT -I/usr/local/lib/C++ -c kubo.C
(... etc etc)
real 5m40.990s
user 4m54.180s
sys 0m30.970s
A lot of time was used recompiling the source to get the
templates right; this took seven iterations. (The -frepo flag
is convenient, but it does take a while the first time.)
If all the .rpo files already exist, it's much much faster:
real 1m48.070s
user 1m29.930s
sys 0m9.020s
The same build process with EGCS:
real 1m47.720s
user 1m35.070s
sys 0m8.440s
Which is identical to gcc with the template repository already
built. OTOH this is a build "from scratch", so this still
represents a significant improvement for template-heavy code
(which mine is). Using one of the "manual" template mechanisms
with gcc would probably match this, but then there's the extra
programmer time involved... ;-)
____________________ EXECUTABLE SIZE ____________________
-rwxr-xr-x 1 steffend users 1107024 Jan 12 16:07 kubo* (gcc)
-rwxr-xr-x 1 steffend users 1572944 Jan 12 16:12 kubo* (egcs)
This is irrelevant for my work, but I thought it was
interesting.
____________________ EXECUTION TIME ____________________
Roughly speaking, the code diagonalizes a 20x20 matrix of
complex<double>s by calling fortran routines out of the LAPACK
library. Obviously this time will be identical for both
compilers. Then there's a _whole lot_ of matrix and vector
multiplication as we fold, spindle, and mutilate the
eigenvectors; this is mostly C++ code (with some more LAPACK
calls here and there).
I ran twice with each executable:
GCC:
          run 1        run 2
real      9m8.840s     9m13.590s
user      9m5.760s     9m10.060s
sys       0m1.540s     0m1.550s
EGCS:
          run 1        run 2
real      6m17.120s    7m0.460s
user      5m23.060s    6m55.750s
sys       0m33.760s    0m1.550s
This made me very happy; a 20-30% decrease in runtime is
significant, as most of my runs take a day or two. This will
also be a big help in convincing my advisor (an unrepentant
fortran programmer) that C++ is a good language for numerical
work! ;-)
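For anyone who wants to redo the arithmetic behind the "20-30%" figure, here is a minimal sketch in C. The helper names to_seconds and percent_decrease are made up for this illustration and are not part of any tool mentioned in this thread; the inputs are simply the "real" times reported above.

```c
/* Convert a `time`-style reading (minutes plus seconds) to seconds. */
double to_seconds(int minutes, double seconds)
{
    return minutes * 60.0 + seconds;
}

/* Percentage by which `after` is smaller than `before`.
   `before` must be nonzero. */
double percent_decrease(double before, double after)
{
    return 100.0 * (before - after) / before;
}
```

With the wall-clock times above, percent_decrease(to_seconds(9, 8.840), to_seconds(6, 17.120)) comes to roughly 31%, and percent_decrease(to_seconds(9, 13.590), to_seconds(7, 0.460)) to roughly 24%. Two runs per compiler is a small sample, so these are rough figures only.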
This is obviously not an exhaustive test. Will your mileage vary? Of
course. But it's a good indication that, even aside from language
standard issues, EGCS is a better compiler than GCC.
Thanks, guys!
--------------------------------------------------------------------------
Dave Steffen Wave after wave will flow with the tide
Dept. of Physics And bury the world as it does
Colorado State University Tide after tide will flow and recede
steffend@lamar.colostate.edu Leaving life to go on as it was...
- Peart / RUSH
"The reason that our people suffer in this way....
is that our ancestors failed to rule wisely". -General Choi, Hong Hi
* Re: EGCS vs GCC performance
1998-01-12 20:31 EGCS vs GCC performance Dave Steffen
@ 1998-01-14 4:17 ` jbuck
1998-01-15 16:06 ` David Edelsohn
1998-01-15 16:30 ` Jeffrey A Law
0 siblings, 2 replies; 12+ messages in thread
From: jbuck @ 1998-01-14 4:17 UTC (permalink / raw)
To: Dave Steffen; +Cc: egcs
> If anybody's interested, I just did a (very) informal performance
> test between EGCS 1.0.1 and GCC 2.7.2. The test involved compiling
> and executing some heavily-templated numerical code on a HP 715
> running HP-UX 9.05.
...
> The result, in a nutshell, is that EGCS outperforms GCC
> significantly in both compile-time and run-time.
HP, if I understand correctly, is the platform that has benefited the
most from the Haifa scheduler. The story isn't as great on some other
platforms; ix86/Pentium performance seems to have actually gotten worse
in some cases according to several reports. But I'm sure this will be
addressed soon.
> I executed "time make" using GCC:
>
> g++ -ansi -frepo -O3 -I/usr/local/lib/TNT -I/usr/local/lib/C++ -c kubo.C
>
> (... etc etc)
>
> real 5m40.990s
> user 4m54.180s
> sys 0m30.970s
>
> A lot of time was used recompiling the source to get the
> templates right; this took seven iterations. (The -frepo flag
> is convenient, but it does take a while the first time.)
This is why I dislike -frepo. If you're willing to trade larger
object files (and possibly a larger executable if on your platform
the linker cannot eliminate duplicate functions) in exchange for
much faster compile/link time, just don't use -frepo.
* Re: EGCS vs GCC performance
1998-01-14 4:17 ` jbuck
@ 1998-01-15 16:06 ` David Edelsohn
1998-01-15 16:30 ` Jeffrey A Law
1 sibling, 0 replies; 12+ messages in thread
From: David Edelsohn @ 1998-01-15 16:06 UTC (permalink / raw)
To: jbuck; +Cc: egcs
>>>>> jbuck writes:
>> If anybody's interested, I just did a (very) informal performance
>> test between EGCS 1.0.1 and GCC 2.7.2. The test involved compiling
>> and executing some heavily-templated numerical code on a HP 715
>> running HP-UX 9.05.
jbuck> ...
>> The result, in a nutshell, is that EGCS outperforms GCC
>> significantly in both compile-time and run-time.
jbuck> HP, if I understand correctly, is the platform that has benefited the
jbuck> most from the Haifa scheduler. The story isn't as great on some other
jbuck> platforms; ix86/Pentium performance seems to have actually gotten worse
jbuck> in some cases according to several reports. But I'm sure this will be
jbuck> addressed soon.
Neither GCC nor EGCS machine descriptions include any instruction
or function unit information for any x86 processor. Without any
information, there is little that the Haifa scheduler can do. I do not
know why Haifa makes it slightly worse than the old scheduler. With info
for Pentiums and AMD and Cyrix, Haifa should show significant improvement.
David
* Re: EGCS vs GCC performance
1998-01-14 4:17 ` jbuck
1998-01-15 16:06 ` David Edelsohn
@ 1998-01-15 16:30 ` Jeffrey A Law
1998-01-17 23:02 ` Marc Lehmann
1998-01-18 21:51 ` Harvey J. Stein
1 sibling, 2 replies; 12+ messages in thread
From: Jeffrey A Law @ 1998-01-15 16:30 UTC (permalink / raw)
To: jbuck; +Cc: Dave Steffen, egcs
In message < 199801131736.JAA04529@atrus.synopsys.com >you write:
>
> > If anybody's interested, I just did a (very) informal performance
> > test between EGCS 1.0.1 and GCC 2.7.2. The test involved compiling
> > and executing some heavily-templated numerical code on a HP 715
> > running HP-UX 9.05.
> ...
> > The result, in a nutshell, is that EGCS outperforms GCC
> > significantly in both compile-time and run-time.
>
> HP, if I understand correctly, is the platform that has benefited the
> most from the Haifa scheduler.
Yes. If I remember right, 8% or so was the average improvement due to
haifa alone (for FP intensive code). With the alias analysis and other
HP opts that have gone in since gcc-2.7, a 20-25% improvement isn't
unrealistic.
> The story isn't as great on some other
> platforms; ix86/Pentium performance seems to have actually gotten worse
> in some cases according to several reports. But I'm sure this will be
> addressed soon.
Right. Actually for the x86 the first thing we're being nailed by
is the alignment of doubles (at least that's my understanding).
We also need to throttle some of the loop opts to avoid holding too
many computable givs in registers through the life of a loop. It was
a problem in older versions of gcc, but it's much more pronounced in
egcs because egcs does a much better job at finding GIVs than older
versions of gcc.
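To make the GIV point concrete, here is a rough sketch in C. It is not taken from gcc or egcs; the function names and index expressions are invented for illustration. A general induction variable (GIV) is a value that is a linear function of the loop counter. Strength reduction turns each one into its own running variable, which trades a multiply for an add but keeps one more value live for the whole loop.

```c
#include <stddef.h>

/* Source form: the three indices 2*i, 3*i+1 and 4*i+2 are all linear
   functions of the loop counter i, i.e. candidate GIVs. */
double sum_naive(const double *a, const double *b, const double *c, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[2 * i] + b[3 * i + 1] + c[4 * i + 2];
    return s;
}

/* Hand-written approximation of the strength-reduced loop: each GIV
   becomes a separate variable that is bumped by a constant every
   iteration.  No multiplies remain, but ia, ib and ic must all stay
   live (ideally in registers) across the entire loop. */
double sum_reduced(const double *a, const double *b, const double *c, size_t n)
{
    size_t ia = 0, ib = 1, ic = 2;
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        s += a[ia] + b[ib] + c[ic];
        ia += 2;
        ib += 3;
        ic += 4;
    }
    return s;
}
```

Both functions compute the same sum. On a machine with few registers, such as the x86, holding many such running variables at once can force spills to memory, which is presumably the kind of cost the throttling is meant to avoid.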
And scheduling problems. The x86 port really needs some work before it'll
be generally profitable to enable instruction scheduling.
jeff
* Re: EGCS vs GCC performance
1998-01-15 16:30 ` Jeffrey A Law
@ 1998-01-17 23:02 ` Marc Lehmann
1998-01-18 21:51 ` Harvey J. Stein
1 sibling, 0 replies; 12+ messages in thread
From: Marc Lehmann @ 1998-01-17 23:02 UTC (permalink / raw)
To: egcs
On Wed, Jan 14, 1998 at 11:39:40AM -0700, Jeffrey A Law wrote:
> Right. Actually for the x86 the first thing we're being nailed by
> is the alignment of doubles (at least that's my understanding).
well, most people still compile integer code.. the problem with double
alignment is that the difference in speed can get very very large, so it's
immediately noticeable when something goes wrong (and egcs will be blamed ;(
> And scheduling problems. The x86 port really needs some work before it'll
> be generally profitable to enable instruction scheduling.
pgcc has enabled instruction scheduling for a long time... the
i386.md file seems not to be too different, and generally the
speed improvement with the old scheduler is 3-5%, even more
with -frisc (== a crude way of splitting instructions)
i'm currently hacking on the haifa scheduler parameters, but
without a ppro available, this will only help pentiums
(if at all).
-----==- |
----==-- _ |
---==---(_)__ __ ____ __ Marc Lehmann +--
--==---/ / _ \/ // /\ \/ / pcg@goof.com |e|
-=====/_/_//_/\_,_/ /_/\_\ --+
The choice of a GNU generation |
|
* Re: EGCS vs GCC performance
1998-01-18 21:51 ` Harvey J. Stein
@ 1998-01-18 0:13 ` Jeffrey A Law
1998-01-19 10:12 ` Richard Henderson
1 sibling, 0 replies; 12+ messages in thread
From: Jeffrey A Law @ 1998-01-18 0:13 UTC (permalink / raw)
To: Harvey J. Stein; +Cc: egcs, hjstein
In message < m2g1mmp5fk.fsf@blinky.bfr.co.il >you write:
> What about on the Alphas? Is the Haifa scheduler configured properly
> for the Dec Alpha? Or is one better off currently disabling Haifa?
I believe it should be profitable on Alphas; Richard Henderson would
know for sure.
jeff
* Re: EGCS vs GCC performance
1998-01-15 16:30 ` Jeffrey A Law
1998-01-17 23:02 ` Marc Lehmann
@ 1998-01-18 21:51 ` Harvey J. Stein
1998-01-18 0:13 ` Jeffrey A Law
1998-01-19 10:12 ` Richard Henderson
1 sibling, 2 replies; 12+ messages in thread
From: Harvey J. Stein @ 1998-01-18 21:51 UTC (permalink / raw)
To: law; +Cc: hjstein
Jeffrey A Law <law@cygnus.com> writes:
> And scheduling problems. The x86 port really needs some work before it'll
> be generally profitable to enable instruction scheduling.
What about on the Alphas? Is the Haifa scheduler configured properly
for the Dec Alpha? Or is one better off currently disabling Haifa?
--
Harvey J. Stein
Berger Financial Research
hjstein@bfr.co.il
* Re: EGCS vs GCC performance
1998-01-18 21:51 ` Harvey J. Stein
1998-01-18 0:13 ` Jeffrey A Law
@ 1998-01-19 10:12 ` Richard Henderson
1998-01-20 2:21 ` Harvey J. Stein
1 sibling, 1 reply; 12+ messages in thread
From: Richard Henderson @ 1998-01-19 10:12 UTC (permalink / raw)
To: Harvey J. Stein; +Cc: law, egcs, hjstein
On Sun, Jan 18, 1998 at 10:11:43AM +0200, Harvey J. Stein wrote:
> What about on the Alphas? Is the Haifa scheduler configured properly
> for the Dec Alpha? Or is one better off currently disabling Haifa?
It is configured properly for EV5 machines for certain.
There were reports that it slowed down EV4 machines, but I've
tweaked things since then -- I'm curious for feedback there.
r~
* Re: EGCS vs GCC performance
1998-01-19 10:12 ` Richard Henderson
@ 1998-01-20 2:21 ` Harvey J. Stein
1998-01-20 10:07 ` Richard Henderson
1998-01-20 14:54 ` Toon Moene
0 siblings, 2 replies; 12+ messages in thread
From: Harvey J. Stein @ 1998-01-20 2:21 UTC (permalink / raw)
To: Richard Henderson; +Cc: hjstein
Richard Henderson <rth@cygnus.com> writes:
> On Sun, Jan 18, 1998 at 10:11:43AM +0200, Harvey J. Stein wrote:
> > What about on the Alphas? Is the Haifa scheduler configured properly
> > for the Dec Alpha? Or is one better off currently disabling Haifa?
>
> It is configured properly for EV5 machines for certain.
What about EV56s? I presume that the same config would be fine? And
as of what date is the above the case? With egcs-1.0 on redhat 4.2 I
found egcs producing code which ran about the same as gcc 2.7.2.1,
sometimes a little slower (-O2 -mcpu=21164). I tried to build
egcs-971225 & pre 1.0.1, but was unable to (because redhat 4.2's
binutils (v 2.7.0.2-4) was too old, I guess). Presumably, for the
same reason I couldn't use -mcpu=21164a with egcs-971225. How much of
a difference does -mcpu=21164 vs -mcpu=21164a make?
Thanks,
Harvey J. Stein
Berger Financial Research
hjstein@bfr.co.il
* Re: EGCS vs GCC performance
1998-01-20 2:21 ` Harvey J. Stein
@ 1998-01-20 10:07 ` Richard Henderson
1998-01-20 14:54 ` Toon Moene
1 sibling, 0 replies; 12+ messages in thread
From: Richard Henderson @ 1998-01-20 10:07 UTC (permalink / raw)
To: Harvey J. Stein; +Cc: Richard Henderson, law, egcs, hjstein
On Tue, Jan 20, 1998 at 11:27:00AM +0200, Harvey J. Stein wrote:
> What about EV56s?
They have the same scheduling characteristics as ev5.
> And as of what date is the above the case? With egcs-1.0 on redhat 4.2 I
> found egcs producing code which ran about the same as gcc 2.7.2.1,
> sometimes a little slower (-O2 -mcpu=21164).
The last tweaks went in on 971223. Dunno what to tell you about
slowdowns -- if you can examine the differences between the two
executables and come up with theories on why it is slower in your
case we may be able to fix it.
> How much of
> a difference does -mcpu=21164 vs -mcpu=21164a make?
As always it depends on the program, but on code that does a lot
of character manipulation I've heard up to 20%.
r~
* Re: EGCS vs GCC performance
1998-01-20 14:54 ` Toon Moene
@ 1998-01-20 14:54 ` Richard Henderson
0 siblings, 0 replies; 12+ messages in thread
From: Richard Henderson @ 1998-01-20 14:54 UTC (permalink / raw)
To: Toon Moene; +Cc: Richard Henderson, egcs, hjstein
On Tue, Jan 20, 1998 at 10:51:13PM +0100, Toon Moene wrote:
> Richard, are you aware of the following bug report [by
> kanazawa@flab.fujitsu.co.jp (Kanazawa Yuzi) to egcs-bugs, d.d. Wed,
> 7 Jan 1998 20:25:05 +0900 (JST)]
[...]
> This problem was caused by `TARGET_CPU_DEFAULT' definition created
> by gcc/configure. In my case, TARGET_CPU_DEFAULT was defined like
> this.
>
> tm.h:
> #define TARGET_CPU_DEFAULT MASK_CPU_EV5|MASK_GAS
That has been fixed.
Thu Jan 1 15:40:15 1998 Richard Henderson <rth@cygnus.com>
* configure.in: Put parenthesis around TARGET_CPU_DEFAULT's value.
* configure: Update.
r~
* Re: EGCS vs GCC performance
1998-01-20 2:21 ` Harvey J. Stein
1998-01-20 10:07 ` Richard Henderson
@ 1998-01-20 14:54 ` Toon Moene
1998-01-20 14:54 ` Richard Henderson
1 sibling, 1 reply; 12+ messages in thread
From: Toon Moene @ 1998-01-20 14:54 UTC (permalink / raw)
To: Richard Henderson; +Cc: egcs, hjstein
Richard Henderson <rth@cygnus.com> writes:
>> It is configured properly for EV5 machines for certain.
Harvey Stein writes:
> What about EV56s? I presume that the same config would
> be fine? And as of what date is the above the case?
> With egcs-1.0 on redhat 4.2 I found egcs producing code
> which ran about the same as gcc 2.7.2.1, sometimes a
> little slower (-O2 -mcpu=21164). I tried to build
> egcs-971225 & pre 1.0.1, but was unable to (because redhat
> 4.2's binutils (v 2.7.0.2-4) was too old, I guess).
> Presumably, for the same reason I couldn't use
> -mcpu=21164a with egcs-971225. How much of a difference
> does -mcpu=21164 vs -mcpu=21164a make?
Richard, are you aware of the following bug report [by
kanazawa@flab.fujitsu.co.jp (Kanazawa Yuzi) to egcs-bugs, d.d. Wed,
7 Jan 1998 20:25:05 +0900 (JST)]
As far as I can see, his criticism about TARGET_CPU_DEFAULT is
right, although I'm not sure enough of my knowledge about the C
precedence rules for bit operations to say that his conclusions are
correct.
Here is his report:
====>
I compiled snapshot 971225 on RedHat 5.0 alpha
(alphaev5-unknown-linux-gnu-gcc). When I used the compiler, I noticed
that it generated code much slower than before. It appeared that the
compiler optimized the code for EV6, though the configuration said the
cpu is EV5.
This problem was caused by `TARGET_CPU_DEFAULT' definition created
by gcc/configure. In my case, TARGET_CPU_DEFAULT was defined like
this.
tm.h:
#define TARGET_CPU_DEFAULT MASK_CPU_EV5|MASK_GAS
It makes `alpha_cpu' wrong because the variable is set by the
following statement.
config/alpha/alpha.c:
    alpha_cpu
      = TARGET_CPU_DEFAULT & MASK_CPU_EV6 ? PROCESSOR_EV6
        : (TARGET_CPU_DEFAULT & MASK_CPU_EV5 ? PROCESSOR_EV5
           : PROCESSOR_EV4);
Since TARGET_CPU_DEFAULT is not protected by parentheses,
`TARGET_CPU_DEFAULT & MASK_CPU_EV6' expands to
`MASK_CPU_EV5|MASK_GAS & MASK_CPU_EV6'. Because & binds more tightly
than |, this is always true. Thus alpha_cpu is set to PROCESSOR_EV6.
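The effect is easy to reproduce outside the compiler. The following is a small stand-alone sketch in C, not the real gcc source: the mask bit values, the macro names UNSAFE_DEFAULT and SAFE_DEFAULT, and the function names are all assumptions chosen for illustration. Only the shape of the selection expression follows the alpha.c fragment quoted above.

```c
/* Hypothetical bit assignments; the real values in gcc may differ. */
#define MASK_GAS     (1 << 0)
#define MASK_CPU_EV5 (1 << 1)
#define MASK_CPU_EV6 (1 << 2)

/* Unparenthesized, as generated by the configure script in the report. */
#define UNSAFE_DEFAULT MASK_CPU_EV5|MASK_GAS
/* Parenthesized, as in the proposed fix. */
#define SAFE_DEFAULT   (MASK_CPU_EV5|MASK_GAS)

enum processor { PROCESSOR_EV4, PROCESSOR_EV5, PROCESSOR_EV6 };

/* The condition expands to MASK_CPU_EV5 | (MASK_GAS & MASK_CPU_EV6),
   which is nonzero whenever MASK_CPU_EV5 is, so EV6 is always chosen.
   (Compilers may warn about the missing parentheses here; that is the
   point of the example.) */
enum processor pick_unsafe(void)
{
    return UNSAFE_DEFAULT & MASK_CPU_EV6 ? PROCESSOR_EV6
         : (UNSAFE_DEFAULT & MASK_CPU_EV5 ? PROCESSOR_EV5 : PROCESSOR_EV4);
}

/* With parentheses the whole mask is tested against each CPU bit. */
enum processor pick_safe(void)
{
    return SAFE_DEFAULT & MASK_CPU_EV6 ? PROCESSOR_EV6
         : (SAFE_DEFAULT & MASK_CPU_EV5 ? PROCESSOR_EV5 : PROCESSOR_EV4);
}
```

Under these assumed values pick_unsafe() yields PROCESSOR_EV6 while pick_safe() yields PROCESSOR_EV5, matching the behaviour described in the report.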
Here is my patch. Note that with this fix, rather ugly definitions
like (MASK_SUPPORT_ARCH|((MASK_CPU_EV5|MASK_BWX|TASK_MAX)|MASK_GAS))
can be generated. Does anybody mind that?
Wed Jan 7 14:08:47 1998 Kanazawa Yuzi <kanazawa@flab.fujitsu.co.jp>
* gcc/configure.in: Add parenthesis to some
target_cpu_default2 definitions.
--- gcc/configure.in.orig Thu Dec 25 08:55:51 1997
+++ gcc/configure.in Wed Jan 7 14:08:47 1998
@@ -2672,13 +2672,13 @@
     alpha*-*-*)
         case $machine in
             alphaev6*)
-                target_cpu_default2="MASK_CPU_EV6|MASK_BXW|MASK_CIX|MASK_MAX"
+                target_cpu_default2="(MASK_CPU_EV6|MASK_BXW|MASK_CIX|MASK_MAX)"
                 ;;
             alphapca56*)
-                target_cpu_default2="MASK_CPU_EV5|MASK_BWX|TASK_MAX"
+                target_cpu_default2="(MASK_CPU_EV5|MASK_BWX|TASK_MAX)"
                 ;;
             alphaev56*)
-                target_cpu_default2="MASK_CPU_EV5|MASK_BWX"
+                target_cpu_default2="(MASK_CPU_EV5|MASK_BWX)"
                 ;;
             alphaev5*)
                 target_cpu_default2="MASK_CPU_EV5"
@@ -2691,7 +2691,7 @@
             then
                 target_cpu_default2="MASK_GAS"
             else
-                target_cpu_default2="${target_cpu_default2}|MASK_GAS"
+                target_cpu_default2="(${target_cpu_default2}|MASK_GAS)"
             fi
         fi
         ;;
====>
The usage of explicit parentheses is consistent with the default-default
target ...
HTH,
Toon.