public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
From: "whaley at cs dot utsa dot edu" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/27827] gcc 4 produces worse x87 code on all platforms than gcc 3
Date: Wed, 31 May 2006 14:13:00 -0000	[thread overview]
Message-ID: <20060531141259.6116.qmail@sourceware.org> (raw)
In-Reply-To: <bug-27827-12761@http.gcc.gnu.org/bugzilla/>



------- Comment #8 from whaley at cs dot utsa dot edu  2006-05-31 14:12 -------
Subject: Re:  gcc 4 produces worse x87 code on all platforms than gcc 3

Uros,

>IMO the fact that gcc 3.x beats 4.x on this code could be attributed to pure luck.

As far as understanding from first principles, performance on a modern x86
(which is busy doing OOE, register renaming, CISC/RISC translation, operand
fusion and fission, etc) is *always* a blind accident, IMHO :)   I've
hand-tuned code for the x87 for a *long* time (and written my own compilation
framework), and it has been my experience that only by trying different
schedules, instruction selection, etc. can you get decent performing code.  gcc
actually does an amazing job of x87 performance when it's working right, and I
always figured it had to empirically tweaked to get that level of performance. 
The fact that x87 performance always drops off at major releases (return to
first principles over discovered best-cases) seems to verify this . . .

So, I agree with you that the difference does not seem to have some big plan
behind it, but I want to stress that it is nonetheless critical: it happens to
all x87 codes on every x86 machine (I have so far tried Pentium-D, Athlon 64
X2, and P4e), and it happens no matter what optimized code I feed gcc 4.  Note
that ATLAS is not a static library, but rather uses a code generator to tune
matrix multiplication.  What this means is that ATLAS tries thousands of
different source implementations in trying to find one that will run the
fastest on the given architecture/compiler (the code generator does things like
tiling, register blocking, unroll & jam, software pipelining, unrolling, all at
the ANSI C source level, in an attempt to find the combo that the compiler/arch
likes etc).  On no x86 architecture I've installed on can gcc 4 compete with
gcc 3.  Thus, out of literally thousands of implementations on each platform,
gcc 4 cannot find one that it can compete with gcc 3's best 
 case.  I cannot, of course, send you thousands of codes and say "see all of
these are inferior", but they are, and the case I sent is not the worst.  For
instance, for single precision gemm on the Athlon 64, the kernel tuned for gcc
4 (best case of thousands taken) runs at 56.7% of the performance of the gcc
3-tuned kernel.  Nor does using SSE fix things: gcc 4 is still far slower using
SSE than gcc 3 using the x87 on all platforms, and for single precision, the
gap is worse than between x87 implementations!

Thanks,
Clint


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27827


  parent reply	other threads:[~2006-05-31 14:13 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-05-31  0:33 [Bug rtl-optimization/27827] New: " hiclint at gmail dot com
2006-05-31  0:35 ` [Bug rtl-optimization/27827] " pinskia at gcc dot gnu dot org
2006-05-31  0:36 ` hiclint at gmail dot com
2006-05-31  0:42 ` [Bug target/27827] " pinskia at gcc dot gnu dot org
2006-05-31  0:50 ` hiclint at gmail dot com
2006-05-31  0:55 ` pinskia at gcc dot gnu dot org
2006-05-31  1:09 ` whaley at cs dot utsa dot edu
2006-05-31 10:57 ` uros at kss-loka dot si
2006-05-31 14:13 ` whaley at cs dot utsa dot edu [this message]
2006-06-01  8:43 ` uros at kss-loka dot si
2006-06-01 16:03 ` whaley at cs dot utsa dot edu
2006-06-01 16:26 ` whaley at cs dot utsa dot edu
2006-06-01 18:43 ` whaley at cs dot utsa dot edu
2006-06-07 22:39 ` whaley at cs dot utsa dot edu
2006-06-14  3:04 ` whaley at cs dot utsa dot edu
2006-06-24 18:11 ` whaley at cs dot utsa dot edu
2006-06-24 19:13 ` rguenth at gcc dot gnu dot org
2006-06-25 13:35 ` whaley at cs dot utsa dot edu
2006-06-25 23:05 ` rguenth at gcc dot gnu dot org
2006-06-26  1:12 ` whaley at cs dot utsa dot edu
2006-06-26  7:53 ` uros at kss-loka dot si
2006-06-26 16:02 ` whaley at cs dot utsa dot edu
2006-06-27  6:05 ` uros at kss-loka dot si
2006-06-27 14:37 ` whaley at cs dot utsa dot edu
2006-06-27 17:47 ` whaley at cs dot utsa dot edu
2006-06-28 17:37 ` [Bug target/27827] [4.0/4.1/4.2 Regression] " steven at gcc dot gnu dot org
2006-06-28 20:18 ` whaley at cs dot utsa dot edu
2006-06-29  4:18 ` hjl at lucon dot org
2006-06-29  6:43 ` whaley at cs dot utsa dot edu
2006-07-04 13:15 ` whaley at cs dot utsa dot edu
2006-07-05 17:55 ` mmitchel at gcc dot gnu dot org
2006-08-04  7:46 ` bonzini at gnu dot org
2006-08-04 16:24 ` whaley at cs dot utsa dot edu
2006-08-05  7:21 ` bonzini at gnu dot org
2006-08-05 14:24 ` whaley at cs dot utsa dot edu
2006-08-05 17:16 ` bonzini at gnu dot org
2006-08-05 18:26 ` whaley at cs dot utsa dot edu
2006-08-06 15:03 ` [Bug target/27827] [4.0/4.1 " whaley at cs dot utsa dot edu
2006-08-07  6:19 ` bonzini at gnu dot org
2006-08-07 15:32 ` whaley at cs dot utsa dot edu
2006-08-07 16:47 ` whaley at cs dot utsa dot edu
2006-08-07 16:58 ` paolo dot bonzini at lu dot unisi dot ch
2006-08-07 17:19 ` whaley at cs dot utsa dot edu
2006-08-07 18:19 ` paolo dot bonzini at lu dot unisi dot ch
2006-08-07 20:35 ` dorit at il dot ibm dot com
2006-08-07 21:57 ` whaley at cs dot utsa dot edu
2006-08-08  2:59 ` whaley at cs dot utsa dot edu
2006-08-08  6:15 ` hubicka at gcc dot gnu dot org
2006-08-08  6:28   ` Jan Hubicka
2006-08-08  6:29 ` hubicka at ucw dot cz
2006-08-08  7:05 ` paolo dot bonzini at lu dot unisi dot ch
2006-08-08 16:44 ` whaley at cs dot utsa dot edu
2006-08-08 18:36 ` whaley at cs dot utsa dot edu
2006-08-09  4:34 ` paolo dot bonzini at lu dot unisi dot ch
2006-08-09 14:33 ` whaley at cs dot utsa dot edu
2006-08-09 15:52 ` whaley at cs dot utsa dot edu
2006-08-09 16:08 ` whaley at cs dot utsa dot edu
2006-08-09 19:10   ` Dorit Nuzman
2006-08-09 19:10 ` dorit at il dot ibm dot com
2006-08-09 21:33 ` whaley at cs dot utsa dot edu
2006-08-09 21:46   ` Andrew Pinski
2006-08-09 21:46 ` pinskia at physics dot uc dot edu
2006-08-09 23:02 ` whaley at cs dot utsa dot edu
2006-08-10  6:52 ` paolo dot bonzini at lu dot unisi dot ch
2006-08-10 14:08 ` whaley at cs dot utsa dot edu
2006-08-10 14:29 ` paolo dot bonzini at lu dot unisi dot ch
2006-08-10 15:16 ` whaley at cs dot utsa dot edu
2006-08-10 15:22 ` paolo dot bonzini at lu dot unisi dot ch
2006-08-11  9:19 ` uros at kss-loka dot si
2006-08-11 13:26 ` bonzini at gcc dot gnu dot org
2006-08-11 14:10 ` [Bug target/27827] [4.0 " bonzini at gnu dot org
2006-08-11 15:22 ` whaley at cs dot utsa dot edu
2006-08-23 10:36 ` oliver dot jennrich at googlemail dot com
2006-10-07 10:06 ` steven at gcc dot gnu dot org
2007-02-13  2:59 ` pinskia at gcc dot gnu dot org

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060531141259.6116.qmail@sourceware.org \
    --to=gcc-bugzilla@gcc.gnu.org \
    --cc=gcc-bugs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).