public inbox for gcc-bugs@sourceware.org
From: "whaley at cs dot utsa dot edu" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/30255] register spills in x87 unit need to be 80-bit, not 64
Date: Tue, 19 Dec 2006 16:04:00 -0000
Message-ID: <20061219160432.31588.qmail@sourceware.org>
In-Reply-To: <bug-30255-12761@http.gcc.gnu.org/bugzilla/>



------- Comment #9 from whaley at cs dot utsa dot edu  2006-12-19 16:04 -------
Ian,

Thanks for the info.  I see I failed to consider the cross-register moves you
mentioned.  However, can't those be moved through memory, where something
destined for a 64-bit register is first written from the 80-bit reg with
round-down?  Thus, you only do the round down when you have to change register
sets.  In code compiled with -mfpmath=387, I would think that would occur
pretty much only at the function epilogue, for the return value . . .  Anyway, I
see how, depending on the framework, this may be more complicated than it
seemed.  Then again, my own compilation experience is that cross-precision/type
conversions are always complicated.
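
To make the spill issue concrete, here is a minimal sketch that models significand widths in software with exact rationals. It does not run x87 code: `round_to_bits` and `add_then_sub` are hypothetical helpers written for this illustration, and exponent range, denormals and exceptions are ignored. It only shows that narrowing an intermediate to a 53-bit significand (a 64-bit spill) can change a result that a 64-bit significand (an 80-bit spill) would preserve.

```python
from fractions import Fraction

X87_BITS = 64     # significand width of the 80-bit extended format
DOUBLE_BITS = 53  # significand width of a 64-bit double


def round_to_bits(x: Fraction, p: int) -> Fraction:
    """Round x to the nearest value with a p-bit significand (ties to even).

    Illustrative model only: the exponent range is treated as unbounded.
    """
    if x == 0:
        return Fraction(0)
    # Choose e so that |x| / 2**e lies in [2**(p-1), 2**p).
    e = x.numerator.bit_length() - x.denominator.bit_length() - p
    if abs(x) / Fraction(2) ** e >= 2 ** p:
        e += 1
    scale = Fraction(2) ** e
    # round() on a Fraction returns an int, rounding ties to even.
    return round(x / scale) * scale


def add_then_sub(spill_bits: int) -> Fraction:
    """Compute (1 + 2**-60) - 1, spilling the intermediate at spill_bits."""
    tiny = Fraction(1, 2 ** 60)
    t = round_to_bits(1 + tiny, X87_BITS)   # value held in an x87 register
    t = round_to_bits(t, spill_bits)        # spill to memory, then reload
    return round_to_bits(t - 1, X87_BITS)


kept = add_then_sub(X87_BITS)     # modelled 80-bit spill
lost = add_then_sub(DOUBLE_BITS)  # modelled 64-bit spill
print("80-bit spill:", kept)
print("64-bit spill:", lost)
```

In this model the 80-bit spill returns 2**-60 while the 64-bit spill returns 0, so the answer depends on whether the register allocator happened to spill the intermediate.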

>All in all it's pretty hard for me to get excited about better support for
>80387 when all modern x87 chips support SSE2 which is more consistent and
>faster.  See the option -mfpmath=sse.

First, it is consistent only in that it always has 64-bit precision.  This is
like preferring a car that can only achieve 30 MPH to one that can go to 60, but
only for short stretches, and must sometimes slow down to 30.  The first is
more consistent, but hardly to be preferred :)

It is certainly the case that the x87 is of decreasing importance.  However,
scalar SSE (the default with gcc on x86-64) does *not*, in general, run as fast
as the x87 on the present generation of chips (I believe this common
misconception comes from conflating vector and scalar performance; on AMDs,
even vector performance is less than x87 for double precision).

In particular, single precision scalar SSE seems to be much slower than x87
code, and double precision seems to be slightly slower *even when all 16 SSE
regs are used, in contrast to the crappy 8-reg x87 stack*.  Without proof, I
ascribe the closer double performance to the availability of movlpd, which
provides a low-cost scalar load not enjoyed by single precision (which must use
movss).  The only platform where scalar SSE *may* be competitive or better is
Core2Duo, and I haven't had a chance to do benchmarks there to see.  Note that
there is one performance advantage that x87 code will pretty much always have,
even once the archs improve their scalar SSE performance: it's much more
compact due to being defined earlier in the CISC instruction set, which can
massively reduce your instruction load on heavily unrolled loops, and allow
more instructions to fit in the selection window.

Now, if the performance were even (rather than x87 being faster), numerical
guys would still sometimes prefer the x87, in order to get that free extra
precision.  If 10,000 flops are done in 80-bit precision and the result is then
rounded to 64 bits, your worst-case error is roughly epsilon (the 64-bit
machine epsilon).  If they are done in 64-bit (SSE), your worst-case error is
roughly 10,000*epsilon.  Which would you prefer if you were in the spaceship
whose flight path was being calculated? :)
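
The error-growth argument can be sketched in software. The block below is an illustration under stated assumptions, not a measurement on x87 hardware: it models a 53-bit (double) and a 64-bit (extended) significand with exact rationals, sums 10,000 copies of the double nearest to 0.1, and reports each error in units of the double spacing near the result. `round_to_bits` and `accumulate` are hypothetical helpers, and this one sum is an example, not the worst case.

```python
from fractions import Fraction


def round_to_bits(x: Fraction, p: int) -> Fraction:
    """Round x to the nearest value with a p-bit significand (ties to even).

    Illustrative model only: the exponent range is treated as unbounded.
    """
    if x == 0:
        return Fraction(0)
    # Choose e so that |x| / 2**e lies in [2**(p-1), 2**p).
    e = x.numerator.bit_length() - x.denominator.bit_length() - p
    if abs(x) / Fraction(2) ** e >= 2 ** p:
        e += 1
    scale = Fraction(2) ** e
    # round() on a Fraction returns an int, rounding ties to even.
    return round(x / scale) * scale


def accumulate(terms, p: int) -> Fraction:
    """Sum terms with a p-bit accumulator, then store the result as a double."""
    acc = Fraction(0)
    for t in terms:
        acc = round_to_bits(acc + t, p)
    return round_to_bits(acc, 53)


terms = [Fraction(0.1)] * 10_000   # exact value of the double nearest 0.1
exact = sum(terms)                 # exact rational sum, no rounding
ulp = Fraction(2) ** (9 - 52)      # spacing of doubles in [512, 1024)

err_double = abs(accumulate(terms, 53) - exact) / ulp
err_x87 = abs(accumulate(terms, 64) - exact) / ulp
print("64-bit accumulator error (double ulps):", float(err_double))
print("80-bit accumulator error (double ulps):", float(err_x87))
```

In this model the extended accumulator ends within a few double ulps of the exact sum, while the double accumulator drifts by hundreds of ulps, which matches the direction, if not the exact constants, of the epsilon versus 10,000*epsilon comparison.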

Thanks,
Clint


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30255

