[Bug target/30255] register spills in x87 unit need to be 80-bit, not 64


* [Bug target/30255] register spills in x87 unit need to be 80-bit, not 64
  2006-12-18 20:08 [Bug target/30255] New: " whaley at cs dot utsa dot edu
                   ` (9 preceding siblings ...)
  2006-12-19 17:18 ` whaley at cs dot utsa dot edu
@ 2006-12-27 16:22 ` rguenth at gcc dot gnu dot org
  10 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2006-12-27 16:22 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from rguenth at gcc dot gnu dot org  2006-12-27 16:21 -------
Just to mention it - you can use 'long double' to force 80-bit spills.
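
A minimal sketch of that suggestion, assuming a target where GCC maps long
double to the x87 80-bit extended format (the usual case on x86 GNU/Linux;
other ABIs may differ).  The name dot_extended is hypothetical and does not
come from the bug report.  Because the accumulator is declared long double,
a spill of it would store the full extended value, and the only rounding to
double happens at the return.

```c
#include <stddef.h>

/* Dot product that keeps its running sum in long double.
 * Assumption: long double is the 80-bit x87 format on the target. */
double dot_extended(const double *x, const double *y, size_t n)
{
    long double acc = 0.0L;               /* extended-precision accumulator */
    for (size_t i = 0; i < n; i++)
        acc += (long double)x[i] * y[i];  /* product formed in long double */
    return (double)acc;                   /* single rounding to 64 bits */
}
```

The cost is that every intermediate is now explicitly long double in the
source, which is a different trade-off from asking the compiler to preserve
excess precision across spills of double values.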


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30255



* [Bug target/30255] register spills in x87 unit need to be 80-bit, not 64
  2006-12-18 20:08 [Bug target/30255] New: " whaley at cs dot utsa dot edu
                   ` (8 preceding siblings ...)
  2006-12-19 16:04 ` whaley at cs dot utsa dot edu
@ 2006-12-19 17:18 ` whaley at cs dot utsa dot edu
  2006-12-27 16:22 ` rguenth at gcc dot gnu dot org
  10 siblings, 0 replies; 12+ messages in thread
From: whaley at cs dot utsa dot edu @ 2006-12-19 17:18 UTC (permalink / raw)
  To: gcc-bugs

------- Comment #10 from whaley at cs dot utsa dot edu  2006-12-19 17:18 -------
Guys,

In the interests of full disclosure, I did some quick timings on the Core2Duo,
and as I kind of suspected, scalar SSE crushed x87 there.  I was pretty sure
scalar SSE could achieve 2 flop/cycle, while Intel kept the x87 at 1
flop/cycle, and that's what my timings show.  So, it does appear likely that
the only people using the x87 in the future on Intel will be people who
need the extra precision (and those people would really like this fix, I will
point out :).  All other Intel archs (P4, PIII, etc.) do 1 flop/cycle for both
scalar SSE and x87.

On the AMDs, both x87 and scalar SSE can achieve 2 flop/cycle, with x87 running
somewhat faster: only slightly so in double precision, but by a more
commanding margin in single.  It looks like the next generation of AMDs will
increase the maximal flop rate of vector SSE, but it does not look like they
will increase the max flop rate of scalar SSE, so this may continue to be the
case going forward . . .

Cheers,
Clint


* [Bug target/30255] register spills in x87 unit need to be 80-bit, not 64
  2006-12-18 20:08 [Bug target/30255] New: " whaley at cs dot utsa dot edu
                   ` (7 preceding siblings ...)
  2006-12-19 14:57 ` ian at airs dot com
@ 2006-12-19 16:04 ` whaley at cs dot utsa dot edu
  2006-12-19 17:18 ` whaley at cs dot utsa dot edu
  2006-12-27 16:22 ` rguenth at gcc dot gnu dot org
  10 siblings, 0 replies; 12+ messages in thread
From: whaley at cs dot utsa dot edu @ 2006-12-19 16:04 UTC (permalink / raw)
  To: gcc-bugs

------- Comment #9 from whaley at cs dot utsa dot edu  2006-12-19 16:04 -------
Ian,

Thanks for the info.  I see I failed to consider the cross-register moves you
mentioned.  However, can't those be moved through memory, where something
destined for a 64-bit register is first written from the 80-bit reg with
round-down?  Thus, you only do the round-down when you have to change register
sets.  In code compiled with -mfpmath=387, I would think that would occur
pretty much only at function epilogue for the return value . . .  Anyway, I see
how, depending on the framework, this may be more complicated than it seemed.
However, my own compilation experience is that cross-precision/type conversions
are always complicated.

>All in all it's pretty hard for me to get excited about better support for
>80387 when all modern x86 chips support SSE2 which is more consistent and
>faster.  See the option -mfpmath=sse.

First, it is consistent only in that it always has 64-bit precision.  This is
like preferring a car that can only achieve 30 MPH to one that can go to 60, but
only for short stretches, and must sometimes slow down to 30.  The first is
more consistent, but hardly to be preferred :)

It is certainly the case that the x87 is of decreasing importance.  However,
scalar SSE (the default with gcc) does *not* in general on the present
generation run as fast as the x87 (I believe this common misconception comes
from conflating vector and scalar performance; on AMDs, even vector performance
is less than x87 for double precision).  

In particular, single precision scalar SSE seems to be much slower than x87
code, and double precision seems to be slightly slower *even when all 16 SSE
regs are used, in contrast to the crappy 8-reg x87 stack*.  Without proof, I
ascribe the closer double performance to the availability of movlpd, which
provides a low-cost scalar load not enjoyed by single precision (which must use
movss).  The only platform where scalar SSE *may* be competitive or better is
Core2Duo, and I haven't had a chance to do benchmarks there to see.  Note that
there is one performance advantage that x87 code will pretty much always have,
even once the archs improve their scalar SSE performance: it's much more
compact due to being defined earlier in the CISC instruction set, which can
massively reduce your instruction load on heavily unrolled loops, and allow
more instructions to fit in the selection window.

Now, if the performance were even (rather than x87 being faster), numerical
guys would still sometimes prefer the x87, in order to get that free extra
precision.  If 10,000 flops are done in 80-bit precision, your worst-case error
is roughly epsilon.  If they are done in 64-bit (SSE), your worst-case error is
10,000*epsilon.  Which would you prefer if you were in the space ship whose
flight path was being calculated? :)
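
A rough way to probe that claim is to accumulate the same term many times
with a 64-bit and with an extended accumulator and compare each result
against a reference.  This is only a sketch, not a proof of the worst-case
bound: the function names are made up for illustration, the volatile double
accumulator is an assumption used to model arithmetic that rounds to 64 bits
at every step, and long double is assumed to be the 80-bit x87 format.

```c
/* Reference: n copies of the double-valued term, formed with one multiply. */
static long double reference_sum(int n, double term)
{
    return (long double)term * n;
}

static long double abs_ld(long double v) { return v < 0 ? -v : v; }

/* Accumulate in double; volatile forces a 64-bit store after every add. */
double sum_error_double(int n, double term)
{
    volatile double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc = acc + term;
    return (double)abs_ld((long double)acc - reference_sum(n, term));
}

/* Accumulate in long double; no rounding to 64 bits inside the loop. */
double sum_error_extended(int n, double term)
{
    long double acc = 0.0L;
    for (int i = 0; i < n; i++)
        acc += term;
    return (double)abs_ld(acc - reference_sum(n, term));
}
```

Calling both with n = 10000 and term = 0.1 and printing the two errors shows
how much of the accumulated rounding the extended accumulator avoids on a
given machine.  On targets where long double is no wider than double the two
errors will simply coincide.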

Thanks,
Clint


* [Bug target/30255] register spills in x87 unit need to be 80-bit, not 64
  2006-12-18 20:08 [Bug target/30255] New: " whaley at cs dot utsa dot edu
                   ` (6 preceding siblings ...)
  2006-12-19  0:32 ` whaley at cs dot utsa dot edu
@ 2006-12-19 14:57 ` ian at airs dot com
  2006-12-19 16:04 ` whaley at cs dot utsa dot edu
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: ian at airs dot com @ 2006-12-19 14:57 UTC (permalink / raw)
  To: gcc-bugs

------- Comment #8 from ian at airs dot com  2006-12-19 14:57 -------
I think I agree that if we spill an 80387 register to the stack, and then load
the value back into an 80387 register, that we should spill all 80 bits, rather
than implicitly converting to DFmode or SFmode.

This would unfortunately be rather difficult to implement in the context of
gcc's register allocator, because it is perfectly normal for gcc to spill
values from one type of register and reload them into a different type of
register.  Thus the value might move between an 80387 register, a pair of
ordinary x86 registers, and an SSE/SSE2 register, all in the same function.  It
would just depend on how the value was being used.

Currently gcc simply says the value is DFmode or SFmode, and more or less
ignores the fact that it is being represented as an 80-bit value in an 80387
register.  To implement this suggestion we would need to add a new notion: the
mode of the spill value.  And we would need to support secondary reloads to
convert 80-bit spill values as required.  That sounds rather complicated, but
if we didn't do that, then I think we would still be inconsistent in some
cases.  I don't see any point to making this change unless we can always be
consistent.

All in all it's pretty hard for me to get excited about better support for
80387 when all modern x86 chips support SSE2 which is more consistent and
faster.  See the option -mfpmath=sse.

-- 

ian at airs dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ian at airs dot com


* [Bug target/30255] register spills in x87 unit need to be 80-bit, not 64
  2006-12-18 20:08 [Bug target/30255] New: " whaley at cs dot utsa dot edu
                   ` (5 preceding siblings ...)
  2006-12-18 23:03 ` pinskia at gcc dot gnu dot org
@ 2006-12-19  0:32 ` whaley at cs dot utsa dot edu
  2006-12-19 14:57 ` ian at airs dot com
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: whaley at cs dot utsa dot edu @ 2006-12-19  0:32 UTC (permalink / raw)
  To: gcc-bugs

------- Comment #7 from whaley at cs dot utsa dot edu  2006-12-19 00:31 -------
>Depends on what you mean by fixable by the programmer because most people don't
>know anything about precision issues.

Most people don't know programming at all, so I guess you are suggesting that
errors that are fixable at the source-code level must nonetheless always be
fixed by the compiler?   More to the point, the people who truly care about
precision *are* often aware of these kinds of fixes, but they are helpless in
this case, unlike for bug 323 (which is why they should not be conflated).

My point was that for bug 323, there is something the user can do to fix it, and
that something does not hurt overall performance or accuracy.  Since the
problem I reported is caused completely by gcc, impacts accuracy in the same
way as reordering (which gcc prohibits), and there is nothing that the user can
do to fix it without drastic loss of performance or accuracy, gcc is the only
place it can be fixed.  This problem is a narrow discrete case that can clearly
be fixed by gcc, whereas 323 is a broad class of problems which cannot be fixed
without adding to the C language the concept of mixed precisions within a type.
Therefore, I strongly believe that it is perfectly valid to say that 323
cannot be solved in gcc, but clearly untrue to say that about this case, and so
this bug report should have been closed as "we don't care", not as "duplicate".

>This was a design of the x86 back-end because it gives a nice speed.

The fix I suggested would only slow spill code (note: I mean gcc-spilled code,
not explicit load/stores by the programmer), and would therefore make a
noticeable performance difference in very few cases.  Note that unlike the
straw-man of bug 323 I am *not* advocating gcc handle all extra precision
behavior, just its undetectable spill rounding.

If the performance issue is greater than I suspect, obviously there could be a
flag for this behavior.  I find it a bit anomalous that a compiler that is so
picky about bit-level accuracy that it forbids reordering operations without a
special flag feels free to randomly round in an algorithm, even though that
rounding introduces the same type of error as reordering, and the fix would
cost less performance than forgoing reordering optimizations does.  That it
does so on the most common platform on earth just adds to the beauty :)
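
For readers less familiar with why reordering is restricted in the first
place: floating-point addition is not associative, so regrouping a sum can
change the result.  The sketch below uses hypothetical helper names, and the
volatile temporaries are an assumption added to make sure each partial sum is
rounded to a 64-bit double before it is used.

```c
/* Computes (a + b) + c, rounding the partial sum to double. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;
    return t + c;
}

/* Computes a + (b + c), rounding the partial sum to double. */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}
```

With a = 1e20, b = -1e20 and c = 1.0, sum_left yields 1.0 while sum_right
yields 0.0, because 1.0 is absorbed when added to -1e20.  An unannounced
round-down at a spill perturbs a computation in the same general way.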

Thanks,
Clint


* [Bug target/30255] register spills in x87 unit need to be 80-bit, not 64
  2006-12-18 20:08 [Bug target/30255] New: " whaley at cs dot utsa dot edu
                   ` (4 preceding siblings ...)
  2006-12-18 22:14 ` whaley at cs dot utsa dot edu
@ 2006-12-18 23:03 ` pinskia at gcc dot gnu dot org
  2006-12-19  0:32 ` whaley at cs dot utsa dot edu
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-12-18 23:03 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from pinskia at gcc dot gnu dot org  2006-12-18 23:02 -------
>I cannot, of course, force you to admit it, but 323 is a bug fixable by the
> programmer, and this one is not. 

Depends on what you mean by fixable by the programmer because most people don't
know anything about precision issues.  This was a design of the x86 back-end
because it gives a nice speed.



* [Bug target/30255] register spills in x87 unit need to be 80-bit, not 64
  2006-12-18 20:08 [Bug target/30255] New: " whaley at cs dot utsa dot edu
                   ` (3 preceding siblings ...)
  2006-12-18 22:04 ` pinskia at gcc dot gnu dot org
@ 2006-12-18 22:14 ` whaley at cs dot utsa dot edu
  2006-12-18 23:03 ` pinskia at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: whaley at cs dot utsa dot edu @ 2006-12-18 22:14 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from whaley at cs dot utsa dot edu  2006-12-18 22:14 -------
I cannot, of course, force you to admit it, but 323 is a bug fixable by the
programmer, and this one is not.  The other requires a lot of work in the
compiler, and this does not.  So, viewing them as the same can be done, in the
same way that all x87/gcc bugs are the same, or all precision bugs are the
same, but since neither their genesis nor their solution is the same, it is
misleading to do so.  Saying you don't care to fix it is an honest answer;
closing it because it is a duplicate of a much larger and harder problem for
which known workarounds exist is not.



* [Bug target/30255] register spills in x87 unit need to be 80-bit, not 64
  2006-12-18 20:08 [Bug target/30255] New: " whaley at cs dot utsa dot edu
                   ` (2 preceding siblings ...)
  2006-12-18 21:17 ` whaley at cs dot utsa dot edu
@ 2006-12-18 22:04 ` pinskia at gcc dot gnu dot org
  2006-12-18 22:14 ` whaley at cs dot utsa dot edu
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-12-18 22:04 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from pinskia at gcc dot gnu dot org  2006-12-18 22:04 -------
The problem with register spilling and what PR 323 is talking about is all the
same issue really, it is just exposed differently.

*** This bug has been marked as a duplicate of 323 ***


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE



* [Bug target/30255] register spills in x87 unit need to be 80-bit, not 64
  2006-12-18 20:08 [Bug target/30255] New: " whaley at cs dot utsa dot edu
  2006-12-18 20:16 ` [Bug target/30255] " pinskia at gcc dot gnu dot org
  2006-12-18 20:43 ` whaley at cs dot utsa dot edu
@ 2006-12-18 21:17 ` whaley at cs dot utsa dot edu
  2006-12-18 22:04 ` pinskia at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: whaley at cs dot utsa dot edu @ 2006-12-18 21:17 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from whaley at cs dot utsa dot edu  2006-12-18 21:16 -------
BTW, in case it isn't obvious, here's the fix that I typically use for problems
like bug 323, but that I cannot use when it is gcc itself that is unpredictably
spilling the computation:

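#include <stdio.h>

void test(double x, double y)
{
  const double y2 = x + 1.0;
  volatile double v[2];  /* volatile: each store/load must really happen */
  v[0] = y2;             /* the store rounds the x87 value down to 64 bits */
  v[1] = y;
  if (v[0] != v[1]) printf("error\n");
}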

The idea being that the volatile keyword prevents gcc from getting rid of the
store/load cycle, which forces the round-down.  This allows me to still do this
kind of comparison, without the speed loss associated with -ffloat-store (the
compare itself becomes slow due to the store/load, but the body of the code
runs as fast as normal), or the loss of precision associated with always
rounding to 64 bit, as when you change the x87 control word.
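
The same trick can be packaged as a small helper so that the forced
round-down is paid only at the comparison.  This is a sketch under the same
assumption as the workaround itself (a volatile store cannot be elided); the
names round_to_double and equal_as_doubles are hypothetical and not from the
original report.

```c
/* Push x through a 64-bit memory slot so any excess x87 precision is
 * dropped.  volatile keeps the compiler from removing the store/load. */
static double round_to_double(double x)
{
    volatile double slot = x;
    return slot;
}

/* Compare two values only after both have been rounded to 64 bits. */
int equal_as_doubles(double a, double b)
{
    return round_to_double(a) == round_to_double(b);
}
```

A caller would write equal_as_doubles(x + 1.0, y) where it previously
compared the two values directly; the rest of the computation still runs at
full register precision and speed.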



* [Bug target/30255] register spills in x87 unit need to be 80-bit, not 64
  2006-12-18 20:08 [Bug target/30255] New: " whaley at cs dot utsa dot edu
  2006-12-18 20:16 ` [Bug target/30255] " pinskia at gcc dot gnu dot org
@ 2006-12-18 20:43 ` whaley at cs dot utsa dot edu
  2006-12-18 21:17 ` whaley at cs dot utsa dot edu
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: whaley at cs dot utsa dot edu @ 2006-12-18 20:43 UTC (permalink / raw)
  To: gcc-bugs

------- Comment #2 from whaley at cs dot utsa dot edu  2006-12-18 20:43 -------
Hi,

While it may be decided not to fix this problem, this is not a duplicate of bug
323, and so it should be closed for another reason if you want to ignore it.
323 has a problem because of the function call, where a programmer knows that a
round-down can occur by examining the code.  This problem is due to register
spilling, and so no amount of source examination can reveal whether this could
occur.  Therefore, 323 can be worked around by the knowledgeable user, and this
one cannot.  Also, 323 would require pragmas or something similar to prevent,
whereas this problem could be completely avoided merely by spilling the 80-bit
value when gcc decides to spill.  Since this problem cannot be worked around,
and has a much more discrete fix, it is very different indeed from the much
harder to fix 323.

Thanks,
Clint

-- 

whaley at cs dot utsa dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|DUPLICATE                   |


* [Bug target/30255] register spills in x87 unit need to be 80-bit, not 64
  2006-12-18 20:08 [Bug target/30255] New: " whaley at cs dot utsa dot edu
@ 2006-12-18 20:16 ` pinskia at gcc dot gnu dot org
  2006-12-18 20:43 ` whaley at cs dot utsa dot edu
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-12-18 20:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from pinskia at gcc dot gnu dot org  2006-12-18 20:16 -------


*** This bug has been marked as a duplicate of 323 ***


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE



end of thread, other threads:[~2014-02-16 13:13 UTC | newest]

Thread overview: 12+ messages
     [not found] <bug-30255-4@http.gcc.gnu.org/bugzilla/>
2014-02-16 13:13 ` [Bug target/30255] register spills in x87 unit need to be 80-bit, not 64 jackie.rosen at hushmail dot com
2006-12-18 20:08 [Bug target/30255] New: " whaley at cs dot utsa dot edu
2006-12-18 20:16 ` [Bug target/30255] " pinskia at gcc dot gnu dot org
2006-12-18 20:43 ` whaley at cs dot utsa dot edu
2006-12-18 21:17 ` whaley at cs dot utsa dot edu
2006-12-18 22:04 ` pinskia at gcc dot gnu dot org
2006-12-18 22:14 ` whaley at cs dot utsa dot edu
2006-12-18 23:03 ` pinskia at gcc dot gnu dot org
2006-12-19  0:32 ` whaley at cs dot utsa dot edu
2006-12-19 14:57 ` ian at airs dot com
2006-12-19 16:04 ` whaley at cs dot utsa dot edu
2006-12-19 17:18 ` whaley at cs dot utsa dot edu
2006-12-27 16:22 ` rguenth at gcc dot gnu dot org
