public inbox for gcc@gcc.gnu.org
* Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-17 14:38 Toon Moene
  1998-12-17 15:30 ` Harvey J. Stein
  1998-12-18 12:50 ` Dave Love
  0 siblings, 2 replies; 65+ messages in thread
From: Toon Moene @ 1998-12-17 14:38 UTC
  To: egcs

Toon> However, what I'm challenging is that we should burden
 Toon> the compiler with these considerations *by default* 

> Indeed.

> I haven't followed all this, but am somewhat bemused 
> by it.

I must say that I am *not* amused.  This discussion is heading in a
direction that will leave us with a compiler that, although
numerical-politically correct, will be generating such slow code as to
be totally unusable.

> There are frequent complaints about Fortran due to the
> x86 register business.  All the ones I've checked have
> been covered by the advice in the g77 manual to link
> code frobbing the control word (which allows us to
> pass paranoia).  Anyone know of exceptions?

Exactly.  That's the test we should be aiming for - not something
someone comes up with who hasn't taken the time to read the relevant
(numerical analysis) texts.

Sorry to be so harsh, but I'm trying to save a compiler here.

To put it all in a one-liner:

The compiler can't - and won't - save you from doing a numerical
analysis class.

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: fortran@gnu.org; egcs: egcs-bugs@cygnus.com

* RE: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-17 14:37 tprince
  1998-12-17 15:15 ` Stephen L Moshier
  0 siblings, 1 reply; 65+ messages in thread
From: tprince @ 1998-12-17 14:37 UTC
  To: bosch, egcs, hjstein, moshier

>>Since my proposal is not likely to be implemented anytime soon, I
suggest we try patching gcc to force programs to start in 64-bit
mode and otherwise not worry about it ("set once and forget") as
you've been proposing.<<

Can't this be made optional, either a flag which links the
modified startup, or a system function call?
Dr. Timothy C. Prince
Consulting Engineer
Solar Turbines, a Caterpillar Company
alternate e-mail: tprince@computer.org


* Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-17 11:27 Brad Lucier
  1998-12-17 14:51 ` Marc Lehmann
  0 siblings, 1 reply; 65+ messages in thread
From: Brad Lucier @ 1998-12-17 11:27 UTC
  To: burley, egcs; +Cc: lucier

I agree with Craig (I think ;-).

To get consistent, predictable, usable floating-point results, it is
absolutely necessary that spilled floating-point registers be stored in
memory in a format such that spilling and reading a value back into a
register does not change one bit (in the technical sense) of the number.
If that means spilling FP registers to 80-bit temporaries aligned to
128-bit (or 64-bit) boundaries, then so be it.
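
The round-trip requirement can be illustrated with a small sketch. Python has no 80-bit type, so this is only an analogy: a 64-bit double stands in for the register value, and the hypothetical helper `spill_and_reload` stores it into a memory slot of a given width and reads it back.

```python
import struct

def spill_and_reload(value, slot_format):
    """Store a float into a memory slot of the given format and read it back."""
    return struct.unpack(slot_format, struct.pack(slot_format, value))[0]

value = 0.1  # not exactly representable in binary, so narrowing loses bits

same_width = spill_and_reload(value, "<d")  # 64-bit slot: every bit survives
narrower = spill_and_reload(value, "<f")    # 32-bit slot: the value is rounded

print(same_width == value)  # the full-width spill is bit-exact
print(narrower == value)    # the narrow spill has changed the number
```

A spill slot at least as wide as the register format behaves like the first case; a narrower slot behaves like the second.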

I say this as someone who has built highly accurate elementary function
routine libraries, together with test libraries for those routines, built
test libraries for last-bit accuracy floating-point IO routines, etc.
The programmer needs absolute control over the precision and range of
all results, including intermediate results, for him/her to be successful
at things like this.

Brad Lucier      lucier@math.purdue.edu

* Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-16 13:52 Toon Moene
  1998-12-17 10:06 ` Craig Burley
  1998-12-17 11:20 ` Dave Love
  0 siblings, 2 replies; 65+ messages in thread
From: Toon Moene @ 1998-12-16 13:52 UTC
  To: egcs

Edward, Joe and Craig,

I'm not going to address all the posts separately - let's just
concentrate on Edward's:

> Also, you can use genuinely _better_ algorithms when you can rely on 
> something very close to IEEE, and that is currently pretty hard on 
> x86 with gcc.  And a touch of extended precision can really lead to 
> algorithms that give huge performance improvements (factors of 20-40
> for normal eqns. v. QR for least squares), although those examples
> are beyond the current discussion.

I'll grant you that one - I sure never researched the boundaries of IEEE
754 arithmetic.  However, what I'm challenging is that we should burden
the compiler with these considerations *by default* (I don't mind if
it's hidden behind a compile time option like -pedantic-numerics).

> And Mr. Buck's example does happen in real code.

Yes, but that doesn't make it correct, even on a strict, one size fits
all IEEE 754 machine.  The point is that the following code:

      REAL FUNCTION FINDROOT(FIRSTGUESS)
  10  FINDROOT=<expression involving FIRSTGUESS>
      IF (FINDROOT .EQ. FIRSTGUESS) RETURN
      FIRSTGUESS = FINDROOT
      GOTO 10
      END

simply is not guaranteed to work (I discussed this on comp.compilers
some months ago).  For an arbitrary choice of FIRSTGUESS and <expression
involving FIRSTGUESS> one _cannot_ prove that this won't eternally
oscillate between two numbers just one bit apart *in any precision*.

So this is the wrong way to solve such a problem.

The correct termination comparison is:

      IF (ABS(FINDROOT - FIRSTGUESS) .LT.
     ,    TOLERANCE * ABS(FIRSTGUESS)) RETURN

with a suitable value of TOLERANCE, dependent on whether computations
are done with 32-, 64-, or 80-bit REALs (Fortran 90 offers intrinsics to
parametrise this).
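
A minimal sketch of the tolerance-based termination test, written in Python rather than Fortran. The function name `find_root`, the default `tolerance`, the `max_iter` guard and the use of `math.cos` as the iterated expression are all assumptions made for illustration; they are not from the original message.

```python
import math

def find_root(g, first_guess, tolerance=1e-12, max_iter=1000):
    """Iterate x = g(x) until successive iterates agree to a relative tolerance."""
    x = first_guess
    for _ in range(max_iter):
        x_new = g(x)
        # Relative test: the allowed difference scales with the magnitude of x.
        # abs() on the scale keeps the test meaningful for negative iterates.
        if abs(x_new - x) < tolerance * abs(x):
            return x_new
        x = x_new
    # Guard against the endless oscillation described above.
    raise RuntimeError("no convergence within max_iter iterations")

root = find_root(math.cos, 1.0)
print(root, math.cos(root))
```

The tolerance has to be chosen above the rounding error of the working precision; a value suited to 64-bit arithmetic would never be met in 32-bit arithmetic. A purely relative test also stalls if the iterate is exactly zero, so real code may want an absolute floor as well.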

>  - My main concern is that there is a grid spacing that will render the
>  - basic equation of geostrophy badly approximated in 32-bit arithmetic:
> 
> And at the moment, that entirely depends on which variables happen
> to be spilled and which don't.

[ Sorry, I meant: My *only* concern is ... ]

No, because _we_ know what we're doing because we estimated the error
propagation in an independent way.

What I tried to get across is that it is not *reasonable* to punish
32-bit applications with the burden of either 1) unaligned 80-bit spills
or 2) aligned 80-bit spills that are 4 times as large as necessary.

Yes, I know that you believe that floating point register spills are
scarce.  You probably also believe that "Real Programmers are not afraid
of 5 page long DO loops" was meant as a joke.

Cheers,

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: fortran@gnu.org; egcs: egcs-bugs@cygnus.com

* Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-15 12:24 Toon Moene
  1998-12-15 12:55 ` Joe Buck
                   ` (2 more replies)
  0 siblings, 3 replies; 65+ messages in thread
From: Toon Moene @ 1998-12-15 12:24 UTC
  To: egcs

> Using 80-bit spills is a quick approximation to 
> extending the FP stack into memory, and it should give
> some of the benefit with very little (hopefully)
> hassle.  Of course, 80 bits is wider than the normal 
> spill, so it eats more memory bandwidth, cache space, 
> etc.  Anyone who's that concerned will bend over 
> backwards to avoid spills anyways.

Am I the only one - apart from Harvey J. Stein and Tim Prince - who
finds this whole discussion unreal?  Surely, 80-bit temporaries might
seem a neat hack to a numerical analyst like Dr. Kahan, but the ordinary
computational physicist or chemist knows better than to choose "poorly
conditioned" algorithms.

My main concern is that there is a grid spacing that will render the
basic equation of geostrophy badly approximated in 32-bit arithmetic:

	 1  dp
	--- -- = f v
	rho dx

p	pressure
rho	mass of air per unit volume (1 kg / m^3)
x	distance
f	Coriolis parameter (10^-4 s^-1)
v	wind speed

You can do the math (p ~ 10^5 kg m^-1 s^-2, v ~ 10 m/s, what dx will
make dp < 10^-3 p ?)

If that's the case we have to rethink our finite difference code for the
first time in 13 years and use a trick like subtracting a basic state
from the equations - big deal.
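
The arithmetic posed above can be worked through in a few lines. This is only a sketch using the order-of-magnitude values quoted in the message; the variable names are chosen for illustration.

```python
rho = 1.0     # mass of air per unit volume, kg / m^3
f = 1.0e-4    # Coriolis parameter, s^-1
v = 10.0      # wind speed, m / s
p = 1.0e5     # pressure, kg m^-1 s^-2

# Geostrophic balance: (1 / rho) * dp / dx = f * v, hence dp = rho * f * v * dx.
dp_per_metre = rho * f * v

# Grid spacing at which the pressure difference dp equals 10^-3 of p.
dx_limit = 1.0e-3 * p / dp_per_metre

print(dx_limit / 1000.0, "km")
```

With these values the threshold comes out at roughly 100 km: for any finer grid spacing, dp falls below 10^-3 of p.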

The last thing I need is to have egcs slowed down to a crawl by having
it spill unaligned 80-bit temporaries for something that shouldn't be
larger than 32 bits in the first place.

Please make this and other "accuracy" options a "-pedantic-numerics"
one.

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: fortran@gnu.org; egcs: egcs-bugs@cygnus.com

* Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-15 12:10 Geert Bosch
  1998-12-15 13:09 ` Jeffrey A Law
  0 siblings, 1 reply; 65+ messages in thread
From: Geert Bosch @ 1998-12-15 12:10 UTC
  To: Jeffrey A Law, Joe Buck, law; +Cc: egcs, hjstein, moshier, tprince

On Tue, 15 Dec 1998 11:12:01 -0700, Jeffrey A Law wrote:

  For integer, we need to know where the parens are to preserve integer overflow
  semantics in languages like Ada for similar transformations

I'd like to comment on this issue for Ada, and I'll explain what the current
situation is in GNAT and how (small?) compiler changes could improve efficiency 
of overflow checking.

Currently overflow checking is not done by default in GNAT (GNU Ada95 compiler).
To be fully standards conforming you need to run the compiler with the -gnato 
flag which enables these checks. The reason these checks are disabled is that 
they are inefficient, at least on 32-bit targets.

The compiler does all arithmetic that needs to be checked for overflows using a
wider type. Regular 32-bit integers are calculated using 64-bits. To get efficient
overflow checks, the compiler should be able to take advantage of overflow
bits in the status register and raise an exception when an overflow is
detected. This would be a place where the backend could help, although I don't
know exactly how this should be implemented.
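
The wider-type approach can be sketched as follows. This is an illustration of the idea only, not GNAT's actual implementation: Python's unbounded integers stand in for the 64-bit intermediate, and `checked_add_int32` is a hypothetical helper name.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

def checked_add_int32(a, b):
    """Add two 32-bit values in a wider type, then range-check the result."""
    wide = a + b  # cannot overflow here; plays the role of the 64-bit sum
    if not INT32_MIN <= wide <= INT32_MAX:
        raise OverflowError("32-bit signed overflow")
    return wide

print(checked_add_int32(2_000_000_000, 100_000_000))  # fits in 32 bits
try:
    checked_add_int32(INT32_MAX, 1)
except OverflowError as exc:
    print("caught:", exc)
```

The cost on a 32-bit target comes from the widening and the two comparisons per operation; testing the processor's overflow flag after a plain 32-bit add would avoid both.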

Reordering integer additions is fine for Ada-95, as exceptions do not need 
to be exact as long as they occur in the same block. It is also allowed to
not raise an exception at all if the final result is mathematically correct,
even if intermediate values would have overflowed. Also when some operation
would have no external effect in the absence of checks, the compiler is allowed
to remove the checks and as a result usually is able to remove the operation
as well.
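
The difference between checking every intermediate value and checking only the final result can be sketched like this. Both function names are hypothetical, and the code only models the permission described above; it does not show how any Ada compiler behaves.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

def in_range(v):
    return INT32_MIN <= v <= INT32_MAX

def sum_check_each_step(values):
    """Raise as soon as any partial sum leaves the 32-bit range."""
    total = 0
    for v in values:
        total += v
        if not in_range(total):
            raise OverflowError("intermediate sum left the 32-bit range")
    return total

def sum_check_final_only(values):
    """Evaluate exactly and check only the mathematically correct result."""
    total = sum(values)
    if not in_range(total):
        raise OverflowError("final sum left the 32-bit range")
    return total

values = [INT32_MAX, 1, -1]  # the partial sum overflows, the final sum does not
print(sum_check_final_only(values))
try:
    sum_check_each_step(values)
except OverflowError as exc:
    print("caught:", exc)
```

Under the rule described above either outcome would be acceptable for this input, which is what leaves the compiler free to reorder the additions.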

With checks off, the behavior in GNAT is the same as with C. Formally, it
would still be allowed to detect overflows or range checks and raise an
exception. Suppressing checks only means that the implementation should
not impose extra overhead because of the checks. 

Regards,
   Geert

PS. This description is informal, for the exact details see the Ada RM.
    (ISO/IEC/ANSI 8652:1995, http://www.adahome.com/Resources/refs/rm95.html )



* Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-15  1:45 Geert Bosch
  1998-12-15  3:34 ` Harvey J. Stein
                   ` (2 more replies)
  0 siblings, 3 replies; 65+ messages in thread
From: Geert Bosch @ 1998-12-15  1:45 UTC
  To: Harvey J. Stein, moshier; +Cc: egcs, hjstein, tprince

On 14 Dec 1998 11:51:23 +0200, Harvey J. Stein wrote:

  Reasonable floating point code should expect that reordering
  operations will produce slightly different results due to round off
  error, and should be tolerant of the optimizer doing such.  Especially
  given how little control the programmer has over exactly how
  computations are ordered.

Many useful fpt algorithms rely on ordering of operations to be honored, 
and a compiler evaluating  B + (A - B) as (B + A) - B or even as A 
is seriously broken for numerical stuff.
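
A small sketch of why such rewrites change results, assuming IEEE 754 double arithmetic with no excess precision (which is what Python floats give on common platforms). The values of `a` and `b` are chosen for illustration.

```python
a = 1.0e-16
b = 1.0

as_written = b + (a - b)   # the order the programmer asked for
reordered = (b + a) - b    # algebraically equal, rounded differently
simplified = a             # what "simplifying" the expression would yield

print(as_written, reordered, simplified)
print(as_written == reordered, as_written == simplified)
```

All three are the same real number on paper, yet each evaluation order rounds at a different point, so the three floating-point results differ.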

Having spills to memory retain full precision is very useful as this allows
one to prove much more about fpt code. Here is an example of what I mean,
using a decimal fpt type with 4 digits for extended precision in registers 
and 3 for the in-memory precision of a variable. (Examples using binary
64-bit and 80-bit fpt types are similar but harder to read.)

Calculate  S = (10.0 + 0.454) - (0.454 + 10.0), spilling one partial sum to T.

    Case 1)                Case 2)                  Case 3)

    10.0                   10.0                     10.0
     0.454 +                0.454 +                  0.454 +
    -----                  ------                   -----
    10.5                   10.45                    10.45

T = 10.5               T = 10.45                T = 10.4
                0.454                  0.454                    0.454
               10.0  +                10.00  +                 10.00  +
               -----                  ------                   ------
    10.5  - << 10.5        10.45 - << 10.45         10.4  - << 10.45
    -----                  -----                    -----
S =  0.00              S =  0.00                S = -0.05


Case 1 does not use extended registers and rounds at every operation.
  This is completely IEEE conformant behaviour.

Case 2 uses extended registers and same precision for spilled value.
  This is not IEEE-conformant, but guarantees consistent rounding behaviour.
  In particular the relative error is never more than that of case 1.
  For most algorithms this will work fine; double rounding will only
  occur on the final assignment. This is not ideal, but now the worst case
  is one double rounding per statement instead of one per operation.
  If assignments are forced to go to memory (using volatile var's for example),
  fpt behaviour is independent of optimization level. 

Case 3 uses extended registers, but lower precision for spilled value.
  This is the worst case and is what is causing problems right now.
  The intermediate values while evaluating the expression may be subject 
  to double rounding errors. People who care about right answers often 
  turn off optimization, but ironically this makes the problems only worse! 
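
The three cases can be reproduced with Python's `decimal` module, which allows the working precision to be set explicitly. This is a sketch under assumptions: round-half-even is assumed for every rounding step, and the helper `evaluate` and the context names are chosen for illustration.

```python
from decimal import Decimal, Context, ROUND_HALF_EVEN

memory = Context(prec=3, rounding=ROUND_HALF_EVEN)    # 3-digit in-memory format
register = Context(prec=4, rounding=ROUND_HALF_EVEN)  # 4-digit extended register

x = Decimal("10.0")
y = Decimal("0.454")

def evaluate(arith, spill):
    """Compute (x + y) - (y + x), spilling the first partial sum to T."""
    t = spill.plus(arith.add(x, y))   # T: first sum, rounded to the spill format
    second = arith.add(y, x)          # second sum stays at arithmetic precision
    return arith.subtract(t, second)

case1 = evaluate(memory, memory)      # no extended registers
case2 = evaluate(register, register)  # extended registers, full-width spill
case3 = evaluate(register, memory)    # extended registers, narrow spill

print(case1, case2, case3)
```

Only the third combination gives a nonzero result: the spilled copy of the first sum is rounded once more than the copy that stayed in the register.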

Regards,
   Geert


* Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-13 18:23 Stephen L Moshier
  1998-12-14  1:52 ` Harvey J. Stein
  0 siblings, 1 reply; 65+ messages in thread
From: Stephen L Moshier @ 1998-12-13 18:23 UTC
  To: tprince, egcs

The extra-precise registers are supposed to be a feature, not a bug.
Neither the computer language nor the compiler has a way to say
"this is an extra-precise register" so there is some inconvenience
using the feature.  It can't be made consistent.  The harder you look,
the more contradictions you find.

If you don't believe that, the alternative that makes sense is to
ask for straight IEEE behavior.  You can't get IEEE behavior without
setting the coprocessor rounding precision.  After you set the
rounding precision, all the other bugs disappear except for a rare
hardware bug or two.  The hardware bugs are dealt with by a trap
handler in the operating system, in the time honored fashion
of Intel, Borland, or Microsoft.

So there could be a straightforward plan to make x86 obey IEEE.
It's doubtful there could be a workable plan to fix the extra-precise
registers; anyway, they are a feature, no fix is needed!


end of thread, other threads:[~1998-12-20 11:28 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
1998-12-17 14:38 FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86 Toon Moene
1998-12-17 15:30 ` Harvey J. Stein
1998-12-18  1:54   ` Toon Moene
1998-12-18  3:05     ` Harvey J. Stein
1998-12-18  9:01       ` Toon Moene
1998-12-18 15:59       ` Richard Henderson
1998-12-18 13:26   ` Marc Lehmann
1998-12-18 12:50 ` Dave Love
  -- strict thread matches above, loose matches on Subject: below --
1998-12-17 14:37 tprince
1998-12-17 15:15 ` Stephen L Moshier
1998-12-17 11:27 Brad Lucier
1998-12-17 14:51 ` Marc Lehmann
1998-12-19  0:17   ` Craig Burley
1998-12-19  6:42     ` Emil Hallin
1998-12-19 14:26       ` Dave Love
1998-12-16 13:52 Toon Moene
1998-12-17 10:06 ` Craig Burley
1998-12-17 12:16   ` Harvey J. Stein
1998-12-19  0:29     ` Craig Burley
1998-12-17 11:20 ` Dave Love
1998-12-15 12:24 Toon Moene
1998-12-15 12:55 ` Joe Buck
1998-12-15 15:05 ` Edward Jason Riedy
1998-12-16 10:05 ` Craig Burley
1998-12-15 12:10 Geert Bosch
1998-12-15 13:09 ` Jeffrey A Law
1998-12-15  1:45 Geert Bosch
1998-12-15  3:34 ` Harvey J. Stein
1998-12-16 10:36   ` Craig Burley
1998-12-16 12:47     ` Harvey J. Stein
1998-12-17 10:22       ` Craig Burley
1998-12-17 14:54         ` Marc Lehmann
1998-12-19  0:27           ` Craig Burley
1998-12-19  5:06             ` Stephen L Moshier
1998-12-15  6:43 ` Stephen L Moshier
1998-12-16 10:14   ` Craig Burley
1998-12-15  9:29 ` Joe Buck
1998-12-15 10:14   ` Jeffrey A Law
1998-12-16  8:32     ` Sylvain Pion
1998-12-16  9:20       ` Craig Burley
1998-12-13 18:23 Stephen L Moshier
1998-12-14  1:52 ` Harvey J. Stein
1998-12-14 14:56   ` Edward Jason Riedy
1998-12-14 17:20     ` Joe Buck
1998-12-14 18:51       ` Edward Jason Riedy
1998-12-14 21:54         ` Craig Burley
1998-12-15 14:31           ` Edward Jason Riedy
1998-12-15 17:11         ` Jamie Lokier
1998-12-16  0:26           ` Harvey J. Stein
1998-12-16  9:33             ` Craig Burley
1998-12-16 12:18               ` Harvey J. Stein
1998-12-16  9:38           ` Craig Burley
1998-12-16 12:25           ` Marc Lehmann
1998-12-16 12:50             ` Tim Hollebeek
1998-12-16 13:04               ` Harvey J. Stein
1998-12-16 14:01               ` Marc Lehmann
1998-12-17 11:26                 ` Dave Love
1998-12-17 15:06                   ` Marc Lehmann
1998-12-18 12:50                     ` Dave Love
1998-12-19 14:09                       ` Marc Lehmann
1998-12-20 11:28                         ` Dave Love
1998-12-20 11:24               ` Dave Love
1998-12-16 23:11           ` Joern Rennecke
1998-12-17  6:07             ` Jamie Lokier
1998-12-14 22:54       ` Craig Burley
