From mboxrd@z Thu Jan 1 00:00:00 1970
From: Craig Burley
To: toon@moene.indiv.nluug.nl
Cc: burley@gnu.org
Subject: Re: FWD: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
Date: Wed, 16 Dec 1998 10:05:00 -0000
Message-id: <199812161805.NAA14993@melange.gnu.org>
References: <3676C529.FE8214F@moene.indiv.nluug.nl>
X-SW-Source: 1998-12/msg00588.html

>Am I the only one - apart from Harvey J. Stein and Tim Prince - who
>finds this whole discussion unreal ?  Surely, 80 bit temporaries might
>seem a neat hack to a numerical analyst like Dr. Kahan, but the
>ordinary computational physicist or chemist knows better than to
>choose "poorly conditioned" algorithms.

Then we have an awful lot of "extraordinary" people sending email
asking why FP doesn't work as expected in g77, gcc, egcs, and so on,
and campaigning for various corrections to Java, IEEE 854, x86, or
whatever!

>My main concern is that there is a grid spacing that will render the
>basic equation of geostrophy badly approximated in 32-bit arithmetic:
>
>     1  dp
>    --- -- = f v
>    rho dx
>
>p    pressure
>rho  mass of air per unit volume (1 kg / m^3)
>x    distance
>f    Coriolis parameter (10^-4 s^-1)
>v    wind speed
>
>You can do the math (p ~ 10^5 kg m^-1 s^-2, v ~ 10 m/s, what dx will
>make dp < 10^-3 p ?)

I can't figure out what you're saying.  How will *not* randomly
spilling a computed 80-bit intermediate value to a chopped-down 64-bit
result make your code stop working, exactly?

>If that's the case we have to rethink our finite difference code for
>the first time in 13 years and use a trick like subtracting a basic
>state from the equations - big deal.

I still don't know what you're saying.  How does fixing this
longstanding bug in gcc/g77 break your code, exactly, in terms I, as
*not* a math expert, can understand?

>The last thing I need is to have egcs slowed down to a crawl by having
>it spill unaligned 80-bit temporaries for something that shouldn't be
>larger than 32 bits in the first place.

We don't *know* that it'll slow down to a crawl.  Spilling outside of
function-call return values seems to be rather rare; spilling return
values probably happens less when optimization is turned on, and,
besides, you're doing a *call* already!  I don't like that stack
frames will get somewhat larger, though.

>Please make this and other "accuracy" options a "-pedantic-numerics"
>one.

Sounds like you're arguing in favor of the default being extreme speed
at the expense of correct, consistent behavior.

IMO, your experience represents that now-rare breed of programmers:
People Who Know What They're Doing.  And, if shops like yours can
rewrite all your code, at huge expense, to no longer depend on
language support for 64-bit integers, to get it to run on the fastest
multi-million-dollar supercomputer you could get ahold of...

...then you can darn well use the `-fchop-fp-spills' option I've
proposed (though, in my original proposal, I hadn't yet proposed a
name for it) when you decide you want your working code to run faster.

I think that's pretty fair to ask of you, rather than ask the millions
of people who we want to use gcc, g77, and so on, over the next few
years, to use a special option to get their fast code to start working
in the first place!
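[A minimal worked sketch, not from either message, of the grid-spacing
question quoted near the top: it simply plugs Toon's numbers into the
geostrophic relation, reading it as dp ~= rho * f * v * dx.  The
program and its names are illustrative only.]

  /* At what grid spacing dx does the pressure difference between
     neighbouring grid points drop to 10^-3 of the pressure itself?
     From (1/rho) dp/dx = f v we get dp ~= rho * f * v * dx.  */
  #include <stdio.h>

  int main (void)
  {
    double p   = 1.0e5;   /* pressure, kg m^-1 s^-2 (Pa)  */
    double rho = 1.0;     /* air density, kg m^-3         */
    double f   = 1.0e-4;  /* Coriolis parameter, s^-1     */
    double v   = 10.0;    /* wind speed, m s^-1           */

    /* dp < 1e-3*p  <=>  rho*f*v*dx < 1e-3*p  <=>  dx < 1e-3*p/(rho*f*v) */
    double dx = 1.0e-3 * p / (rho * f * v);

    printf ("dp reaches 1e-3 * p at dx = %g m (about %g km)\n",
            dx, dx / 1000.0);
    return 0;
  }

[This prints dx = 100000 m, i.e. roughly 100 km; on grids finer than
that, the difference p(x+dx) - p(x) retains only a few of float's
roughly seven decimal digits, which seems to be the conditioning worry
being raised.]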
In particular, I'd rather people who use g77 to do their numerical
work be able to remain experts in their fields, rather than have to
become experts in compiler code generation, which they'll have to be
to know what options to use.  They might have to increase their
computer expertise to get things to run *faster*, but, generally, I
think we should not require people to be experts on deep-down details
of how particular pieces of software do their job just to get their
straightforwardly-written code to *work* in the first place.  I know,
"late answers are wrong answers", but wrong answers remain wrong
answers no matter how quickly one obtains them.

And since most people have far less expertise and resources than shops
like yours, Toon, we don't want to burden them by telling them all the
options they must use to get code to work "as expected", all of which
carry a red flag saying "this will slow down your code, but if you
know what you're doing you can avoid it" -- which will cause *most* of
these programmers to say "of course I know what I'm doing", forget the
option, and get wrong results.

I think we'd be far better off, thinking globally and into the future,
if the defaults tended towards correctness and consistency, and the
options that changed behavior generally said things like "if you know
your code never depends on ..., you can use this option to possibly
get better performance".  We're less likely to get spurious bug
reports using this approach, at least -- compare the number of
spurious bug reports we've gotten from people *using* -ffast-math
versus those *forgetting* to use -ffloat-store, for example!

Ideally, the philosophy I'm promoting above would extend to ensuring
much more consistency across *all* GNU platforms.  As I've said
before, this'd mean defaulting to -mieee (or even -mieee-with-inexact)
on Alphas, completely emulating IEEE arithmetic in software on a few
older machines, and so on, and I don't think the industry would
welcome the kind of performance drop we'd get as a worthwhile tradeoff
for the small amount of extra consistency...at least, not right now,
and, besides, if users want Java, they know where to get it (at least
for the moment, in theory, when the moon is just overhead ;-).

But there's a *clear*, widespread lack of understanding among today's
programmers that f(x) < f(y) does not imply f(y) > f(x) even when f
has no side effects or external references and x and y are constants.
I think it's easier for us to at least try to live with this lack of
understanding by fixing the compiler to meet the expectations of this
huge audience than to try to teach them all to at least use some
option like -pedantic-numerics, much less teach them all about why
internal compiler code generation can produce such amazing results.

From my compiler-internals perspective, I'm flummoxed as to why
*anyone* with knowledge of the issues would claim that 80-bit values
should be randomly chopped down to 64 bits by the compiler as a
*default*.  From that perspective, performance simply isn't an issue
-- if it were, we could simply not spill *anything*, just re-use
random data, and get *great* performance, if consistent results were
so unimportant to us.  The FP register stack contains 80-bit
registers.  *Exactly* why should we not spill them *correctly*?

        tq vm,
        (burley)
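[A minimal C sketch, not from the original message, of the kind of
excess-precision surprise behind the f(x) < f(y) remark above,
assuming classic x87 code generation; whether the "unequal" branch is
actually taken depends on the compiler, target, optimization level,
and whether -ffloat-store is used, and the function names here are
illustrative only.]

  /* On x86, f's result can be computed in an 80-bit x87 register;
     'stored' gets rounded to a 64-bit double in memory, while the
     recomputed f(1.0) may still carry excess precision when the two
     are compared -- so "the same" value can compare unequal.  */
  #include <stdio.h>

  static double f (double x)
  {
    return x / 3.0;             /* inexact in binary floating point */
  }

  int main (void)
  {
    double stored = f (1.0);    /* spilled/stored as 64 bits */

    if (stored != f (1.0))
      printf ("f(1.0) != f(1.0): excess precision leaked through\n");
    else
      printf ("consistent this time (no visible excess precision)\n");
    return 0;
  }

[Rebuilding with -ffloat-store, or at a different optimization level,
may flip which message prints -- the sort of inconsistency being
argued about here.]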