public inbox for gcc@gcc.gnu.org
* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-05  5:48 dewar
  2001-08-05 11:36 ` Laurent Guerby
  0 siblings, 1 reply; 50+ messages in thread
From: dewar @ 2001-08-05  5:48 UTC (permalink / raw)
  To: dewar, guerby; +Cc: gcc, jthorn

<<3. unless 2 turned up unfixable practical problems, the optimization
is enabled under -funsafe-fast-math

Do you actively oppose this proposal?
>>

Yes: field testing should also show that there is some gain. There is no
point in having non-"correctness"-preserving transformations unless there
is at least some evidence of an advantage. Probably the best approach is to
first put in the optimization under its own special switch. Then, if there
is a real gain and no unacceptable surprises, it can be made part of
-ffast-math.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-06 18:51 dewar
  0 siblings, 0 replies; 50+ messages in thread
From: dewar @ 2001-08-06 18:51 UTC (permalink / raw)
  To: gcc, kth

<<If gcc does this transformation without knowing a lot
about all of the numbers, then I'd suggest disabling it.
Does it do this with integers? I'd really hope not.
>>

No, of course not: this would be an invalid transformation for integers,
and no one would argue otherwise.

For reals, it is not so clear that everyone argues against such
transformations. In particular, if you really think that any transformation
that is valid for real arithmetic is OK, your viewpoint is not clear on this
particular issue (yes, I know the "but don't do anything absurd" exception,
but I don't know how to define that, let alone implement it :-)

<<It would make even less sense if you allowed K to go to zero.
You could then transform any division using this optimization
through the steps "a/b" => "(a*K)/(b*K)" => "(a*0)/(b*0)" =>
"0/0".
>>

Of course the transformation is invalid if K = 0.0, because then it is not
even valid in real arithmetic. Optimizers have to prove the necessary
preconditions before they blindly apply transformations!

<<If you allow a transform like this, you would also have to allow
a transform of "(a/K) / (b/K)", which is essentially the same
transform (just invert the constant). Now allow K to go to 0,
or to infinity.
>>

Again, I didn't bother to be explicit in my earlier note, because I
thought it was obvious, but I was only talking about cases where K
is non-zero and finite, since otherwise the transformation is just
plain wrong in both real arithmetic and floating-point arithmetic.

Let's at least assume (please let this be right) that no one proposes
a transformation that is valid neither in the floating-point model
being used nor in real arithmetic.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-06  2:23 dewar
  2001-08-06  9:50 ` Kevin Handy
  0 siblings, 1 reply; 50+ messages in thread
From: dewar @ 2001-08-06  2:23 UTC (permalink / raw)
  To: gcc, ross.s

<<Can we please take it as given that no transformation will be applied
without something at least vaguely resembling a rational argument that
it could be an actual optimisation?
>>

Once you allow a compiler to make transformation X, the interaction
of transformations is often such that the idea of a "rational argument" is
highly dubious. I already gave an argument showing why it is perfectly
believable that a compiler would transform

a / b

into

(a*K) / (b*K)

Sure, it is a somewhat unlikely combination of circumstances, but if there
is one thing that anyone who has worked on optimizing compilers knows, it is
that unlikely combinations of things do happen, in ways that are often very
hard to anticipate. Optimizers are not in the business of "well, this
speeds up most programs and miscompiles others, but we can't tell you
which is which".
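
For concreteness, a minimal sketch (the values here are purely
illustrative, not from any earlier note) of how the rewritten form can
ruin a perfectly safe division, even though the rewrite is exact over
the reals:

  #include <stdio.h>

  int main (void)
  {
    double a = 1.0e200, b = 2.0e200;    /* both well inside double range */
    double K = 1.0e200;                 /* large scale factor            */

    double direct = a / b;              /* 0.5, computed exactly         */
    double scaled = (a * K) / (b * K);  /* a*K and b*K overflow to inf,
                                           and inf/inf is a NaN          */

    printf ("a/b         = %g\n", direct);
    printf ("(a*K)/(b*K) = %g\n", scaled);
    return 0;
  }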

So I am afraid that we can't take that as a given, ...

Your position seems to be that a knowledgeable programmer should write the
program in a manner such that it has guaranteed correct semantics regardless
of whether the floating-point model is real arithmetic, IEEE arithmetic,
or some mixture. That's extremely difficult to do. You might want to review
some of the arguments over the Brown model of floating-point as used in 
Ada 83. This model does in fact go part of the way to this position (though
it comes nowhere near allowing arbitrary transformations), and even with this
much more limited view (for one thing it allows arbitrary extra precision
all the time), a number of numerical analysts have argued that it makes it
impossible to analyze programs properly from an error limit propagation
point of view.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-05 18:03 dewar
  2001-08-05 20:16 ` Ross Smith
  0 siblings, 1 reply; 50+ messages in thread
From: dewar @ 2001-08-05 18:03 UTC (permalink / raw)
  To: gcc, ross.s

<<And you're still asserting that and simply expecting me to take your
word for it, presumably because you're a Real Expert and I'm a mere peon
who writes (ugh) games. I keep asking you to provide evidence for that
assertion and you keep failing to do so.
>>

OK, my assertion is that 

IF you accept the principle that the compiler is allowed to perform
any transformation that would be valid in real arithmetic,

THEN, all programs can raise overflow right away.

Why? Because you can transform

  a / b

into

  (a * K) / (b * K)

for any large K, causing overflow. If you go back over previous notes, you
will see how that could in fact occur in practice.

Now if you say this is "blatantly extreme" or "gratuitous"

and that what you really meant to say was

A compiler is allowed to perform any transformation that would be valid
in real arithmetic unless the transformation is blatantly extreme or
gratuitous.

then you have not helped much, since you have not defined what exactly
you mean by blatantly extreme or gratuitous, and in fact this is what
the whole thread is about. Not everyone is going to agree on what is
BE or GR, and we have to discuss specific examples to reach a consensus.
Any extreme position here is unsupportable, and there is no simple
principle to set a dividing line.

<<"If analysis shows this is reliable." But your example was one where it
was blatantly obvious that it _wasn't_ reliable. It's the same old
problem again: you keep coming up with examples that rely on the
assumption that I'm incapable of doing the analysis.
>>

No, you apparently missed the point of the NR example.

If the arithmetic is predictable, and the compiler maps the high-level
language into the set of IEEE operations that could reasonably be expected
(a criterion met by many compilers), then the analysis is straightforward;
indeed this is very common coding, and quite correct under the
assumption of "sensible" arithmetic.

The point was not that the NR loop can be disrupted by reasonable
transformations like extra precision on an ia32, but rather to point
out the much more radical transformation that recognizes that this
loop will never terminate in real arithmetic. Of course that transformation
is indeed gratuitous and blatantly extreme. I only gave the example to
point out that the absolute real-arithmetic standard, though well defined,
is unsupportable.

Indeed the example served its purpose: you are no longer taking the
absolute real-arithmetic rule as gospel, since you have added the
gratuitous and blatantly extreme exceptions.

So we are once more moved away from the (useless) exercise of trying to
establish simple predicates here, and arrive back at a subjective
standard which will require case-by-case discussion (which was my point
in the first place).

<<the code and serve no plausible optimisation purpose (e.g. multiplying
both sides by some large factor), and (and I want to emphasise this yet
again because it's the root of your misapprehension) _without assuming
that the coder doesn't understand floating point as well as you do_.
>>

Two more points. First, please reread my previous example, which showed that
multiplying both sides by a large factor *can* serve some plausible
optimization purpose.

Second, the issue is not how well the coder understands floating-point;
it is a matter of coming up with a well-defined set of transformations
that a coder who understands floating-point well can deal with.

Being an expert in floating-point arithmetic is pretty useless if you haven't
a clear definition of the floating-point model you are working with.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-05 15:37 dewar
  2001-08-05 16:46 ` Ross Smith
  0 siblings, 1 reply; 50+ messages in thread
From: dewar @ 2001-08-05 15:37 UTC (permalink / raw)
  To: gcc, ross.s

<<Your Newton-Raphson example is bogus because it relies on the assumption
that anybody who asks for loose FP must not know what they're doing. It
wouldn't be a problem for me because I know perfectly well not to write
code like that. I expect most of the other people who have asked for
loose FP are at least as intelligent as I am.
>>

It was just an example. If you really think that it is acceptable to
allow any transformation that is valid for real arithmetic under all
circumstances, then we are too far apart to even have a discussion, since
this criterion is completely unworkable for many reasons. Again,
briefly: it would allow transformations that would cause all programs
to immediately generate overflows. You can't really mean that you
support this criterion without any exceptions.

Going back to the NR example, there is nothing wrong at all about programming
an NR iteration to exact equality if analysis shows this is reliable, since
it is often the fastest way of doing things. Remember that the estimate in
an NR iteration is an *exact* representation of the estimate (if you were
doing interval arithmetic, you would shrink the interval to zero for
the estimate).

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-05 15:27 dewar
  0 siblings, 0 replies; 50+ messages in thread
From: dewar @ 2001-08-05 15:27 UTC (permalink / raw)
  To: dewar, guerby; +Cc: gcc, jthorn

And yes, I still oppose your proposal as written: it fails to make
clear that positive performance data should be required to support any
transformation.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-05 15:26 dewar
  0 siblings, 0 replies; 50+ messages in thread
From: dewar @ 2001-08-05 15:26 UTC (permalink / raw)
  To: dewar, guerby; +Cc: gcc, jthorn

>>Do you actively oppose this proposal after rereading it?

Laurent, you should reread your procedure. Yes, you mentioned performance
in step 2, but in step 3 you said that the optimization would be accepted
if no difficulties were found; you did not make it clear that a lack of
performance-advantage data would be considered a fatal flaw. In fact, you
don't even really hint at that critical step.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-05  7:31 dewar
  0 siblings, 0 replies; 50+ messages in thread
From: dewar @ 2001-08-05  7:31 UTC (permalink / raw)
  To: dewar, toon; +Cc: gcc, ross.s

<<2. It's absolutely necessary to provide a web page with our consensus
   on these matters.
>>

Indeed, I think Toon's approach here is very helpful.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-05  5:55 dewar
  0 siblings, 0 replies; 50+ messages in thread
From: dewar @ 2001-08-05  5:55 UTC (permalink / raw)
  To: dewar, guerby; +Cc: gcc, jthorn

<<with a "clear set of criteria", whatever that means - and there will
always be someone who shows up with something that will be badly broken by
the optimization.)
>>

The fact that some particular program breaks with -ffast-math is not
by itself decisive. You can't eliminate surprises completely if you
start playing this game. If someone comes up with a response like

"Hey, I turned on optimization xxx in my program and the results are wrong"

Then that conveys no information without further analysis. What we need in
that case is to understand *why* the results are wrong. It might for
example be the case that the computation in question is unstable, and any
change, even a legitimate one, could discombobulate the results.

An example: I would definitely think that -ffast-math should allow extra
precision at any point. (You can also argue this should be on by default;
it is certainly allowed to be on by default in Ada, and I believe that in
at least some cases it is on by default in GNU C, but it is definitely
NOT on by default in many other compilers; e.g. IBM avoids this on POWER
architectures unless a special switch is set.)

But there are certainly algorithms which blow up with extra precision.
A simple example of extra precision causing a loss of performance: you
program some iteration which you know from your analysis is stable to
equality, but now the equality test is between higher-precision numbers,
and either your analysis did not cover this case and the computation no
longer converges, or it takes much longer to converge.

A specific example is using Simpson's rule for integration. Especially
with truncating arithmetic, you get a behavior where the result converges
as you reduce the interval, then starts to diverge. Changing the precision
can greatly change this curve of convergence.
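
For concreteness, here is a small single-precision sketch (my own
illustration, with an arbitrarily chosen integrand) that exhibits this
convergence curve: refining the mesh drives the error down until
accumulated round-off takes over.

  #include <stdio.h>
  #include <math.h>

  /* Composite Simpson's rule in single precision.  The error typically
     falls as the mesh is refined, then levels off, and can grow again
     once accumulated round-off dominates the truncation error.  */
  static float simpson (float (*f) (float), float a, float b, int n)
  {
    float h = (b - a) / n, sum = f (a) + f (b);   /* n must be even */
    for (int i = 1; i < n; i++)
      sum += (i % 2 ? 4.0f : 2.0f) * f (a + i * h);
    return sum * h / 3.0f;
  }

  static float fexp (float x) { return expf (x); }

  int main (void)
  {
    float exact = expf (1.0f) - 1.0f;   /* integral of exp over [0,1] */
    for (int n = 4; n <= (1 << 20); n <<= 2)
      printf ("n = %7d   error = %g\n", n,
              fabsf (simpson (fexp, 0.0f, 1.0f, n) - exact));
    return 0;
  }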

So the input from the "group 2" folks, who program in fpt but don't know
or care enough to do careful analyses, or simply don't have the time, must
be considered carefully. We are talking about optimizations that definitely
have the potential for upsetting results when the optimization is turned on,
so the fact that this indeed happens is not by itself an absolute argument
against the optimization.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-05  5:36 dewar
  2001-08-05  6:43 ` Toon Moene
  0 siblings, 1 reply; 50+ messages in thread
From: dewar @ 2001-08-05  5:36 UTC (permalink / raw)
  To: gcc, ross.s

Part of the trouble here is that I get the feeling that quite a few people
are discussing these issues for the first time, and reacting ab initio.
As I said earlier, these issues are very old, and it is well worth going
back to the old literature that discusses them. In particular, the
discussions of floating-point evaluation in the Fortran standards are of
course highly relevant here.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-05  5:35 dewar
  0 siblings, 0 replies; 50+ messages in thread
From: dewar @ 2001-08-05  5:35 UTC (permalink / raw)
  To: gcc, ross.s

<<You'll have to show me a _realistic_ example before I'll believe that. I
don't consider anything involving the likes of DBL_MAX or 10**1000 to be
realistic.
>>

OK, but there are plenty of casual programs that use large numbers; that's
why we expand the exponent range when we go to double, as well as expanding
the mantissa. Poorly designed fpt systems in which this is not done (e.g.
the mainframe 360) proved a real nuisance, not to experts, who of course
carefully prove that their values are in range and avoid overflows, but
to casual programmers. It's often surprising how large intermediate
values can get.
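
A tiny illustration (values of my choosing) of the kind of surprise a
wider exponent range spares the casual programmer: the intermediate sum
of squares overflows single precision although the result is comfortably
in range.

  #include <stdio.h>
  #include <math.h>

  int main (void)
  {
    float x = 3.0e20f, y = 4.0e20f;
    float naive = sqrtf (x * x + y * y);  /* x*x = 9e40 overflows float */
    float safe  = hypotf (x, y);          /* internally scaled: 5e20    */
    printf ("naive = %g\nsafe  = %g\n", naive, safe);
    return 0;
  }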

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-05  3:51 dewar
  2001-08-05  5:38 ` Laurent Guerby
  2001-08-05 13:43 ` Ross Smith
  0 siblings, 2 replies; 50+ messages in thread
From: dewar @ 2001-08-05  3:51 UTC (permalink / raw)
  To: guerby, jthorn; +Cc: gcc, jthorn

<<Following the thread there seem to be two camps that have quite
incompatible viewpoints: (1) experts, (2) other people. I suggest
the following scheme:
>>

The point is that the rules have to be coherent, well documented, and
understood. You are NOT going to achieve that unless you have input from
people who understand :-)

In particular, we need a clear set of criteria, and it is not going to be
easy to sort through them. People often ask for X in floating-point without
fully realizing the consequences of X. For example, we had at least one
person in category 2) here say that they were happy with any optimization
that would be valid for real numbers.

But that's obviously wrong (i.e. they would NOT be happy with the result).
Let's give an example of such an optimization.

Suppose you write a Newton-Raphson sqrt that iterates to exact equality on
the estimate. In many floating-point models, that's perfectly valid, and often
the most efficient representation of the algorithm.
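
As a concrete sketch (mine, not from any actual numerical library, and
assuming x > 0), such an iteration might look like this, with a guard
for the two-value cycle that rounding can produce:

  /* Newton-Raphson square root, iterating to *exact* equality on the
     estimate: no epsilon anywhere.  In floating-point the iteration
     reaches a fixed point (or oscillates between two adjacent values,
     which the prev2 test catches); in real arithmetic the equality
     would never hold.  */
  double nr_sqrt (double x)
  {
    double prev2 = 0.0, prev = 0.0, est = x;
    while (est != prev && est != prev2)
      {
        prev2 = prev;
        prev  = est;
        est   = 0.5 * (est + x / est);
      }
    return est;
  }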

However, in real arithmetic, this loop would never converge. Does that mean
an optimizer is allowed to replace the loop with

  L1: JMP L1

Well the answer is yes, if you accept the bogus criterion that any
transformation that would be valid for reals is valid.

Now you probably react: well, of course that's nonsense. Fine, it is! But
*why* is it nonsense? Because the criterion was nonsense.

So what you need to do is to have group 2)

a) come up with a proposed "optimization"
b) by experiment, or other convincing argument, show that it is a worthwhile
   optimization (any optimization should have to bear this burden) in either
   saving of time or space.

Now group 1) will point out the *full* consequences of the transformation.

Once everyone understands the full consequences, then if no one is too unhappy
with the consequences, you can include it.

P.S. In my experience, it is group 2) who get most upset with unexpected
consequences. They tend to expect that the only effect will be a little
bit of noise in the last decimal digit, and are the first to complain
loudly when turning on optimization radically changes results. The experts
of course know better what to expect, and will not be surprised (though
they may complain :-)

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-04  6:17 dewar
  0 siblings, 0 replies; 50+ messages in thread
From: dewar @ 2001-08-04  6:17 UTC (permalink / raw)
  To: gcc, jthorn

>>This is clearly an area where one size does *not* fit all programs!

indeed ...

[parent not found: <url:http://gcc.gnu.org/ml/gcc/2001-08/msg00221.html>]
* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-04  5:37 dewar
  0 siblings, 0 replies; 50+ messages in thread
From: dewar @ 2001-08-04  5:37 UTC (permalink / raw)
  To: jsm28, jthorn; +Cc: gcc, jthorn

<<When we have an implementation of this sort of division in GCC, clearly it
should go in libgcc as a library function.
>>

That's non-optimal unless you inline, since this is a case where you really
want to be able to do extensive scheduling.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-04  5:27 dewar
  0 siblings, 0 replies; 50+ messages in thread
From: dewar @ 2001-08-04  5:27 UTC (permalink / raw)
  To: gcc, jthorn; +Cc: jthorn

With regard to complex arithmetic, you really have to look at the defining
standard for guidance here. I am not sure what the Fortran standard says,
but Annex G of the Ada standard specifies that overflow (or infinities)
can occur only if the *result* is out of bounds, so something like your
method c is required if you claim to support Annex G. But most certainly
a less strict mode can use a) or b). The scaling of c) can indeed be
expensive, though the test is not necessarily that expensive if you have
good scheduling: complex divisions are often surrounded by other operations
that can be scheduled, and indeed on an EPIC machine you can simply
speculate, do both computations, and choose the right one at the end.
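
For reference, the scaling in question looks something like Smith's
classical algorithm (an illustrative sketch of one form of "method c",
not code from any particular compiler):

  #include <stdio.h>
  #include <math.h>

  /* Scaled complex division (a+bi)/(c+di): dividing through by the
     larger of |c|, |d| avoids overflow in the intermediate c*c + d*d
     of the textbook formula when the true quotient is in range.  */
  static void cdiv (double a, double b, double c, double d,
                    double *re, double *im)
  {
    if (fabs (c) >= fabs (d))
      {
        double r = d / c, t = 1.0 / (c + d * r);
        *re = (a + b * r) * t;
        *im = (b - a * r) * t;
      }
    else
      {
        double r = c / d, t = 1.0 / (c * r + d);
        *re = (a * r + b) * t;
        *im = (b * r - a) * t;
      }
  }

  int main (void)
  {
    double re, im;   /* the textbook formula would overflow here */
    cdiv (1.0e300, 1.0e300, 2.0e300, 1.0e300, &re, &im);
    printf ("%g + %gi\n", re, im);   /* prints 0.6 + 0.2i */
    return 0;
  }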

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-04  3:08 dewar
  0 siblings, 0 replies; 50+ messages in thread
From: dewar @ 2001-08-04  3:08 UTC (permalink / raw)
  To: gcc, jthorn; +Cc: jthorn

<<For (binary) IEEE arithmetic, transforming   (...)/2.0  into   (...)*0.5
won't change program results, because 1/2 is an exactly representable
(binary) IEEE floating point number and IEEE specifies that the result
is the true mathematical result (identical in both cases precisely because
1/2 is exactly representable) rounded to the destination format.  (I think
this argument is still valid even in the presence of +/- 0, +/-infinity,
and NAN.)  So, I would argue that gcc could reasonably leave this
transformation enabled by default.  (We could of course also provide a
"do arithmetic exactly as I wrote it" mode which disables this.)
>>

But cases where the result is guaranteed the same are of COURSE fair game
for the code generator to do what it likes. There is no sense in which the
use of the / operator specifies that a division instruction must be generated
(indeed such a rule is nonsense on a machine like the i860, which has no
division instruction :-) So we aren't discussing cases like this at all.

Yes, indeed the reciprocal transformation where it is safe is a routine
transformation that any decent code generator should do all the time.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-04  2:11 dewar
  2001-08-04 17:24 ` Ross Smith
  0 siblings, 1 reply; 50+ messages in thread
From: dewar @ 2001-08-04  2:11 UTC (permalink / raw)
  To: gcc, ross.s

<<Speaking as someone with an interest in games programming, I would
_love_ to have an option that tells GCC "apply any transform that would
make sense for true real numbers, and to hell with the last few decimal
places".
>>

Please reread my previous example. The trouble is that "apply any transform
that would make sense for true real numbers" does not just affect "the last
few decimal places", it can have much more radical effects, such as making
all your programs die immediately with overflow errors even if your numbers
are nowhere near the limit of overflow in the canonical evaluation semantics.

Now of course you react by saying, "well of course that's not acceptable", but
then we are left with the situation where you are

a) claiming that the criterion is a simple one, namely allow any
transformation that is allowed for real numbers

b) not willing to accept the consequences of this proposal

Furthermore, do you really mean "the last few decimal places"? Remember that
in 32-bit mode you only have about 6 significant decimal digits to start with.

<<This is why I can't understand the position you and Gabriel are taking.
You're acting as though _you_ were going to be forced to use this
option. Since you obviously want last-decimal-place precision,
presumably you're going to use -mieee instead. So why do you care what
the extreme maths option does?
>>

First of all, last-decimal-place precision does not have much to do with
anything; surely any serious analysis must be in terms of ULPs.
Second, you completely misinterpret our position.

Suppose we take the position that we allow any transformation that does
not affect any result by more than X ULPs. Now we can discuss the value
of X; I would guess most people are thinking in terms of a "few" ULPs,
though you seem to be in the range of 50,000 or something like that, which
is really radical.
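
Measuring in ULPs is mechanical, by the way; a minimal sketch, assuming
finite doubles of the same sign:

  #include <stdint.h>
  #include <string.h>

  /* Distance in units in the last place, using the fact that IEEE-754
     doubles of the same sign order like their bit patterns viewed as
     integers.  Finite, same-sign inputs assumed.  */
  int64_t ulp_distance (double a, double b)
  {
    int64_t ia, ib;
    memcpy (&ia, &a, sizeof ia);
    memcpy (&ib, &b, sizeof ib);
    return ia > ib ? ia - ib : ib - ia;
  }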

The trouble is that for any value of X, any of the transformations we are
talking about can exceed X in some boundary cases.

Now you probably want to react by saying that you don't care about boundary
cases. Fine, but now you have to define what you mean by boundary cases,
and that, as we know from some of the discussions in the thread, is not
at all easy.

My guess is that a reasonable summary of your position (and others who share
it) is something like:

"do any reasonable transformation that would be OK for real arithmetic, where
reasonable is defined as not affecting my programs significantly, where
significantly is defined as not affecting more than the last bits of accuracy.

The trouble with this approach is

a) we can't easily get people to agree on just how much accuracy can be lost.
Sloppy floating-point programming is one thing, but if you get an answer
printed out that says:

   3.75428

and you turn on -ffast-math, and the answer is

   3.77875

then whether that is informally acceptable will vary greatly from one
person to another.

b) Even if we do agree on some kind of accuracy bound, we still have the
problem that you may turn on -ffast-math and not get any answer at all.
Either all your results are NaNs and infinities, or, depending on the
hardware, you terminate with a floating-point exception.

c) You can't issue an absolute edict against b) happening, because all of
the transformations involved can cause the result of b) in some cases.

d) Even if overflow does not occur, there will be boundary cases in which
the results are radically affected. In other words, your program might 
print a result of

   3.75428

and then you turn on -ffast-math, and you get

   -6.7E+42

e) Again, you can't issue an absolute edict against d) happening, because
all of the transformations can cause the result of d) in some cases.

f) It is hard to agree on, or clearly define, what one means by boundary
cases in which the behaviors of b) and d) are considered acceptable.

Floating-point is difficult. It is not so easy to implement DWIM :-)

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-03 18:31 dewar
  2001-08-03 20:34 ` Ross Smith
  0 siblings, 1 reply; 50+ messages in thread
From: dewar @ 2001-08-03 18:31 UTC (permalink / raw)
  To: gcc, ross.s

<<Stop telling us what we're allowed to want.
>>

We are just trying to understand what you want; it was you who said
that you do not want optimizations that change program results, not me!
Now you change it to "change program results enough to matter". And
that of course is the point: we have to agree on what is enough to matter.

The problem is agreeing on a criterion. As I have said many times, for
me the *primary* criterion is that the optimization actually make a real
noticeable difference in performance. There is no point in allowing
non-identity transformations that do not provide a noticeable speedup.

Then, if the speedup *is* significant, it makes sense to discuss
just how safe the transformation is, and whether it meets the collected
consensus understanding of what "changing program results but not enough
to matter" might mean.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-03 14:55 dewar
  2001-08-03 16:57 ` Ross Smith
  0 siblings, 1 reply; 50+ messages in thread
From: dewar @ 2001-08-03 14:55 UTC (permalink / raw)
  To: gcc, jthorn; +Cc: jthorn

<<Having optimization change program results is a trickier case.
>>

But *all* the "optimizations" you suggest *do* change program results!

[parent not found: <url:http://gcc.gnu.org/ml/gcc/2001-08/msg00009.html>]
* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-01 10:27 Reichelt
  2001-08-01 10:42 ` Gabriel Dos_Reis
  0 siblings, 1 reply; 50+ messages in thread
From: Reichelt @ 2001-08-01 10:27 UTC (permalink / raw)
  To: gcc

Wolfgang Bangerth wrote:

> Due to repeated demand: here is the opinion of a numerical analyst.

And here's a second opinion from a numerical analyst (me).

> To further define my position: there is some kind of schism in the numerics
> society, the "traditionalists" doing Fortran with highly tuned codes and
> relatively simple data structures, and the "modernists" using extremely
> large C++ programs on complex data structures. I belong to the latter
> group and can only speak for them. I know that the former group has
> different views on these topics.

I'm also a member of the latter group.

> My opinion is that -ffast-math could well include slight deviations from
> IEEE modes for denormals, rounding modes, associative redistribution, etc.

For me this is true also.

> The reasons being:
> - denormals: if you run into the risk of getting NaNs one way but not
>   the other, you most certainly already have gone beyond the limits of
>   your program, i.e. your linear solver diverged, you have chosen an
>   inherently unstable formulation for your problem, etc. In practice, if
>   you don't get your NaN on this operation, you will get it some operations
>   later, and if not, then the result is spoilt anyway because of the
>   instability. I cannot imagine someone writing code which runs on
>   a well-defined side of the boundary between normal and denormalized
>   values and still returns reasonable results. And even if such a program
>   existed, then that person must know exactly what he is doing and will
>   refrain from -ffast-math and use -mieee instead.

Agreed.

> - rounding modes: if it makes a difference, then again you are already
>   in an unstable regime.
>   Then why bother: I know that I only solve an
>   approximation of the true laws of nature (e.g. after discretizing a
>   partial differential equation), why solve that to 16 digits of accuracy?

I think you are mixing two different things here.

To make things a little more precise you have to look at the different
errors you're introducing when numerically solving a problem:

(1) difference between the (continuous) model and reality (including noisy
    signals etc.),
(2) discretization error (difference between the continuous and the finite
    problem),
(3) round-off errors (all kinds of errors that stem from the fact
    that you have to approximate real numbers by floating-point
    numbers with finite precision).

The point is: if the errors caused by (3) are smaller than the errors
from (1) and (2), then you're fine with -ffast-math.

With ill-conditioned problems this is not always the case.
You might get 2 or 3 correct digits out of your computation
if you are very careful with rounding, but lose them if you
don't do it right. But then: use -mieee.

> - association: Fortran codes differ significantly from C++ codes in
>   some ways, not least that C++ codes are often split into many
>   very small parts. In these cases, I have no control over how the
>   operands associate if they come from different functions. For example,
>   if I have (a very much simplified situation)
>     double x, y;
>     double f () { return x; };
>     double g () { return y; };
>     double h () { return x; };
>     ...
>     double z = f() * g() / h();
>   then I'd be happy if I get z==y, although that might not be right
>   from an IEEE viewpoint, or if f()+g()+h() would be turned into
>   2*x+y. If you think that this example is unrealistic,
>   think about using the C++ valarray<> class with its small functions
>   combined with a case where some of the function's arguments are constant
>   and known so that most of the operations in the body of the functions
>   can be optimized away. If the compiler can help optimize such cases
>   across the boundaries of single functions, then that would be great, but
>   it will often require violating associative laws.
>
>   Note that this is a case where C++ differs significantly from Fortran:
>   there, you usually have large functions and optimizations take place
>   within these.

Agreed.

In C++ you often don't want to tune your code for speed but choose
readability instead. It would be nice if the compiler did the tuning.
Deciding whether the compiler may do this in an aggressive manner
should be left to the user. He is the only one who can make that decision,
depending on whether the additional errors are tolerable or not.
Those who cannot make this decision beforehand could stick to harmless
optimizations or just try the effects of -ffast-math. (They probably
wouldn't write code that produced tolerable results with IEEE and
intolerable results without, so it's of no use to stick strictly to IEEE anyway.)

I opt for quite aggressive (and well documented) optimizations with
-ffast-math (with a big warning in the documentation). But -ffast-math
should not be turned on with -O2 by default. Maybe with -O3, but
I think it's best to leave it to the user completely.

Probably even better: make different levels of optimization, where one
can decide how much 'sticking to IEEE' (I don't want to say 'safety'
here) one is willing to trade for speed.

I see that this might be difficult if you want to compile a library.
Some people might want the fastest version, others the one that
sticks to IEEE. Having the source code of the library available, you
can compile two versions if you need/want to. So even in that case
it's better to have the choice instead of having to stick to IEEE.
If you don't have the source code, then you are dependent on what
the vendor chooses. But hopefully he will make the right decision
(you trusted him with other decisions, didn't you?).

My conclusion is: Give the user the option to switch on/off quite
aggressive optimizations/transformations whenever he pleases.

> So, concluding: if you have programs that run for several days, you'd be
> happy if you could cut that time by some hours using optimizations that
> - still do what common sense would dictate (IEEE is not pure common sense,
>   but rather a definition), i.e. swapping operands would be allowed, but
>   replacing NaNs by zero would not
> - can be switched off in case I know I have to treat boundary cases
>   exactly.
> I myself would certainly appreciate a compiler turning a*c+b*c into
> (a+b)*c; I'd definitely not be shocked by that.

Agreed.
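
A two-line demonstration (with values of my choosing) of why the two
forms are nevertheless not interchangeable under IEEE rules:

  #include <stdio.h>

  /* a*c + b*c versus (a+b)*c: equal over the reals, but rounded
     differently, so the results can disagree in the last place.  */
  int main (void)
  {
    double a = 0.1, b = 0.2, c = 10.0;
    double u = a * c + b * c;   /* 3.0 with IEEE doubles          */
    double v = (a + b) * c;     /* 3.0000000000000004 with IEEE   */
    printf ("%.17g\n%.17g\n", u, v);
    return 0;
  }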

Greetings,
Volker Reichelt


* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-01  6:28 dewar
  0 siblings, 0 replies; 50+ messages in thread
From: dewar @ 2001-08-01  6:28 UTC (permalink / raw)
  To: dewar, wolfgang.bangerth; +Cc: gcc

<<An n% increase in speed makes n hours on a 4-day run. But then you're
right that it is probably not worth the effort to implement optimizations
that get you 1 or 2 per cent. If it's 10%, it starts getting interesting.
>>

warning: 10% is a VERY heavy burden; very few individual optimizations
have this much effect, so I would actually accept a much smaller
threshold. For example, if the a/b/c transformation made a 2% improvement
to established programs (quake or SPEC :-) then it is on the radar as
significant.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-01  6:27 dewar
  0 siblings, 0 replies; 50+ messages in thread
From: dewar @ 2001-08-01  6:27 UTC (permalink / raw)
  To: dewar, wolfgang.bangerth; +Cc: gcc

<<Well, the claim was that the result is not _degraded_, but just _altered_.
Solving a linear system of equations with IEEE is just one way to get at
an approximation of the true inverse. It would be interesting to see whether
the residual   || A * A^-1 - I ||  is smaller if you compute an approximate
inverse A^-1 using IEEE or -ffast-math, using the same algorithm. I'd claim
it does not make much of a difference.

>>

That's probably true in some cases, but most certainly NOT true in the case
of carefully constructed numerical codes, which depend on careful analysis
assuming the expected computational model.

Whether fast-math is acceptable in such cases depends on what it implements.
There is no point in being other than conservative with transformations in
the absence of any demonstration of any value of a transformation.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-01  6:12 dewar
  2001-08-01  6:25 ` Wolfgang Bangerth
  0 siblings, 1 reply; 50+ messages in thread
From: dewar @ 2001-08-01  6:12 UTC (permalink / raw)
  To: gcc, wolfgang.bangerth

<<So, concluding: if you have programs that run for several days, you'd be
happy if you could cut that time by some hours using optimizations that
>>

The idea that these optimizations can save time of this magnitude is without
evidentiary foundation. Sure, if an optimization can save a lot of time, we
look at it carefully. But if not, then it is not worth the time to implement
it, or the uncertainty and complexity that it introduces, even if the result
is identical. In the case where the result is also degraded, the burden is
greater.

* Re: What is acceptable for -ffast-math? A numerical viewpoint
@ 2001-08-01  4:08 Wolfgang Bangerth
  2001-08-01  7:55 ` Gabriel Dos Reis
  0 siblings, 1 reply; 50+ messages in thread
From: Wolfgang Bangerth @ 2001-08-01  4:08 UTC (permalink / raw)
  To: gcc

Due to repeated demand: here is the opinion of a numerical analyst. To
further define my position: there is some kind of schism in the numerics
society, the "traditionalists" doing Fortran with highly tuned codes and
relatively simple data structures, and the "modernists" using extremely
large C++ programs on complex data structures. I belong to the latter
group and can only speak for them. I know that the former group has
different views on these topics.

My opinion is that -ffast-math could well include slight deviations from
IEEE modes for denormals, rounding modes, associative redistribution, etc.
The reasons being:
- denormals: if you run into the risk of getting NaNs one way but not
  the other, you most certainly already have gone beyond the limits of
  your program, i.e. your linear solver diverged, you have chosen an
  inherently unstable formulation for your problem, etc. In practice, if
  you don't get your NaN on this operation, you will get it some operations
  later, and if not, then the result is spoilt anyway because of the
  instability. I cannot imagine someone writing code which runs on
  a well-defined side of the boundary between normal and denormalized
  values and still returns reasonable results. And even if such a program
  existed, then that person must know exactly what he is doing and will
  refrain from -ffast-math and use -mieee instead.
- rounding modes: if it makes a difference, then again you are already
  in an unstable regime. Then why bother: I know that I only solve an
  approximation of the true laws of nature (e.g. after discretizing a
  partial differential equation), why solve that to 16 digits of accuracy?
- association: Fortran codes differ significantly from C++ codes in
  some ways, not least that C++ codes are often split into many
  very small parts. In these cases, I have no control over how the
  operands associate if they come from different functions. For example,
  if I have (a very much simplified situation)
    double x, y;
    double f () { return x; };
    double g () { return y; };
    double h () { return x; };
    ...
    double z = f() * g() / h();
  then I'd be happy if I get z==y, although that might not be right
  from an IEEE viewpoint, or if f()+g()+h() would be turned into
  2*x+y. If you think that this example is unrealistic,
  think about using the C++ valarray<> class with its small functions
  combined with a case where some of the function's arguments are constant
  and known so that most of the operations in the body of the functions
  can be optimized away. If the compiler can help optimize such cases
  across the boundaries of single functions, then that would be great, but
  it will often require violating associative laws.

  Note that this is a case where C++ differs significantly from Fortran:
  there, you usually have large functions and optimizations take place
  within these.

So, concluding: if you have programs that run for several days, you'd be
happy if you could cut that time by some hours using optimizations that
- still do what common sense would dictate (IEEE is not pure common sense,
  but rather a definition), i.e. swapping operands would be allowed, but
  replacing NaNs by zero would not
- can be switched off in case I know I have to treat boundary cases
  exactly.
I myself would certainly appreciate a compiler turning a*c+b*c into
(a+b)*c; I'd definitely not be shocked by that.

Regards
  Wolfgang


-------------------------------------------------------------------------
Wolfgang Bangerth          email: wolfgang.bangerth@iwr.uni-heidelberg.de
                             www: http://gaia.iwr.uni-heidelberg.de/~wolf




end of thread

Thread overview: 50+ messages
2001-08-05  5:48 What is acceptable for -ffast-math? A numerical viewpoint dewar
2001-08-05 11:36 ` Laurent Guerby
  -- strict thread matches above, loose matches on Subject: below --
2001-08-06 18:51 dewar
2001-08-06  2:23 dewar
2001-08-06  9:50 ` Kevin Handy
2001-08-05 18:03 dewar
2001-08-05 20:16 ` Ross Smith
2001-08-05 15:37 dewar
2001-08-05 16:46 ` Ross Smith
2001-08-05 15:27 dewar
2001-08-05 15:26 dewar
2001-08-05  7:31 dewar
2001-08-05  5:55 dewar
2001-08-05  5:36 dewar
2001-08-05  6:43 ` Toon Moene
2001-08-05  5:35 dewar
2001-08-05  3:51 dewar
2001-08-05  5:38 ` Laurent Guerby
2001-08-05 13:43 ` Ross Smith
2001-08-04  6:17 dewar
     [not found] <url:http://gcc.gnu.org/ml/gcc/2001-08/msg00221.html>
     [not found] ` <url:http://gcc.gnu.org/ml/gcc/2001-08/msg00223.html>
     [not found]   ` <url:http://gcc.gnu.org/ml/gcc/2001-08/msg00225.html>
2001-08-04  6:16     ` Jonathan Thornburg
2001-08-04  5:37 dewar
2001-08-04  5:27 dewar
2001-08-04  3:08 dewar
2001-08-04  2:11 dewar
2001-08-04 17:24 ` Ross Smith
2001-08-03 18:31 dewar
2001-08-03 20:34 ` Ross Smith
2001-08-03 14:55 dewar
2001-08-03 16:57 ` Ross Smith
2001-08-03 17:00   ` Tim Hollebeek
     [not found] <url:http://gcc.gnu.org/ml/gcc/2001-08/msg00009.html>
     [not found] ` <url:http://gcc.gnu.org/ml/gcc/2001-07/msg02141.html>
2001-08-03  9:17   ` Jonathan Thornburg
2001-08-03 12:59     ` Toon Moene
2001-08-04  7:26     ` Laurent Guerby
     [not found]   ` <url:http://gcc.gnu.org/ml/gcc/2001-08/msg00176.html>
     [not found]     ` <url:http://gcc.gnu.org/ml/gcc/2001-08/msg00192.html>
2001-08-04  2:41       ` Jonathan Thornburg
2001-08-04  3:32       ` Jonathan Thornburg
2001-08-04  5:03         ` Joseph S. Myers
2001-08-01 10:27 Reichelt
2001-08-01 10:42 ` Gabriel Dos_Reis
2001-08-01  6:28 dewar
2001-08-01  6:27 dewar
2001-08-01  6:12 dewar
2001-08-01  6:25 ` Wolfgang Bangerth
2001-08-01  7:01   ` Gabriel Dos Reis
2001-08-01  7:20     ` Wolfgang Bangerth
2001-08-01  7:45       ` Gabriel Dos Reis
2001-08-01  7:55         ` Wolfgang Bangerth
2001-08-01  8:03           ` Dima Volodin
2001-08-01  4:08 Wolfgang Bangerth
2001-08-01  7:55 ` Gabriel Dos Reis
