public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: GCC beaten by ICC in stupid trig test!
@ 2004-03-15 16:06 Robert Dewar
  2004-03-16  8:45 ` Per Abrahamsen
  2004-03-16 12:14 ` Scott Robert Ladd
  0 siblings, 2 replies; 85+ messages in thread
From: Robert Dewar @ 2004-03-15 16:06 UTC (permalink / raw)
  To: coyote, pinskia; +Cc: gcc

> The point here is that if you know it is 1.0 then just return 1.0
> instead of trying to play tricks with trig functions.

Optimizations like this are inappropriate in my view. We are dealing
with floating-point, not real, arithmetic.

Indeed if any programmer writes sin**2+cos**2, one can assume that the
intention is precisely to get the computed value that may not be 1.0. If
the programmer wants 1.0, they can write it explicitly!
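
For concreteness, here is a minimal C sketch of the distinction; the
input value is arbitrary, and whether the printed difference is exactly
zero depends on the math library and target:

    #include <math.h>
    #include <stdio.h>

    int main (void)
    {
      double x = 0.5;                  /* arbitrary test value */
      double s = sin (x), c = cos (x);
      /* Exactly 1 in real arithmetic; in floating-point the computed
         value may differ from 1.0 by a few units in the last place.  */
      printf ("%.17g\n", s * s + c * c - 1.0);
      return 0;
    }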

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15 16:06 GCC beaten by ICC in stupid trig test! Robert Dewar
@ 2004-03-16  8:45 ` Per Abrahamsen
  2004-03-17  0:09   ` Robert Dewar
  2004-03-16 12:14 ` Scott Robert Ladd
  1 sibling, 1 reply; 85+ messages in thread
From: Per Abrahamsen @ 2004-03-16  8:45 UTC (permalink / raw)
  To: gcc

dewar@gnat.com (Robert Dewar) writes:

> Indeed if any programmer writes sin**2+cos**2, one can assume that the
> intention is precisely to get the computed value that may not be 1.0. If
> the programmer wants 1.0, they can write it explicitly!

Not really, the sin**2+cos**2 may often be more readable, and may be
hidden behind some inlined functions.

Please do not assume that "real arithmetic" optimizations have already
been performed by the programmer by hand.  And please do not
discourage people from implementing them.  If you must, insist they
are only enabled by a flag for use by those of us who want the
compiler to do such optimizations.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15 16:06 GCC beaten by ICC in stupid trig test! Robert Dewar
  2004-03-16  8:45 ` Per Abrahamsen
@ 2004-03-16 12:14 ` Scott Robert Ladd
  2004-03-17  0:19   ` Robert Dewar
  1 sibling, 1 reply; 85+ messages in thread
From: Scott Robert Ladd @ 2004-03-16 12:14 UTC (permalink / raw)
  To: Robert Dewar; +Cc: pinskia, gcc

Robert Dewar wrote:
> Indeed if any programmer writes sin**2+cos**2, one can assume that the
> intention is precisely to get the computed value that may not be 1.0. If
> the programmer wants 1.0, they can write it explicitly!

Mathematical identities provide good tests of numerical accuracy; I have 
found instances where certain compilers do *not* compute "sin**2+cos**2" 
as equal to 1, usually due to a poor implementation of sin and cos.

The actual line of code in question was part of a testing spike, and it 
had no effect on the actual subject of the original post, and was not 
intended to be part of a production program.

-- 
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-16  8:45 ` Per Abrahamsen
@ 2004-03-17  0:09   ` Robert Dewar
  2004-03-17  0:36     ` Scott Robert Ladd
                       ` (2 more replies)
  0 siblings, 3 replies; 85+ messages in thread
From: Robert Dewar @ 2004-03-17  0:09 UTC (permalink / raw)
  To: Per Abrahamsen; +Cc: gcc

Per Abrahamsen wrote:

> Not really, the sin**2+cos**2 may often be more readable

Claiming that sin**2+cos**2 is more readable than 1.0 is really
a very marginal argument, especially since the former raises
the issue of whether you mean 1.0, or the actual value computed
by this formula. If you want 1.0, say so.

> and may be
> hidden behind some inlined functions.

All the more reason not to perform inappropriate optimizations.

> Please do not assume that "real arithmetic" optimizations have already
> been performed by the programmer by hand.

The other viewpoint here is "please don't assume the programmer
is incompetent and does not know what he is doing. Floating-point
operations are precisely defined in IEEE and if I am a serious
fpt programmer, I write the computations I want, and I do not
want the compiler substituting arbitrary non-equivalent
expressions."

> And please do not
> discourage people from implementing them.  If you must, insist they
> are only enabled by a flag for use by those of us who want the
> compiler to do such optimizations.

It's OK by me (though I consider it dubious and useless) to have
such optimizations around. It is absolutely essential that it be
possible to disable them.

I suppose that for people writing casual fpt code without error
analysis assuming that it gives some acceptable approximation
of real arithmetic, such optimizations are relatively harmless.
After all if you are happy with wrong results, then I suppose
getting different wrong results faster is not unacceptable.
(wrong results here = results whose accuracy is unknown).

But as above, for serious fpt programming, a compiler that
performs non-meaning preserving transformations (e.g. assuming
that fpt addition is commutative, or even worse associative),
is a menace.


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-16 12:14 ` Scott Robert Ladd
@ 2004-03-17  0:19   ` Robert Dewar
  0 siblings, 0 replies; 85+ messages in thread
From: Robert Dewar @ 2004-03-17  0:19 UTC (permalink / raw)
  To: Scott Robert Ladd; +Cc: pinskia, gcc

Scott Robert Ladd wrote:


> Mathematical identities provide good tests of numerical accuracy; I have 
> found instances where certain compilers do *not* compute "sin**2+cos**2" 
> as equal to 1, usually due to a poor implementation of sin and cos.

You definitely cannot assume that sin**2+cos**2 = exactly 1.0 throughout
the defined domain of these functions. Expecting this identity is 
similar to thinking that x**y can be computed using log and exp!
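
To illustrate the analogy with a hedged C sketch (the values below are
arbitrary, and the size of the discrepancy depends on the libm, but the
naive log/exp form typically loses a few trailing digits for large
exponents):

    #include <math.h>
    #include <stdio.h>

    int main (void)
    {
      double x = 3.0, y = 40.0;               /* arbitrary test values  */
      printf ("%.17g\n", pow (x, y));         /* library power function */
      printf ("%.17g\n", exp (y * log (x)));  /* naive log/exp version  */
      return 0;
    }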

> The actual line of code in question was part of a testing spike, and it 
> had no effect on the actual subject of the original post, and was not 
> intended to be part of a production program.

Sure, in fact I suspect the only use of sin**2+cos**2 in real programs
has to do with testing/exploiting the difference between the value of
this computed expression and 1.0. Yes, I know you can posit macros etc,
but I would be surprised if anyone comes up with a real program where
this expression appears in any other context.


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-17  0:09   ` Robert Dewar
@ 2004-03-17  0:36     ` Scott Robert Ladd
  2004-03-17  5:53     ` Gabriel Dos Reis
  2004-03-17 14:51     ` Per Abrahamsen
  2 siblings, 0 replies; 85+ messages in thread
From: Scott Robert Ladd @ 2004-03-17  0:36 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Per Abrahamsen, gcc

Robert Dewar wrote:
> The other view point here is "please don't assume the programmer
> is incompetent and does not know what he is doing. Floating-point
> operations are precisely defined in IEEE and if I am a serious
> fpt programmer, I write the computations I want, and I do not
> want the compiler substituting arbitrary non-equivalent
> expressions.
> 
> It's OK by me (though I consider it dubious and useless) to have
> such optimizations around. It is absolutely essential that it be
> possible to disable them.
> 
> I suppose that for people writing casual fpt code without error
> analysis assuming that it gives some acceptable approximation
> of real arithmetic, such optimizations are relatively harmless.
> After all if you are happy with wrong results, then I suppose
> getting different wrong results faster is not unacceptable.
> (wrong results here = results whose accuracy is unknown).
> 
> But as above, for serious fpt programming, a compiler that
> performs non-meaning preserving transformations (e.g. assuming
> that fpt addition is commutative, or even worse associative),
> is a menace.

I strongly concur with everything Robert said here. People have a wide
range of requirements for floating point code; what may work well for a
game is not going to work well elsewhere. Sometimes you need high
accuracy; sometimes you need reproducible results on many platforms;
sometimes you need fast numbers that have low precision requirements. A
quality compiler must make clear how its options affect these parameters,
and allow precise control of its behavior.

I'm working on an advanced accuracy benchmark for C, C++, and Fortran 
95; if anyone has suggestions, please feel free to e-mail me privately.

-- 
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-17  0:09   ` Robert Dewar
  2004-03-17  0:36     ` Scott Robert Ladd
@ 2004-03-17  5:53     ` Gabriel Dos Reis
  2004-03-17  7:21       ` Robert Dewar
  2004-03-23 19:38       ` Joe Buck
  2004-03-17 14:51     ` Per Abrahamsen
  2 siblings, 2 replies; 85+ messages in thread
From: Gabriel Dos Reis @ 2004-03-17  5:53 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Per Abrahamsen, gcc

Robert Dewar <dewar@gnat.com> writes:

| > Please do not assume that "real arithmetic" optimizations have already
| > been performed by the programmer by hand.
| 
| The other view point here is "please don't assume the programmer
| is incompetent and does not know what he is doing. Floating-point
| operations are precisely defined in IEEE and if I am a serious
| fpt programmer, I write the computations I want, and I do not
| want the compiler substituting arbitrary non-equivalent
| expressions.

Agreed in principle, but I think the whole point is that if you're a
competent floating point programmer, you would not be using
-ffast-math :-)

-- Gaby

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-17  5:53     ` Gabriel Dos Reis
@ 2004-03-17  7:21       ` Robert Dewar
  2004-03-17  9:10         ` Gabriel Dos Reis
  2004-03-23 19:38       ` Joe Buck
  1 sibling, 1 reply; 85+ messages in thread
From: Robert Dewar @ 2004-03-17  7:21 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Per Abrahamsen, gcc

> Agreed in principle, but I think the whole point is that if you're a
> competent floating point programmer, you would not be using
> -ffast-math :-)

That's fair enough, but then we should keep this criterion in mind
(no optimization that would be useful to competent floating-point
programmers should be included in -ffast-math :-)
> 
> -- Gaby

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-17  7:21       ` Robert Dewar
@ 2004-03-17  9:10         ` Gabriel Dos Reis
  2004-03-21 16:55           ` Robert Dewar
  0 siblings, 1 reply; 85+ messages in thread
From: Gabriel Dos Reis @ 2004-03-17  9:10 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Per Abrahamsen, gcc

Robert Dewar <dewar@gnat.com> writes:

| > Agreed in principle, but I think the whole point is that if you're a
| > competent floating point programmer, you would not be using
| > -ffast-math :-)
| 
| That's fair enough, but then we should keep this criterion in mind
| (no optimization that would be useful to competent floating-point
| programmers should be included in -ffast-math :-)

I had always thought that was the case :-)
Now, if you're saying that is not what -ffast-math is supposed to
mean, then I'm nervous.

-- Gaby

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-17  0:09   ` Robert Dewar
  2004-03-17  0:36     ` Scott Robert Ladd
  2004-03-17  5:53     ` Gabriel Dos Reis
@ 2004-03-17 14:51     ` Per Abrahamsen
  2004-03-17 15:18       ` Gabriel Dos Reis
  2 siblings, 1 reply; 85+ messages in thread
From: Per Abrahamsen @ 2004-03-17 14:51 UTC (permalink / raw)
  To: Robert Dewar; +Cc: gcc

Robert Dewar <dewar@gnat.com> writes:

> Per Abrahamsen wrote:
>
>> Not really, the sin**2+cos**2 may often be more readable
>
> Claiming that sin**2+cos**2 is more readable than 1.0 is really
> a very marginal argument,

It can easily be, in some result table with other sin/cos based
functions.  The alternative would be to add a comment stating that we
really meant "sin**2+cos**2", but hand optimized it into "1.0" because
the compiler for weird ideological reasons refused to do the job for
us.

>  especially since the former raises
> the issue of whether you mean 1.0, or the actual value computed
> by this formula. If you want 1.0, say so.

I, and I suspect 99% of all programmers, would mean >> the real value
produced by the mathematical formula "sin**2+cos**2" <<.  I know that
you, and other true numerical programmers, probably feel that we ought
to serve burgers instead of programming.  But wishful thinking doesn't
make it so.

> Floating-point operations are precisely defined in IEEE and if I am
> a serious fpt programmer, I write the computations I want, and I do
> not want the compiler substituting arbitrary non-equivalent
> expressions.

Fine, I just ask for a compiler flag for the *other* kind of
programmers, and for you not to discourage people from adding
optimization there.

> After all if you are happy with wrong results, then I suppose
> getting different wrong results faster is not unacceptable.
> (wrong results here = results whose accuracy is unknown).

Well, in my case, the accuracy of the models used is "within an order
of magnitude, or so we hope" and the accuracy of the input data is
"within an order of magnitude, or so we hope".  Floating-point
artifacts are not the primary concern when it comes to the
interpretation of the output data.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-17 14:51     ` Per Abrahamsen
@ 2004-03-17 15:18       ` Gabriel Dos Reis
  2004-03-17 16:05         ` Per Abrahamsen
  0 siblings, 1 reply; 85+ messages in thread
From: Gabriel Dos Reis @ 2004-03-17 15:18 UTC (permalink / raw)
  To: Per Abrahamsen; +Cc: Robert Dewar, gcc

Per Abrahamsen <abraham@dina.kvl.dk> writes:

| > Floating-point operations are precisely defined in IEEE and if I am
| > a serious fpt programmer, I write the computations I want, and I do
| > not want the compiler substituting arbitrary non-equivalent
| > expressions.
| 
| Fine, I just ask for a compiler flag for the *other* kind of
| programmers, and for you not to discourage people from adding
| optimization there.

Calling a transformation an "optimization" before it is proven an
optimization is a serious semantic hole that undermines many
reasonings in that department.

-- Gaby

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-17 15:18       ` Gabriel Dos Reis
@ 2004-03-17 16:05         ` Per Abrahamsen
  0 siblings, 0 replies; 85+ messages in thread
From: Per Abrahamsen @ 2004-03-17 16:05 UTC (permalink / raw)
  To: gcc

Gabriel Dos Reis <gdr@integrable-solutions.net> writes:

> Calling a transformation an "optimization" before it is proven an
> optimization is a serious semantic hole that undermines many
> reasonings in that department.

Likely true, as I don't understand a word of it.  If my wording was
improper, pretend it was phrased in the terminology of your
preference.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-17  9:10         ` Gabriel Dos Reis
@ 2004-03-21 16:55           ` Robert Dewar
  0 siblings, 0 replies; 85+ messages in thread
From: Robert Dewar @ 2004-03-21 16:55 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Per Abrahamsen, gcc

Gabriel Dos Reis wrote:

> | That's fair enough, but then we should keep this criterion in mind
> | (no optimization that would be useful to competent floating-point
> | programmers should be included in -ffast-math :-)
> 
> I had always thought that was the case :-)
> Now, if you're saying that is not what -ffast-math is supposed to
> mean, then I'm nervous.

Everyone I think agrees that this is what -ffast-math means. What
is a potential problem is that not everyone has a good understanding
of fpt issues, so the decision of what goes into -ffast-math needs
constant vigilance :-)

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-17  5:53     ` Gabriel Dos Reis
  2004-03-17  7:21       ` Robert Dewar
@ 2004-03-23 19:38       ` Joe Buck
  2004-03-23 19:58         ` Gabriel Dos Reis
  1 sibling, 1 reply; 85+ messages in thread
From: Joe Buck @ 2004-03-23 19:38 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Robert Dewar, Per Abrahamsen, gcc

On Wed, Mar 17, 2004 at 05:28:40AM +0100, Gabriel Dos Reis wrote:
> Agreed in principle, but I think the whole point is that if you're a
> competent floating point programmer, you would not be using
> -ffast-math :-)

Not necessarily; a *really* competent floating point programmer can make
the assessment that the precision required for a particular floating point
task is more than adequately provided by -ffast-math.  Lossy audio coding
is one typical case; there are many others.


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-23 19:38       ` Joe Buck
@ 2004-03-23 19:58         ` Gabriel Dos Reis
  2004-03-23 20:49           ` Laurent GUERBY
  0 siblings, 1 reply; 85+ messages in thread
From: Gabriel Dos Reis @ 2004-03-23 19:58 UTC (permalink / raw)
  To: Joe Buck; +Cc: Robert Dewar, Per Abrahamsen, gcc

Joe Buck <Joe.Buck@synopsys.COM> writes:

| On Wed, Mar 17, 2004 at 05:28:40AM +0100, Gabriel Dos Reis wrote:
| > Agreed in principle, but I think the whole point is that if you're a
| > competent floating point programmer, you would not be using
| > -ffast-math :-)
| 
| Not necessarily; a *really* competent floating point programmer can make
| the assessment that the precision required for a particular floating point
| task is more than adequately provided by -ffast-math.  Lossy audio coding
| is one typical case; there are many others.

Well, lossy audio coding is not the kind of computations I would say
involves serious floating point capabilities :-)
But, I can see what you meant.

-- Gaby

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-23 19:58         ` Gabriel Dos Reis
@ 2004-03-23 20:49           ` Laurent GUERBY
  2004-03-24  8:17             ` Toon Moene
  0 siblings, 1 reply; 85+ messages in thread
From: Laurent GUERBY @ 2004-03-23 20:49 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Joe Buck, Robert Dewar, Per Abrahamsen, gcc

On Tue, 2004-03-23 at 19:46, Gabriel Dos Reis wrote:
> Joe Buck <Joe.Buck@synopsys.COM> writes:
> 
> | On Wed, Mar 17, 2004 at 05:28:40AM +0100, Gabriel Dos Reis wrote:
> | > Agreed in principle, but I think the whole point is that if you're a
> | > competent floating point programmer, you would not be using
> | > -ffast-math :-)
> | 
> | Not necessarily; a *really* competent floating point programmer can make
> | the assessment that the precision required for a particular floating point
> | task is more than adequately provided by -ffast-math.  Lossy audio coding
> | is one typical case; there are many others.
> 
> Well, lossy audio coding is not the kind of computations I would say
> involves serious floating point capabilities :-)
> But, I can see what you meant.

All Monte Carlo based algorithms are also likely to be in
the -ffast-math category. You're happy with 3-4 digits
of "accuracy" thanks to 1/sqrt(n) convergence :).

Also, -ffast-math might be useful if you're not an expert in FP:
run your tests with and without it, and if something changes a lot
you're probably in a tricky area FP-wise. Of course,
if nothing changes you've not proven that much...

Laurent (who keeps 500 processors busy running Monte Carlo simulations
7/7 :)


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-23 20:49           ` Laurent GUERBY
@ 2004-03-24  8:17             ` Toon Moene
  2004-03-24 13:50               ` Robert Dewar
  0 siblings, 1 reply; 85+ messages in thread
From: Toon Moene @ 2004-03-24  8:17 UTC (permalink / raw)
  To: Laurent GUERBY
  Cc: Gabriel Dos Reis, Joe Buck, Robert Dewar, Per Abrahamsen, gcc

Laurent GUERBY wrote:

> All Monte Carlo based algorithms are also likely to be in
> the -ffast-math category. You're happy with 3-4 digits
> of "accuracy" thanks to 1/sqrt(n) convergence :).

Another example is all computation where reducing the continuous physics 
to discrete mathematics already introduced truncation errors larger than 
any floating point rounding error at 24 bit mantissa precision.

[ It's only five centuries ago that people who were able to predict
   the weather were burnt at the stake as witches - do not be so foolish
   as to repeat history here ]

-- 
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
GNU Fortran 95: http://gcc.gnu.org/fortran/ (under construction)

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24  8:17             ` Toon Moene
@ 2004-03-24 13:50               ` Robert Dewar
  2004-03-24 18:25                 ` Paul Koning
  2004-03-24 18:56                 ` Joe Buck
  0 siblings, 2 replies; 85+ messages in thread
From: Robert Dewar @ 2004-03-24 13:50 UTC (permalink / raw)
  To: Toon Moene
  Cc: Laurent GUERBY, Gabriel Dos Reis, Joe Buck, Per Abrahamsen, gcc

Toon Moene wrote:

> Laurent GUERBY wrote:
> 
>> All Monte Carlo based algorithms are also likely to be in
>> the -ffast-math category. You're happy with 3-4 digits
>> of "accuracy" thanks to 1/sqrt(n) convergence :).

> Another example is all computation where reducing the continuous physics 
> to discrete mathematics already introduced truncation errors larger than 
> any floating point rounding error at 24 bit mantissa precision.

Well I don't know exactly what stuff is in -ffast-math, but I suspect it 
is a mistake to just think in terms of rounding error vs the mantissa 
length. For example, if -ffast-math is so sloppy as to consider
that (a+b)+c can be replaced by a+(b+c), then all bets are off.
It is easy to construct cases where the former has a value of 1.0
and the latter has a value of 0.0.
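
A concrete C sketch of such a case, with arbitrarily chosen values; it
assumes intermediates are rounded to double (as with SSE2), so on x87
with extended precision both lines may print 1:

    #include <stdio.h>

    int main (void)
    {
      double a = 1.0, b = 1.0e16, c = -1.0e16;
      /* 1.0 is smaller than one ulp of 1.0e16, so it is lost in the
         first grouping.  */
      printf ("%g\n", (a + b) + c);   /* prints 0 under strict double rounding */
      printf ("%g\n", a + (b + c));   /* prints 1 */
      return 0;
    }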



^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 13:50               ` Robert Dewar
@ 2004-03-24 18:25                 ` Paul Koning
  2004-03-24 18:51                   ` Robert Dewar
  2004-03-24 18:56                 ` Joe Buck
  1 sibling, 1 reply; 85+ messages in thread
From: Paul Koning @ 2004-03-24 18:25 UTC (permalink / raw)
  To: dewar; +Cc: gcc

>>>>> "Robert" == Robert Dewar <dewar@gnat.com> writes:

 Robert> Toon Moene wrote:
 >> Laurent GUERBY wrote:
 >> 
 >>> All Monte Carlo based algorithms are also likely to be in the
 >>> -ffast-math category. You're happy with 3-4 digits of "accuracy"
 >>> thanks to 1/sqrt(n) convergence :).

 >> Another example is all computation where reducing the continuous
 >> physics to discrete mathematics already introduced truncation
 >> errors larger than any floating point rounding error at 24 bit
 >> mantissa precision.

 Robert> Well I don't know exactly what stuff is in -ffast-math, but I
 Robert> suspect it is a mistake to just think in terms of rounding
 Robert> error vs the mantissa length. For example, if -ffast-math is
 Robert> so sloppy as to consider that (a+b)+c can be replaced by
 Robert> a+(b+c), then all bets are off.  It is easy to construct
 Robert> cases where the former has a value of 1.0 and the latter has
 Robert> a value of 0.0.

It's obvious that you can construct pathological cases where you end
up with zero bits of accuracy.  That doesn't justify the conclusion
that -ffast-math should avoid transformations where this may happen.
Otherwise you might as well get rid of -ffast-math -- which would be a
major mistake.

Incidentally, it is equally trivial to construct a pathological case
where the statement as written has zero bits of accuracy and the
transformation is much more accurate.

       paul


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 18:25                 ` Paul Koning
@ 2004-03-24 18:51                   ` Robert Dewar
  2004-03-25 18:18                     ` Per Abrahamsen
  0 siblings, 1 reply; 85+ messages in thread
From: Robert Dewar @ 2004-03-24 18:51 UTC (permalink / raw)
  To: Paul Koning; +Cc: gcc

Paul Koning wrote:

> It's obvious that you can construct pathological cases where you end
> up with zero bits of accuracy.  That doesn't justify the conclusion
> that -ffast-math should avoid transformations where this may happen.
> Otherwise you might as well get rid of -ffast-math -- which would be a
> major mistake.

What's not so obvious (clearly) is that these are not pathological
cases. On the contrary, it is all too easy to run into them. It is
most misleading to present a view to casual fpt programmers that
-ffast-math is unlikely to run into anomalies except in pathological
cases, and that they can reasonably expect just minor impact on
precision.

Good advice given here, which inspires a little more confidence, is to
run a range of tests with and without the option to see what effect
it has.

> Incidentally, it is equally trivial to construct a pathological case
> where the statement as written has zero bits of accuracy and the
> transformation is much more accurate.

And that gets to the heart of the problem. When I write:

    (a + b) + c;

I get *exactly* the precise fpt result that I am asking for. Your
statement that a + (b + c) might be more accurate is only true if
you persist in regarding such expressions as inaccurate real
arithmetic, but competent floating-point programmers don't think
of things this way; they write the sequence of operations
that they want to compute. So replacing (a+b)+c with a+(b+c)
NEVER improves the accuracy, it merely computes the wrong result!

Now going back to the list of issues, there are big differences
between various optimizations. For example, increased accuracy
and range are not necessarily a problem, although it is very
nice, as in Ada, to have individual control.

Ada specifically allows extra precision. In some cases extra
precision can indeed be a menace (one of the problems with
hex fpt arithmetic is precisely that the precision is variable).
But for many algorithms it does indeed not change the error
bounds. In Ada, you can write

      Long_Float'Machine (A + B);

to force the result A+B to be exactly represented in standard
Long_Float format (i.e. remove the extra precision). It would
be nice to have a way of doing this in C.
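
For what it is worth, the nearest common C idiom is to force the value
through a stored double, for example via a volatile temporary. This is
a sketch of common practice rather than a guarantee of the standard
(the helper name is invented for illustration), and a flag such as
-ffloat-store addresses the same problem globally:

    /* Round a + b to a 64-bit double, discarding any extended precision
       the target may have kept in registers (e.g. x87).  The volatile
       store forces the value through a double-sized memory slot.  */
    static double machine_sum (double a, double b)
    {
      volatile double t = a + b;
      return t;
    }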



> 
>        paul
> 

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 13:50               ` Robert Dewar
  2004-03-24 18:25                 ` Paul Koning
@ 2004-03-24 18:56                 ` Joe Buck
  2004-03-24 19:10                   ` Robert Dewar
  1 sibling, 1 reply; 85+ messages in thread
From: Joe Buck @ 2004-03-24 18:56 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Toon Moene, Laurent GUERBY, Gabriel Dos Reis, Per Abrahamsen, gcc

On Tue, Mar 23, 2004 at 10:32:13PM -0500, Robert Dewar wrote:
> Well I don't know exactly what stuff is in -ffast-math, but I suspect it 
> is a mistake to just think in terms of rounding error vs the mantissa 
> length. 

Most of the optimizations have to do with ignoring the need to distinguish
+0 and -0, or handle NaNs.

> For example, if -ffast-math is so sloppy as to consider
> that (a+b)+c can be replaced by a+(b+c), then all bets are off.

That's why -ffast-math doesn't do that; such a transformation would be
massively brain-damaged.  Please don't post speculation as to what
-ffast-math does, when that speculation will scare people away from using
what is sometimes the right tool for the job.

Reading the manual can be quite educational.  -ffast-math is actually the
combination of five separate shortcuts, which can be enabled or disabled
individually: -fno-math-errno, -funsafe-math-optimizations,
-fno-trapping-math, -ffinite-math-only, and -fno-signaling-nans.
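
As a usage sketch, using only the flags just listed plus -O2 (check the
manual for the exact semantics of each flag on your GCC version):

    # everything at once
    gcc -O2 -ffast-math prog.c -lm

    # a narrower selection: skip errno updates and allow the unsafe
    # algebraic transformations, but keep NaN/Inf and trapping semantics
    gcc -O2 -fno-math-errno -funsafe-math-optimizations prog.c -lm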

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 18:56                 ` Joe Buck
@ 2004-03-24 19:10                   ` Robert Dewar
  2004-03-24 19:14                     ` Richard Guenther
  0 siblings, 1 reply; 85+ messages in thread
From: Robert Dewar @ 2004-03-24 19:10 UTC (permalink / raw)
  To: Joe Buck
  Cc: Toon Moene, Laurent GUERBY, Gabriel Dos Reis, Per Abrahamsen, gcc

Joe Buck wrote:

> On Tue, Mar 23, 2004 at 10:32:13PM -0500, Robert Dewar wrote:
> 
>>Well I don't know exactly what stuff is in -ffast-math, but I suspect it 
>>is a mistake to just think in terms of rounding error vs the mantissa 
>>length. 
>  
> Most of the optimizations have to do with ignoring the need to distinguish
> +0 and -0, or handle NaNs.
> 
>>For example, if -ffast-math is so sloppy as to consider
>>that (a+b)+c can be replaced by a+(b+c), then all bets are off.
> 
> That's why -ffast-math doesn't do that; such a transformation would be
> massively brain-damaged.

It would be a useful achievement in this thread if everyone understood
the truth of the above sentence. Clearly that is not the case from other
contributions to the thread :-)

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 19:10                   ` Robert Dewar
@ 2004-03-24 19:14                     ` Richard Guenther
  2004-03-24 19:39                       ` Paul Brook
  0 siblings, 1 reply; 85+ messages in thread
From: Richard Guenther @ 2004-03-24 19:14 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Joe Buck, Toon Moene, Laurent GUERBY, Gabriel Dos Reis,
	Per Abrahamsen, gcc

Robert Dewar wrote:
> Joe Buck wrote:
> 
>> On Tue, Mar 23, 2004 at 10:32:13PM -0500, Robert Dewar wrote:
>>
>>> Well I don't know exactly what stuff is in -ffast-math, but I suspect 
>>> it is a mistake to just think in terms of rounding error vs the 
>>> mantissa length. 
>>
>>  
>> Most of the optimizations have to do with ignoring the need to 
>> distinguish
>> +0 and -0, or handle NaNs.
>>
>>> For example, if -ffast-math is so sloppy as to consider
>>> that (a+b)+c can be replaced by a+(b+c), then all bets are off.
>>
>>
>> That's why -ffast-math doesn't do that; such a transformation would be
>> massively brain-damaged.
> 
> 
> It would be a useful achievement in this thread if everyone understood
> the truth of the above sentence. Clearly that is not the case from other
> contributions to the thread :-)

Well, first, this (transforming (a+b)+c to a+(b+c)) would be a question
of whether the language standard permits it.  Beyond that, I personally
would like to have a way to override associativity, and I cannot see
a clearer way than to write (a+b)+c instead of a+b+c.  But that may be a
language standard question again.  If (a+b)+c doesn't do it, I cannot
see another way of really forcing evaluation order.

Another question would be whether a+b+c is always (a+b)+c, or whether it
is a+(b+c), or whether this is (or should be) unspecified.  Maybe we'd need
-fpreserve-left-to-right-evaluation-order?  That way 1.0+a+1.0 should
not be transformed to 2.0+a or a+2.0.

Richard.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 19:14                     ` Richard Guenther
@ 2004-03-24 19:39                       ` Paul Brook
  2004-03-24 19:45                         ` Dave Korn
  0 siblings, 1 reply; 85+ messages in thread
From: Paul Brook @ 2004-03-24 19:39 UTC (permalink / raw)
  To: gcc
  Cc: Richard Guenther, Robert Dewar, Joe Buck, Toon Moene,
	Laurent GUERBY, Gabriel Dos Reis, Per Abrahamsen

> Well, first, this (transforming (a+b)+c to a+(b+c)) would be a question
> of whether the language standard permits it.  Beyond that, I personally
> would like to have a way to override associativity, and I cannot see
> a clearer way than to write (a+b)+c instead of a+b+c.  But that may be a
> language standard question again.  If (a+b)+c doesn't do it, I cannot
> see another way of really forcing evaluation order.

The Fortran standard specifies that you can reorder a+b+c, but not (a+b)+c.
Basically you must preserve parentheses; anything else is fair game. The exact
wording is "any mathematically equivalent expression". I don't know what the
C standard specifies on this issue.

We don't currently have a way of representing this. We're either overly 
conservative, or violate the standard.

Paul

^ permalink raw reply	[flat|nested] 85+ messages in thread

* RE: GCC beaten by ICC in stupid trig test!
  2004-03-24 19:39                       ` Paul Brook
@ 2004-03-24 19:45                         ` Dave Korn
  2004-03-24 20:57                           ` Paul Brook
  2004-03-25  6:14                           ` Robert Dewar
  0 siblings, 2 replies; 85+ messages in thread
From: Dave Korn @ 2004-03-24 19:45 UTC (permalink / raw)
  To: gcc

 

> -----Original Message-----
> From: gcc-owner On Behalf Of Paul Brook
> Sent: 24 March 2004 18:53

> > Well, first, this (transforming (a+b)+c to a+(b+c)) would be a question
> > of whether the language standard permits it.  Beyond that, I personally
> > would like to have a way to override associativity, and I cannot see
> > a clearer way than to write (a+b)+c instead of a+b+c.  But that may be a
> > language standard question again.  If (a+b)+c doesn't do it, I cannot
> > see another way of really forcing evaluation order.
> 
> The Fortran standard specifies that you can reorder a+b+c, but not (a+b)+c.
> Basically you must preserve parentheses; anything else is fair game. The
> exact wording is "any mathematically equivalent expression". I don't know
> what the C standard specifies on this issue.
> 
> We don't currently have a way of representing this. We're either overly
> conservative, or violate the standard.
> 
> Paul

  IIUIC the fact that the + operator is specified in the standard as binding
left-to-right implies that "a + b + c" with no brackets *has* to be
interpreted as "(a + b) + c".

  Either that, or I'm being dumb.


    cheers, 
      DaveK
-- 
Can't think of a witty .sigline today....

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 19:45                         ` Dave Korn
@ 2004-03-24 20:57                           ` Paul Brook
  2004-03-25  6:14                           ` Robert Dewar
  1 sibling, 0 replies; 85+ messages in thread
From: Paul Brook @ 2004-03-24 20:57 UTC (permalink / raw)
  To: gcc

>   IIUIC the fact that the + operator is specified in the standard as
> binding left-to-right imply that "a + b + c" with no brackets *has* to be
> interpreted as "(a + b) + c".

Specific examples of allowable transformations (taken directly from the f95 
standard):

original written form-> allowable form
x+y -> y+x
-x+y -> y-x
x+y+z -> (x+y)+z
a+b-c -> a+(b-c)
x+y+z -> x+(y+z)
x*y+x*z -> x*(y+z)
a/b/c -> a/(b*c)
a/5.0 -> 0.2*a

However, the following transformation is not allowed:
a+(b-c)->(a+b)-c

Paul

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 19:45                         ` Dave Korn
  2004-03-24 20:57                           ` Paul Brook
@ 2004-03-25  6:14                           ` Robert Dewar
  2004-03-25 18:32                             ` Scott Robert Ladd
  1 sibling, 1 reply; 85+ messages in thread
From: Robert Dewar @ 2004-03-25  6:14 UTC (permalink / raw)
  To: Dave Korn; +Cc: gcc

Dave Korn wrote:

>   IIUIC the fact that the + operator is specified in the standard as binding
> left-to-right imply that "a + b + c" with no brackets *has* to be
> interpreted as "(a + b) + c".
> 
>   Either that, or I'm being dumb.

In Fortran, this is an incorrect interpretation; the standard definitely
differentiates between (a+b)+c and a+b+c. So a Fortran compiler is
allowed to reassociate the second expression (but it is a bad idea, and
it is desirable to have a mode in which this freedom is not taken
advantage of).


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 18:51                   ` Robert Dewar
@ 2004-03-25 18:18                     ` Per Abrahamsen
  2004-03-27  1:26                       ` Robert Dewar
  0 siblings, 1 reply; 85+ messages in thread
From: Per Abrahamsen @ 2004-03-25 18:18 UTC (permalink / raw)
  To: gcc

Robert Dewar <dewar@gnat.com> writes:

> So replacing (a+b)+c with a+(b+c) NEVER improves the accuracy, it
> merely computes the wrong result!

You are speaking of different things here.  You speak of "wrong"
compared to the standards.  Paul Koning speaks of improved accuracy
compared to real arithmetic.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25  6:14                           ` Robert Dewar
@ 2004-03-25 18:32                             ` Scott Robert Ladd
  2004-03-27  1:28                               ` Robert Dewar
  0 siblings, 1 reply; 85+ messages in thread
From: Scott Robert Ladd @ 2004-03-25 18:32 UTC (permalink / raw)
  To: Robert Dewar, gcc mailing list

Robert Dewar wrote:
> In Fortran, this is an incorrect interpretation; the standard definitely
> differentiates between (a+b)+c and a+b+c. So a Fortran compiler is
> allowed to reassociate the second expression (but it is a bad idea, and
> it is desirable to have a mode in which this freedom is not taken
> advantage of).

 From the latest Committee Draft of Fortran 2003, section 7.8.1.3:

  - - - - - - - - - - - - - -

The rules given in 7.2.1 specify the interpretation of a numeric
intrinsic operation. Once the interpretation has been established in
accordance with those rules, the processor may evaluate any
mathematically equivalent expression, provided that the integrity of
parentheses is not violated.

  - - - - - - - - - - - - - -

-- 
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25 18:18                     ` Per Abrahamsen
@ 2004-03-27  1:26                       ` Robert Dewar
  0 siblings, 0 replies; 85+ messages in thread
From: Robert Dewar @ 2004-03-27  1:26 UTC (permalink / raw)
  To: Per Abrahamsen; +Cc: gcc

Per Abrahamsen wrote:

> You are speaking of different things here.  You speak of wrong
> compared to the standards.  Paul Koning speaks of improved accuracy
> compared to real arithmetics.

Right, but C expressions are about the C language and floating-point,
not about real arithmetic!

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25 18:32                             ` Scott Robert Ladd
@ 2004-03-27  1:28                               ` Robert Dewar
  0 siblings, 0 replies; 85+ messages in thread
From: Robert Dewar @ 2004-03-27  1:28 UTC (permalink / raw)
  To: Scott Robert Ladd; +Cc: gcc mailing list

Scott Robert Ladd wrote:
> Robert Dewar wrote:
> 
>> In Fortran, this is an incorrect interpretation; the standard definitely
>> differentiates between (a+b)+c and a+b+c. So a Fortran compiler is
>> allowed to reassociate the second expression (but it is a bad idea, and
>> it is desirable to have a mode in which this freedom is not taken
>> advantage of).
> 
> 
>  From the latest Committee Draft of Fortran 2003, section 7.8.1.3:
> 
>  - - - - - - - - - - - - - -
> 
> The rules given in 7.2.1 specify the interpretation of a numeric
> intrinsic operation. Once the interpretation has been established in
> accordance with those rules, the processor may evaluate any
> mathematically equivalent expression, provided that the integrity of
> parentheses is not violated.

I copy the entire message, because it is odd. Nothing in the second
paragraph contradicts what I said. On the contrary it is entirely
consistent.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-27  0:58                     ` Laurent GUERBY
@ 2004-03-27  1:16                       ` Joe Buck
  0 siblings, 0 replies; 85+ messages in thread
From: Joe Buck @ 2004-03-27  1:16 UTC (permalink / raw)
  To: Laurent GUERBY
  Cc: Roger Sayle, Robert Dewar, Bradley Lucier, Toon Moene,
	Scott Robert Ladd, gcc

I wrote:
> > IBM has a plan to take the #1 spot once the BlueGene project is complete
> > next year; their architecture is PowerPC based (each of its processor
> > chips has two PowerPC processors on the chip).  Their 1/128th scale
> > prototype is already #73.
 
On Sat, Mar 27, 2004 at 01:10:59AM +0100, Laurent GUERBY wrote:
> That's nice but economics still requires a price tag, and IBM being
> the only provider of this technology means the price will likely be high
> and most people will still buy x86 boxes (well x86_64 boxes of 2005, you
> can buy today 1U quad opterons*), up to a certain size where energy and
> space requirements will dictate something else.

They are going for "fastest", not "best price/performance".  And the buyer
for the first one is a military lab.  I could certainly talk more about
it, but it's off-topic for this list, so I'll stop now.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-27  0:55                   ` Joe Buck
@ 2004-03-27  0:58                     ` Laurent GUERBY
  2004-03-27  1:16                       ` Joe Buck
  0 siblings, 1 reply; 85+ messages in thread
From: Laurent GUERBY @ 2004-03-27  0:58 UTC (permalink / raw)
  To: Joe Buck
  Cc: Roger Sayle, Robert Dewar, Bradley Lucier, Toon Moene,
	Scott Robert Ladd, gcc, gcc-patches

On Sat, 2004-03-27 at 00:53, Joe Buck wrote:
> On Sat, Mar 27, 2004 at 12:20:24AM +0100, Laurent GUERBY wrote:
> > Economics. Less true nowadays with Apple decision to sell cheap bi G5
> > systems, but then it's reflected in the TOP500 with a very nice spot for
> > G5.
> 
> IBM has a plan to take the #1 spot once the BlueGene project is complete
> next year; their architecture is PowerPC based (each of its processor
> chips has two PowerPC processors on the chip).  Their 1/128th scale
> prototype is already #73.

That's nice but economics still requires a price tag, and IBM being
the only provider of this technology means the price will likely be high
and most people will still buy x86 boxes (well x86_64 boxes of 2005, you
can buy today 1U quad opterons*), up to a certain size where energy and
space requirements will dictate something else.

Laurent

* http://www.appro.com/product/server_1142h.asp
no idea if it works :)

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-27  0:50                 ` Laurent GUERBY
@ 2004-03-27  0:55                   ` Joe Buck
  2004-03-27  0:58                     ` Laurent GUERBY
  0 siblings, 1 reply; 85+ messages in thread
From: Joe Buck @ 2004-03-27  0:55 UTC (permalink / raw)
  To: Laurent GUERBY
  Cc: Roger Sayle, Robert Dewar, Bradley Lucier, Toon Moene,
	Scott Robert Ladd, gcc, gcc-patches

On Sat, Mar 27, 2004 at 12:20:24AM +0100, Laurent GUERBY wrote:
> On Thu, 2004-03-25 at 14:06, Roger Sayle wrote:
> > But then it's a complete mystery why so many of the top500
> > supercomputers are now Intel/AMD clusters.
> 
> Economics. Less true nowadays with Apple decision to sell cheap bi G5
> systems, but then it's reflected in the TOP500 with a very nice spot for
> G5.

IBM has a plan to take the #1 spot once the BlueGene project is complete
next year; their architecture is PowerPC based (each of its processor
chips has two PowerPC processors on the chip).  Their 1/128th scale
prototype is already #73.  See

http://www.research.ibm.com/bluegene/

The completed machine will have 2**16 compute nodes and 1024 I/O nodes.
The I/O nodes run a Linux kernel; the compute nodes run a tiny OS that
supports only one process but provides a Posix interface.

 

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25 16:15               ` Roger Sayle
                                   ` (2 preceding siblings ...)
  2004-03-27  0:23                 ` Daniel Egger
@ 2004-03-27  0:50                 ` Laurent GUERBY
  2004-03-27  0:55                   ` Joe Buck
  3 siblings, 1 reply; 85+ messages in thread
From: Laurent GUERBY @ 2004-03-27  0:50 UTC (permalink / raw)
  To: Roger Sayle
  Cc: Robert Dewar, Bradley Lucier, Toon Moene, Scott Robert Ladd, gcc,
	gcc-patches

On Thu, 2004-03-25 at 14:06, Roger Sayle wrote:
> But then it's a complete mystery why so many of the top500
> supercomputers are now Intel/AMD clusters.

Economics. Less true nowadays with Apple decision to sell cheap bi G5
systems, but then it's reflected in the TOP500 with a very nice spot for
G5.

> Whilst I don't deny that there is a tiny population of GCC users whose
> results depend upon the specific representation of their floating point
> formats

People needing this level of precision always use assembly since no
compiler will ever suit them, especially on x86 because of the extended
mess.

Intel Compiler people must have realized this and offer compilers all
their users are happy with.

Laurent


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25 16:15               ` Roger Sayle
  2004-03-25 16:36                 ` David Edelsohn
  2004-03-26  1:29                 ` Toon Moene
@ 2004-03-27  0:23                 ` Daniel Egger
  2004-03-27  0:50                 ` Laurent GUERBY
  3 siblings, 0 replies; 85+ messages in thread
From: Daniel Egger @ 2004-03-27  0:23 UTC (permalink / raw)
  To: Roger Sayle; +Cc: gcc


On 25.03.2004, at 14:06, Roger Sayle wrote:

> I consider myself serious, and make a very nice living from selling
> software to solve finite-difference Poisson-Boltzmann electrostatic
> calculations on regular grids, and molecular minimizations using
> quasi-Newtonian numerical optimizers.  Toon does numerical weather
> forecasting, and he seems happy with -ffast-math.  Laurent performs
> large scale Monte-Carlo simulations, and he also seems happy with it.

Hear, hear. I've written a FEM application myself and have always used
-ffast-math. Since the final precision of the result depends much more
on the number of iterations to reach convergence than on the
improbability of deliberate problem cases, the point is moot anyway.

I wouldn't claim to be a serious fp developer, but it was so obvious
that the results with -ffast-math were always close or even identical
to those without it, and that it made a bigger difference which
architecture the application ran on, that we always turned -ffast-math on.

I expect the instabilities to show up only in cases which a professional
fp developer would rather check for than crunch. Hairy input can always
screw the result, so I'd rather catch that than wonder about strange
output.

> Another common myth is that anyone serious about floating point doesn't
> use the IA-32 architecture for numerical calculations, due to the excess
> precision in floating point calculations.  But then it's a complete
> mystery why so many of the top500 supercomputers are now Intel/AMD
> clusters.

Interestingly, though the code for Intel was *much* better than for PPC,
my PPC machines all finished faster than my higher-clocked AMD machines
on the same FEM code, often by simply requiring fewer iterations to reach
convergence.

Servus,
       Daniel


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25 16:15               ` Roger Sayle
  2004-03-25 16:36                 ` David Edelsohn
@ 2004-03-26  1:29                 ` Toon Moene
  2004-03-27  0:23                 ` Daniel Egger
  2004-03-27  0:50                 ` Laurent GUERBY
  3 siblings, 0 replies; 85+ messages in thread
From: Toon Moene @ 2004-03-26  1:29 UTC (permalink / raw)
  To: Roger Sayle
  Cc: Robert Dewar, Bradley Lucier, Scott Robert Ladd, gcc, gcc-patches

Roger Sayle wrote:

 > Toon does numerical weather
> forecasting, and he seems happy with -ffast-math.

Indeed, and perhaps the incongruous rambling on "erroneous floating
point approximations" is entirely the fault of the group that uses (and
by any theoretical analysis is allowed to use) -ffast-math (or whatever
the option might be called on a compiler from one of our competitors).

For me, it's very simple: If -ffast-math leads to answers that are less 
accurate (in the verification-against-observation-sense) or unphysical 
(against theoretical limit analysis), then I'll inspect my Fortran code 
and repair the formulation (either a single expression or the layout of 
loop code) that is responsible for the mayhem.

Under no circumstances will I give up using -ffast-math.

-- 
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
GNU Fortran 95: http://gcc.gnu.org/fortran/ (under construction)

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 19:50     ` Joe Buck
                         ` (2 preceding siblings ...)
  2004-03-25  7:24       ` Robert Dewar
@ 2004-03-25 18:19       ` Per Abrahamsen
  3 siblings, 0 replies; 85+ messages in thread
From: Per Abrahamsen @ 2004-03-25 18:19 UTC (permalink / raw)
  To: gcc

Joe Buck <Joe.Buck@synopsys.COM> writes:

> No.  Why would we need such a thing?  If the user does not care about
> order of evaluation, the user can write a+b+c .  

I write 

  1.0 * (a + b) + c

all the time in cases where I don't care about order of evaluation.

Or rather, I write 

  const double dt = 1.0; // [h]
  
  double foo (double a /* [kg N/h] */,
              double b /* [kg N/h] */,
              double c /* [kg N] */)
  { return dt * (a + b) + c; }

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25 17:47                     ` David Edelsohn
@ 2004-03-25 18:03                       ` Scott Robert Ladd
  0 siblings, 0 replies; 85+ messages in thread
From: Scott Robert Ladd @ 2004-03-25 18:03 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc

David Edelsohn wrote:
> Scott> I know several types of customers (mine) who are using ICC in preference 
> Scott> to GCC:
> 
> 	What about ICC versus MSVC or other commercial compilers for
> Windows?

I haven't done any serious Windows development in over two years, so my 
information is a tad out-of-date. However, when I *was* doing Windows 
work, the company I worked for did use Intel's compiler for a document 
analysis and data-mining application, because the code was 10% faster 
than MSVC. That was two years ago, of course...

-- 
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25 17:09                   ` Scott Robert Ladd
@ 2004-03-25 17:47                     ` David Edelsohn
  2004-03-25 18:03                       ` Scott Robert Ladd
  0 siblings, 1 reply; 85+ messages in thread
From: David Edelsohn @ 2004-03-25 17:47 UTC (permalink / raw)
  To: Scott Robert Ladd
  Cc: Roger Sayle, Robert Dewar, Bradley Lucier, Toon Moene, gcc

>>>>> Scott Robert Ladd writes:

Scott> I know several types of customers (mine) who are using ICC in preference 
Scott> to GCC:

	What about ICC versus MSVC or other commercial compilers for
Windows?

David

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25 16:36                 ` David Edelsohn
@ 2004-03-25 17:09                   ` Scott Robert Ladd
  2004-03-25 17:47                     ` David Edelsohn
  0 siblings, 1 reply; 85+ messages in thread
From: Scott Robert Ladd @ 2004-03-25 17:09 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Roger Sayle, Robert Dewar, Bradley Lucier, Toon Moene, gcc

David Edelsohn wrote:
> 	What about the more general question of what type of customer uses
> ICC versus other compilers?  Are numericists using ICC?  Is GCC implicitly
> changing strategy with this proposal?

I know several types of customers (mine) who are using ICC in preference 
to GCC:

1) The developer of a real-time video codec, where Intel's C compiler 
produces code that is 15% faster than what is emitted from GCC.

2) A scientific institution writing C and Fortran 95 code for 
multiprocessor workstations, where they need/want OpenMP. Intel has 
OpenMP, GCC doesn't (we're working on it).

3) People who need a complete Fortran 95 (which Intel sells), but who 
don't want to spring for a commercial compiler. Note that Intel's 
Fortran 95 has some rough edges; if people can pay, they generally go 
for Absoft, PGI, or Lahey's products. I use Lahey myself.

Among people who *don't* use the Intel compiler are those who use other
types of processors (including Opteron, since Intel turns off its best
optimizations on non-Intel processors). On my Opteron box, for example,
gfortran is my only choice for a 64-bit hosted, 64-bit code-generating
Fortran 95 compiler.

For Pentium 3 and 4, I find that Intel produces faster code on most 
numerical applications; I have seen little or no effect on accuracy.
Again, this is why we need a *real* accuracy benchmark, so we can 
produce empirical data for understanding the trade-off between 
performance and accuracy.

-- 
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
@ 2004-03-25 16:51 Wolfgang Bangerth
  0 siblings, 0 replies; 85+ messages in thread
From: Wolfgang Bangerth @ 2004-03-25 16:51 UTC (permalink / raw)
  To: gcc
  Cc: Roger Sayle, Robert Dewar, Bradley Lucier, Toon Moene,
	Scott Robert Ladd, gcc, gcc-patches


Roger Sayle wrote:
> I've heard it argued that people who are serious about floating point
> don't use -ffast-math.  I consider myself serious, and make a very nice
> living from selling software to solve finite-difference Poisson-Boltzmann
> electrostatic calculations on regular grids, and molecular minimizations
> using quasi-Newtonian numerical optimizers.  Toon does numerical weather
> forecasting, and he seems happy with -ffast-math.  Laurent performs large
> scale Monte-Carlo simulations, and he also seems happy with it.

I would like to add one more voice to this. We have distributed our finite
element code for some 5 years now, with >200 downloads a month. We haven't,
ever, heard someone complain that we include -ffast-math in our flags for
optimized builds. I have also not ever heard someone in the finite element
community think about the effects of correctly treating NaN etc (if your PDE
simulator generates NaNs, then you have a bug), or worry about the order of
execution of a+b+c. In fact, I most often use parentheses to make code more
readable, and never to force the order of evaluation in a+b+c. If you start
to worry whether a+b+c is different from a+c+b, then the simulator must be so
unstable that any prediction it gives must necessarily be almost useless.

Frankly, it makes me angry if people here keep repeating that "any
self-respecting numerical programmer can't possibly use -ffast-math and must
necessarily know and care which way a+b+c is evaluated". I claim that this
is, in almost all cases, wrong. This view is anal; I can't track the order of
evaluation of operands in a program of a quarter million LOC. Please stop
asserting these things.

If it were up to me, we would make -ffast-math the default with -O2, and 
allow for reassociating arguments by default. If by things like fused 
multiply-add we can save another 10%, then that's almost one day off my 
week-long computations! Those few who do care about these things are free to 
use the respective flags.
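
For concreteness, a minimal sketch of the contraction I mean (the names
are purely illustrative; the fused form is spelled with C99's fma() so
the single rounding is explicit):

  #include <math.h>

  /* separate multiply and add: the product is rounded, then the sum */
  double axpy_plain(double a, double x, double y) { return a * x + y; }

  /* what contraction to a fused multiply-add amounts to: one rounding
     for the whole expression */
  double axpy_fused(double a, double x, double y) { return fma(a, x, y); }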

W.

-------------------------------------------------------------------------
Wolfgang Bangerth              email:            bangerth@ices.utexas.edu
                               www: http://www.ices.utexas.edu/~bangerth/

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25 16:15               ` Roger Sayle
@ 2004-03-25 16:36                 ` David Edelsohn
  2004-03-25 17:09                   ` Scott Robert Ladd
  2004-03-26  1:29                 ` Toon Moene
                                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 85+ messages in thread
From: David Edelsohn @ 2004-03-25 16:36 UTC (permalink / raw)
  To: Roger Sayle
  Cc: Robert Dewar, Bradley Lucier, Toon Moene, Scott Robert Ladd, gcc

>>>>> Roger Sayle writes:

Roger> So my next experiment was to search the internet to see if anyone had
Roger> ever complained about the floating point accuracy of Intel's icc
Roger> compilers.  After an hour or two, I was unable to find a single report;
Roger> SPECcpu2000's fp self-tests all pass, POV-ray images are bit for bit
Roger> identical, etc...

	What about the more general question of what type of customer uses
ICC versus other compilers?  Are numericists using ICC?  Is GCC implicitly
changing strategy with this proposal?

David

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25  8:18             ` Robert Dewar
@ 2004-03-25 16:15               ` Roger Sayle
  2004-03-25 16:36                 ` David Edelsohn
                                   ` (3 more replies)
  0 siblings, 4 replies; 85+ messages in thread
From: Roger Sayle @ 2004-03-25 16:15 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Bradley Lucier, Toon Moene, Scott Robert Ladd, gcc, gcc-patches


On Thu, 25 Mar 2004, Robert Dewar wrote:
> > int foo(double a, double b, double c)
> > {
> >   return (a+b)+c == a+(b+c);
> > }
>
> If it is really true that icc treats this as true at compile
> time in default mode, that's simply appalling in my view, and
> not something gcc should copy!

It really is true.  icc performs two additions and then checks the
result to identify NaNs.  GCC, even with -ffast-math, performs four
additions, and then the comparison.
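
In other words (a sketch of the effect, not icc's actual output): once
the additions are reassociated, the comparison can only be false when an
operand is a NaN, so the function collapses to roughly

  int foo_reassociated(double a, double b, double c)
  {
    double s = (a + b) + c;   /* the two additions that remain */
    return s == s;            /* false only if s is a NaN */
  }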

So my next experiment was to search the internet to see if anyone had
ever complained about the floating point accuracy of Intel's icc
compilers.  After an hour or two, I was unable to find a single report;
SPECcpu2000's fp self-tests all pass, POV-ray images are bit for bit
identical, etc...

However, what I did find was review after review and benchmark after
benchmark showing that icc consistently outperformed GCC at floating
point math and trigonometry.  More annoying personally is that most
reviewers/comparisons initially never used/tried -ffast-math until it
was pointed out to them...

A typical example of what can be found is:
http://news.povray.org/povray.unofficial.patches/thread/%3Cnh5s5v4v5bs90rko424jmdvujd2hk7tn7j%404ax.com%3E/



So although we can all construct non-portable floating point cases where,
with a particular floating point representation on a particular target,
reassociation returns a different result, pragmatically these effects
have almost no impact in the wild.  Changing from double to float, or
from double to long double, moving from VAX to Alpha, using IA-32, or
even causing an extra register spill can cause numeric codes that rely
on a particular reassociation order to fail.  Any code that depends on
the result of "foo" above is already poorly written.


Many "serious" numerical codes make heavy use of matrix algebra via
libraries such as BLAS, LINPACK, ATLAS, EISPACK, etc..., but the fact that
these libraries don't require/specify the order in which inner terms must
be multiplied and added, would seem to support that in-real-life
reassociation is very well tolerated in numerical codes.  Indeed one of
the major reasons for using BLAS in numerical codes, is to take advantage
of hard-coded reassociation with different CPUs/cache sizes performing
multiplications and additions in dramatically different orders.

I've heard it argued that people who are serious about floating point
don't use -ffast-math.  I consider myself serious, and make a very nice
living from selling software to solve finite-difference Poisson-Boltzmann
electrostatic calculations on regular grids, and molecular minimizations
using quasi-Newtonian numerical optimizers.  Toon does numerical weather
forecasting, and he seems happy with -ffast-math.  Laurent performs large
scale Monte-Carlo simulations, and he also seems happy with it.

Another common myth is that anyone serious about floating point doesn't
use the IA-32 architecture for numerical calculations, due to the excess
precision in floating point calculations.  But then it's a complete 
mystery why so many of the top500 supercomputers are now Intel/AMD 
clusters.


Whilst I don't deny that there is a tiny population of GCC users whose
results depend upon the specific representation of their floating point
formats, whose "discretization" to a fixed number of bits is a requirement
rather than an unfortunate feature/side-effect of current hardware
limitations, it does seem very unfair to handicap GCC for the vast
majority.

I completely disagree that reassociation is "not something gcc should
copy".  But perhaps one could argue that the reason GCC shouldn't ever
perform reassociation even with -ffast-math, whilst icc performs it by
default, is because there's no overlap between our intended user bases
or that Intel's superior performance is not something GCC's users want?

Roger
--

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25  5:36         ` Gabriel Dos Reis
@ 2004-03-25  8:46           ` Robert Dewar
  0 siblings, 0 replies; 85+ messages in thread
From: Robert Dewar @ 2004-03-25  8:46 UTC (permalink / raw)
  To: Gabriel Dos Reis
  Cc: Toon Moene, Joe Buck, Bradley Lucier, roger, gcc, abraham,
	gcc-patches, laurent, fjahanian

Gabriel Dos Reis wrote:

> The C++ standard clearly says that re-association is permitted only if
> the re-associated expression gives the same result as the original.
> So A+B+C really is (A+B)+C.

In Ada, the rule is a little more liberal: the reassociation is allowed
if the result is in the same model interval. In practice, though, either
criterion forbids the reassociation.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25  7:24       ` Robert Dewar
@ 2004-03-25  8:28         ` Gabriel Dos Reis
  0 siblings, 0 replies; 85+ messages in thread
From: Gabriel Dos Reis @ 2004-03-25  8:28 UTC (permalink / raw)
  To: Robert Dewar
  Cc: Joe Buck, Bradley Lucier, roger, gcc, abraham, gcc-patches, toon,
	laurent, fjahanian

Robert Dewar <dewar@gnat.com> writes:

| I dislike anything being called an optimization, unsafe or not, when it
| is not an optimization but in fact a distortion of required language
| semantics.

In the past, I've argued along the same line and proposed we speak in
terms of "transformations". 

-- Gaby

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25  0:11           ` Roger Sayle
  2004-03-25  5:56             ` Scott Robert Ladd
  2004-03-25  6:07             ` Bradley Lucier
@ 2004-03-25  8:18             ` Robert Dewar
  2004-03-25 16:15               ` Roger Sayle
  2 siblings, 1 reply; 85+ messages in thread
From: Robert Dewar @ 2004-03-25  8:18 UTC (permalink / raw)
  To: Roger Sayle
  Cc: Bradley Lucier, Toon Moene, Scott Robert Ladd, gcc, gcc-patches

Roger Sayle wrote:

> May I remind everyone that the subject title "GCC beaten by ICC in
> stupid trig test!" refers to a posting by Scott Robert Ladd in which
> Intel's icc compiler generates floating point code 64x faster than
> gcc 3.3.3 (http://gcc.gnu.org/ml/gcc/2004-03/msg00634.html).
> 
> Would anyone like to hazard a guess at how many floating point additions
> Intel icc v7 generates for the following function?  Even with icc's
> default arguments, i.e. "icc foo.c"?
> 
> int foo(double a, double b, double c)
> {
>   return (a+b)+c == a+(b+c);
> }

If it is really true that icc treats this as true at compile
time in default mode, that's simply appalling in my view, and
not something gcc should copy!

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 20:14   ` Paul Koning
  2004-03-24 21:00     ` Joe Buck
  2004-03-24 21:07     ` Joseph S. Myers
@ 2004-03-25  7:29     ` Robert Dewar
  2 siblings, 0 replies; 85+ messages in thread
From: Robert Dewar @ 2004-03-25  7:29 UTC (permalink / raw)
  To: Paul Koning; +Cc: Joe.Buck, lucier, gcc

Paul Koning wrote:

> I don't have a C standard, but my copy of Harbison & Steele says what
> I expected about parentheses: "Parentheses do not necessarily force a
> particular evaluation order".  

Harbison and Steele, whether read carefully or not :-) should not be
considered a substitute for the standard.

> So as far as I can tell, by the language rules, (a+b)+c and a+(b+c)
> are the same -- they have the same ordering properties (or lack
> thereof). 

I am unaware of any authority for this statement.
> 
> Is the implication that if -fno-fast-math is in effect, parentheses
> acquire an ADDITIONAL semantic (evaluation order) that goes beyond the
> C language definition (forcing operand grouping)?

I am unaware of any authority for this statement.
> 
> Curious that C++ (at least as described in Stroustrup) *does* say that
> parentheses force evaluation order...

Not curious at all; of course parentheses control the evaluation order,
e.g. in (a+b)*c.

> 
> 	    paul

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 19:50     ` Joe Buck
  2004-03-24 23:48       ` Toon Moene
  2004-03-25  5:34       ` Gabriel Dos Reis
@ 2004-03-25  7:24       ` Robert Dewar
  2004-03-25  8:28         ` Gabriel Dos Reis
  2004-03-25 18:19       ` Per Abrahamsen
  3 siblings, 1 reply; 85+ messages in thread
From: Robert Dewar @ 2004-03-25  7:24 UTC (permalink / raw)
  To: Joe Buck
  Cc: Bradley Lucier, roger, gdr, gcc, abraham, gcc-patches, toon,
	laurent, fjahanian

Joe Buck wrote:

>>I'm aware of the usual examples.  Do you think that we need yet another 
>>fast-math flag to allow the compiler to reassociate values in 
>>floating-point arithmetic? -freally-unsafe-math-optimizations perhaps?  
>>Or maybe such a programmer as you describe will just turn off 
>>-funsafe-math-optimizations.
> 
> 
> No.  Why would we need such a thing?  If the user does not care about
> order of evaluation, the user can write a+b+c .

That's an appropriate statement for Fortran, but not for other languages
like C, C++ and Ada, where a+b+c is semantically equivalent to (a+b)+c
and the reassociation is not allowed in either case.

I dislike anything being called an optimization, unsafe or not, when it
is not an optimization but in fact a distortion of required language
semantics.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25  0:11           ` Roger Sayle
  2004-03-25  5:56             ` Scott Robert Ladd
@ 2004-03-25  6:07             ` Bradley Lucier
  2004-03-25  8:18             ` Robert Dewar
  2 siblings, 0 replies; 85+ messages in thread
From: Bradley Lucier @ 2004-03-25  6:07 UTC (permalink / raw)
  To: Roger Sayle
  Cc: Bradley Lucier, Toon Moene, Scott Robert Ladd, gcc, gcc-patches

Oh, I'm all in favor of re-associating with 
-funsafe-math-optimizations, especially to enable fused multiply-add 
instructions. I just wanted to point out to Joe that we were going to 
do this since he seemed to have been dead-set against it.

Brad

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25  0:11           ` Roger Sayle
@ 2004-03-25  5:56             ` Scott Robert Ladd
  2004-03-25  6:07             ` Bradley Lucier
  2004-03-25  8:18             ` Robert Dewar
  2 siblings, 0 replies; 85+ messages in thread
From: Scott Robert Ladd @ 2004-03-25  5:56 UTC (permalink / raw)
  To: Roger Sayle; +Cc: gcc

Roger Sayle wrote:
> May I remind everyone that the subject title "GCC beaten by ICC in
> stupid trig test!" refers to a posting by Scott Robert Ladd...

This thread has wandered a tad far from its original topic, hasn't it? 
I also note that my original subject did not contain the word "stupid", 
which was added by someone with some badly mistaken notions and a 
rude manner.

Back to the original topic: Doing a "fair" comparison between ICC and 
GCC is problematic at best, given that neither compiler provides 
complete documentation about its various options. For example, it 
appears that ICC does "unsafe" math by default, leading me to suspect 
that I should use ICC's -mp or -mp1 switches when comparing against GCC. 
But I'm not certain if "icc -mp" is *really* equivalent to a plain "gcc" 
(sans -ffast-math).

At least in the case of GCC, I can study the source code to find every 
instance where -ffast-math affects code generation... however, the 
average compiler user has neither the skills nor the time to examine the 
compiler source code for indications of its behavior.

What a numerical programmer needs to know is: How, exactly, do all of 
these switches affect accuracy?

Accuracy benchmarks are few and far between, and many are hoary old 
codes translated badly from antiquated versions of Fortran. I've found 
that most of these "accuracy" benchmarks produce identical results with 
and without -ffast-math; when there are differences, they are trivial, and 
in one case, -ffast-math actually *improved* accuracy.

Of course, the definition of "accuracy" is somewhat nebulous. For some 
programs, it is important that identical results be produced on any 
platform; for other programs, accuracy reflects precision.

And, of course, most programmers forget the mathematical rules about 
significant digits in source data. If I multiply 1.5 by 3.1415927, the 
answer is 4.7, not 4.71238905, unless I know for a fact that 1.5 is 
exact, and not some measurement of a value between 1.49 and 1.51 (for 
example).

Ah, but I now enter the realm of interval arithmetic, and am drifting 
from my own topic.

Hence my desire to write a new accuracy benchmark, something I'm doing 
whenever I have some of that elusive "free time." ;)

-- 
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 23:48       ` Toon Moene
  2004-03-25  0:02         ` Toon Moene
@ 2004-03-25  5:36         ` Gabriel Dos Reis
  2004-03-25  8:46           ` Robert Dewar
  1 sibling, 1 reply; 85+ messages in thread
From: Gabriel Dos Reis @ 2004-03-25  5:36 UTC (permalink / raw)
  To: Toon Moene
  Cc: Joe Buck, Bradley Lucier, roger, gcc, abraham, gcc-patches,
	laurent, fjahanian

Toon Moene <toon@moene.indiv.nluug.nl> writes:

| Joe Buck wrote:
| 
| > No.  Why would we need such a thing?  If the user does not care about
| > order of evaluation, the user can write a+b+c .  As someone said, right
| > now we can't tell the difference between a+b+c; if we turn it into GIMPLE
| > and make t1 = a+b; t2 = t1+c; we can't tell if the user initially wrote
| > (a+b)+c or a+b+c.
| 
| This is true as far as Fortran is concerned (I cannot speak for other
| languages).
| 
| A+B+C means that the compiler could either evaluate (A+B)+C or A+(B+C)
| or (A+C)+B.

The C++ standard clearly says that re-association is permitted only if
the re-associated expression gives the same result as the original.
So A+B+C really is (A+B)+C.

-- Gaby

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 19:50     ` Joe Buck
  2004-03-24 23:48       ` Toon Moene
@ 2004-03-25  5:34       ` Gabriel Dos Reis
  2004-03-25  7:24       ` Robert Dewar
  2004-03-25 18:19       ` Per Abrahamsen
  3 siblings, 0 replies; 85+ messages in thread
From: Gabriel Dos Reis @ 2004-03-25  5:34 UTC (permalink / raw)
  To: Joe Buck
  Cc: Bradley Lucier, roger, gcc, abraham, gcc-patches, toon, laurent,
	fjahanian


[ Apologies,  I've been reading diagonally through the mails; many
  messages contain  inaccuracies and misleading statements.  I'm
  responding to Joe's message, not because it is such message but
  because it contains what I wanted to point to. ]


Joe Buck <Joe.Buck@synopsys.com> writes:

| On Mar 24, 2004, at 1:51 PM, Joe Buck wrote:
| > > And no, I'm not being pedantic; in many scientific apps, the programmer
| > > is aware of the expected range of values the variables will have, and
| > > will deliberately arrange the operations so that variables of similar
| > > magnitudes are combined.
| 
| On Wed, Mar 24, 2004 at 01:56:16PM -0500, Bradley Lucier wrote:
| > I'm aware of the usual examples.  Do you think that we need yet another 
| > fast-math flag to allow the compiler to reassociate values in 
| > floating-point arithmetic? -freally-unsafe-math-optimizations perhaps?  
| > Or maybe such a programmer as you describe will just turn off 
| > -funsafe-math-optimizations.
| 
| No.  Why would we need such a thing?  If the user does not care about
| order of evaluation, the user can write a+b+c .  As someone said, right
| now we can't tell the difference between a+b+c;

But we should.  The FORTRAN standard allows some liberties with
re-association that some other languages don't.  For example, in C++
a+b+c means (a+b)+c whether the parentheses are there or not.
The C++ standard is clear on that.
We should be very careful about that fact.  I semi-joked previously
about -ffast-math but now I'm seeing that some messages in this thread
(not yours, Joe!) don't quite get it.

-- gaby

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-25  0:02         ` Toon Moene
@ 2004-03-25  0:11           ` Roger Sayle
  2004-03-25  5:56             ` Scott Robert Ladd
                               ` (2 more replies)
  0 siblings, 3 replies; 85+ messages in thread
From: Roger Sayle @ 2004-03-25  0:11 UTC (permalink / raw)
  To: Bradley Lucier, Toon Moene, Scott Robert Ladd; +Cc: gcc, gcc-patches


May I remind everyone that the subject title "GCC beaten by ICC in
stupid trig test!" refers to a posting by Scott Robert Ladd in which
Intel's icc compiler generates floating point code 64x faster than
gcc 3.3.3 (http://gcc.gnu.org/ml/gcc/2004-03/msg00634.html).

Would anyone like to hazard a guess at how many floating point additions
Intel icc v7 generates for the following function?  Even with icc's
default arguments, i.e. "icc foo.c"?

int foo(double a, double b, double c)
{
  return (a+b)+c == a+(b+c);
}


It might surprise some people that even with -funsafe-math-optimizations
and/or -ffast-math GCC can't generate code with that performance/accuracy
trade-off!

Roger
--

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 23:48       ` Toon Moene
@ 2004-03-25  0:02         ` Toon Moene
  2004-03-25  0:11           ` Roger Sayle
  2004-03-25  5:36         ` Gabriel Dos Reis
  1 sibling, 1 reply; 85+ messages in thread
From: Toon Moene @ 2004-03-25  0:02 UTC (permalink / raw)
  Cc: Joe.Buck, Bradley Lucier, roger, gdr, gcc, abraham,
	gcc-patches, laurent, fjahanian

I wrote:

> Dave Korn wrote:
> 
>  >   IIUIC the fact that the + operator is specified in the standard as
>  > binding left-to-right imply that "a + b + c" with no brackets *has* to
>  > be interpreted as "(a + b) + c".
> 
> No, it establishes the way the multi-operator expressions may be 
> combined *for interpretation*.  I.e., what the expression means (on a 
> high level, before applying such rules as 1.4 (6) "[This standard does 
> not specify] ... the method of rounding, approximating or computing 
> numeric values on a particular processor").

This might not be as clear as possible, because '+' happens to be 
associative in the REAL world.

What the Fortran standard wants to make clear is that a grouping is 
necessary to establish an interpretation for unparenthesized expressions 
with more than one operator.

The example with '+' is trivial, because '+' is associative in the real 
numbers, so a standard-enforced ordering is superfluous.

However, this is not true of '/' (division) or '**' (exponentiation). 
Therefore, the standard establishes that for division:

A / B / C   means   (A / B) / C   and not   A / (B / C)

and for exponentiation:

A ** B ** C   means   A ** (B ** C)   and not   (A ** B) ** C

Both interpretations follow mathematical practice, but still have to be 
defined for the standard.

Hope this helps,

-- 
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
GNU Fortran 95: http://gcc.gnu.org/fortran/ (under construction)

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 19:50     ` Joe Buck
@ 2004-03-24 23:48       ` Toon Moene
  2004-03-25  0:02         ` Toon Moene
  2004-03-25  5:36         ` Gabriel Dos Reis
  2004-03-25  5:34       ` Gabriel Dos Reis
                         ` (2 subsequent siblings)
  3 siblings, 2 replies; 85+ messages in thread
From: Toon Moene @ 2004-03-24 23:48 UTC (permalink / raw)
  To: Joe Buck
  Cc: Bradley Lucier, roger, gdr, gcc, abraham, gcc-patches, laurent,
	fjahanian

Joe Buck wrote:

> No.  Why would we need such a thing?  If the user does not care about
> order of evaluation, the user can write a+b+c .  As someone said, right
> now we can't tell the difference between a+b+c; if we turn it into GIMPLE
> and make t1 = a+b; t2 = t1+c; we can't tell if the user initially wrote
> (a+b)+c or a+b+c. 

This is true as far as Fortran is concerned (I cannot speak for other 
languages).

A+B+C means that the compiler could either evaluate (A+B)+C or A+(B+C) 
or (A+C)+B.

If you don't write the parentheses, you leave it up to the compiler to 
find the most convenient sequence of computations.

Dave Korn wrote:

 >   IIUIC the fact that the + operator is specified in the standard as
 > binding left-to-right imply that "a + b + c" with no brackets *has* to
 > be interpreted as "(a + b) + c".

No, it establishes the way the multi-operator expressions may be 
combined *for interpretation*.  I.e., what the expression means (on a 
high level, before applying such rules as 1.4 (6) "[This standard does 
not specify] ... the method of rounding, approximating or computing 
numeric values on a particular processor").

-- 
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
GNU Fortran 95: http://gcc.gnu.org/fortran/ (under construction)

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 21:44       ` Joe Buck
@ 2004-03-24 22:49         ` Joseph S. Myers
  0 siblings, 0 replies; 85+ messages in thread
From: Joseph S. Myers @ 2004-03-24 22:49 UTC (permalink / raw)
  To: Joe Buck; +Cc: Paul Koning, lucier, gcc

On Wed, 24 Mar 2004, Joe Buck wrote:

> So: does this mean that a conforming C compiler is not permitted, for
> 
> double add(double a, double b, double c) { return a + b + c;}
> 
> to generate the equivalent of
> 
>    double t1 = a + c;
>    return t1 + b;
> 
> as a Fortran compiler is?

Not if it defines __STDC_IEC_559__.  Otherwise, 5.2.4.2.2#4 states that
the implementation may state that the accuracy of floating-point
operations is unknown.

-ffast-math isn't necessarily intended as a mode that conforms even to ISO
C, but clearly most of the component flags should disable __STDC_IEC_559__
(except -fno-math-errno, where math_errhandling should take on whatever
value is consistent with all translation units in the program,
-fno-rounding-math (default), since this should only determine the
FENV_ACCESS default, and -fno-signaling-nans (default), since ISO C
doesn't support signaling NaNs).  (The default of CX_LIMITED_RANGE is
meant to be "off".  We only implement "on", but if the pragma ever gets
implemented then I expect -ffast-math would also make the default be the
nonconforming "on".)

GCC doesn't define __STDC_IEC_559__.  Glibc does, unconditionally.  This
is broken since (a) it must be constant throughout the translation unit,
regardless of inclusion of standard headers, (b) it should be disabled by
any one of -funsafe-math-optimizations, -ffinite-math-only and
-fno-trapping-math (but we don't have predefined macros for all the
separate flags in -ffast-math to let glibc do this - we should).
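
For reference, this is the macro a program can test directly (a trivial,
self-contained example, nothing glibc-specific):

  #include <stdio.h>

  int main(void)
  {
  #ifdef __STDC_IEC_559__
    puts("implementation claims IEC 60559 (Annex F) floating point");
  #else
    puts("no Annex F conformance claimed");
  #endif
    return 0;
  }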

-- 
Joseph S. Myers
jsm@polyomino.org.uk

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 22:19       ` Richard Guenther
@ 2004-03-24 22:21         ` Dale Johannesen
  0 siblings, 0 replies; 85+ messages in thread
From: Dale Johannesen @ 2004-03-24 22:21 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Joe.Buck, Paul Koning, gcc, Dale Johannesen, Joseph S. Myers, lucier


On Mar 24, 2004, at 11:39 AM, Richard Guenther wrote:

> Joseph S. Myers wrote:
>> On Wed, 24 Mar 2004, Paul Koning wrote:
>>> I don't have a C standard, but my copy of Harbison & Steele says what
>>> I expected about parentheses: "Parentheses do not necessarily force a
>>> particular evaluation order".
>> Evaluation order (sequence points) has nothing to do with 
>> associativity.  The C syntax specifies that a+b+c means exactly 
>> (a+b)+c - but in both cases, a, b and c can be evaluated in any 
>> order.
>
> Browsing through the standard I cannot find anything supporting this. 
> Can you point me to the right section?  You may be referring to 6.5 
> (3), but that is overly vague and doesn't mention grouping by 
> parentheses at all ("The grouping of operators and operands is 
> specified by the syntax").  And 6.5.1 (5) seems to be ambiguous, too.
>
> Apart from evaluation order, I only can find phrases weakening control 
> over FP, like 6.5 paragraphs 5 and 8.

5.1.2.3 discusses the matter at length.


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 21:00     ` Joe Buck
@ 2004-03-24 22:20       ` Dale Johannesen
  0 siblings, 0 replies; 85+ messages in thread
From: Dale Johannesen @ 2004-03-24 22:20 UTC (permalink / raw)
  To: Joe Buck; +Cc: gcc, Dale Johannesen, Paul Koning, lucier


On Mar 24, 2004, at 11:12 AM, Joe Buck wrote:

> On Wed, Mar 24, 2004 at 02:10:13PM -0500, Paul Koning wrote:
>>  Joe> Consider a, b, and c as single precision floating point values,
>>  Joe> and a=1, b=-1, c=1.2345e-8.  (a+b)+c will compute as 1.2345e-8.
>>  Joe> a+(b+c) will return zero, as will (a+c)+b.
>>
>> I don't have a C standard, but my copy of Harbison & Steele says what
>> I expected about parentheses: "Parentheses do not necessarily force a
>> particular evaluation order".
>
> This was true of K&R C, but I seem to recall that the standards 
> committee
> changed that.  Any standards gurus out there care to comment?

This is correct.  My copy of H&S (1987) is explicit that reassociation 
is allowed
even when that would change the result.  It does have an appendix on
Draft Proposed ANSI C, as it was then, but that also does not mention 
the rule
that became standard; apparently this was introduced late in the 
standardization
process.  (It does list unary + as a means of forcing order of 
evaluation, thus:
   (+(a+b))+c
Unary minus did not have this effect.  I don't much like the rule 
that's in the standard,
but I have to admit it's better than that.)

The current standards define order of evaluation implicitly from the 
parsing order.
Rearrangements are permitted only if they can't change the result 
(generally true
for integer, not for FP).  Thus, it is more restrictive than Fortran, 
whereas K&R C was less so.

But weren't we talking about -ffast-math?  Standards are irrelevant to 
that discussion.
-ffast-math is for things that aren't safe according to the standard, 
but are frequently
useful; that's what it's for.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 21:07     ` Joseph S. Myers
  2004-03-24 21:38       ` Joe Buck
  2004-03-24 21:44       ` Joe Buck
@ 2004-03-24 22:19       ` Richard Guenther
  2004-03-24 22:21         ` Dale Johannesen
  2 siblings, 1 reply; 85+ messages in thread
From: Richard Guenther @ 2004-03-24 22:19 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Paul Koning, Joe.Buck, lucier, gcc

Joseph S. Myers wrote:
> On Wed, 24 Mar 2004, Paul Koning wrote:
> 
> 
>>I don't have a C standard, but my copy of Harbison & Steele says what
>>I expected about parentheses: "Parentheses do not necessarily force a
>>particular evaluation order".  
> 
> 
> Evaluation order (sequence points) has nothing to do with associativity.  
> The C syntax specifies that a+b+c means exactly (a+b)+c - but in both 
> cases, a, b and c can be evaluated in any order.

Browsing through the standard I cannot find anything supporting this. 
Can you point me to the right section?  You may be referring to 6.5 (3), 
but that is overly vague and doesn't mention grouping by parentheses at 
all ("The grouping of operators and operands is specified by the 
syntax").  And 6.5.1 (5) seems to be ambiguous, too.

Apart from evaluation order, I only can find phrases weakening control 
over FP, like 6.5 paragraphs 5 and 8.

Richard.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 21:07     ` Joseph S. Myers
  2004-03-24 21:38       ` Joe Buck
@ 2004-03-24 21:44       ` Joe Buck
  2004-03-24 22:49         ` Joseph S. Myers
  2004-03-24 22:19       ` Richard Guenther
  2 siblings, 1 reply; 85+ messages in thread
From: Joe Buck @ 2004-03-24 21:44 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Paul Koning, lucier, gcc

On Wed, Mar 24, 2004 at 07:14:43PM +0000, Joseph S. Myers wrote:
> Evaluation order (sequence points) has nothing to do with associativity.  
> The C syntax specifies that a+b+c means exactly (a+b)+c - but in both 
> cases, a, b and c can be evaluated in any order.

So: does this mean that a conforming C compiler is not permitted, for

double add(double a, double b, double c) { return a + b + c;}

to generate the equivalent of

   double t1 = a + c;
   return t1 + b;

as a Fortran compiler is?

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 21:07     ` Joseph S. Myers
@ 2004-03-24 21:38       ` Joe Buck
  2004-03-24 21:44       ` Joe Buck
  2004-03-24 22:19       ` Richard Guenther
  2 siblings, 0 replies; 85+ messages in thread
From: Joe Buck @ 2004-03-24 21:38 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Paul Koning, lucier, gcc

On Wed, Mar 24, 2004 at 07:14:43PM +0000, Joseph S. Myers wrote:
> On Wed, 24 Mar 2004, Paul Koning wrote:
> 
> > I don't have a C standard, but my copy of Harbison & Steele says what
> > I expected about parentheses: "Parentheses do not necessarily force a
> > particular evaluation order".  
> 
> Evaluation order (sequence points) has nothing to do with associativity.  
> The C syntax specifies that a+b+c means exactly (a+b)+c - but in both 
> cases, a, b and c can be evaluated in any order.

Exactly; evaluation order is completely separate from the question at
hand.  Evaluation order means that if a, b, and c are actually
expressions, these expressions can be evaluated in any order.  For
associativity, the question is whether a must be added to b, and the
result added to c, or if it is permissible to combine the operands in a
different way.
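
A small example of the distinction, with f, g and h as placeholder
functions:

  double f(void), g(void), h(void);

  double sum(void)
  {
    /* Evaluation order: the three calls may happen in any order.
       Associativity: the additions must group as (f() + g()) + h(). */
    return f() + g() + h();
  }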


^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 20:14   ` Paul Koning
  2004-03-24 21:00     ` Joe Buck
@ 2004-03-24 21:07     ` Joseph S. Myers
  2004-03-24 21:38       ` Joe Buck
                         ` (2 more replies)
  2004-03-25  7:29     ` Robert Dewar
  2 siblings, 3 replies; 85+ messages in thread
From: Joseph S. Myers @ 2004-03-24 21:07 UTC (permalink / raw)
  To: Paul Koning; +Cc: Joe.Buck, lucier, gcc

On Wed, 24 Mar 2004, Paul Koning wrote:

> I don't have a C standard, but my copy of Harbison & Steele says what
> I expected about parentheses: "Parentheses do not necessarily force a
> particular evaluation order".  

Evaluation order (sequence points) has nothing to do with associativity.  
The C syntax specifies that a+b+c means exactly (a+b)+c - but in both 
cases, a, b and c can be evaluated in any order.

-- 
Joseph S. Myers
jsm@polyomino.org.uk

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 20:14   ` Paul Koning
@ 2004-03-24 21:00     ` Joe Buck
  2004-03-24 22:20       ` Dale Johannesen
  2004-03-24 21:07     ` Joseph S. Myers
  2004-03-25  7:29     ` Robert Dewar
  2 siblings, 1 reply; 85+ messages in thread
From: Joe Buck @ 2004-03-24 21:00 UTC (permalink / raw)
  To: Paul Koning; +Cc: lucier, gcc

On Wed, Mar 24, 2004 at 02:10:13PM -0500, Paul Koning wrote:
>  Joe> Consider a, b, and c as single precision floating point values,
>  Joe> and a=1, b=-1, c=1.2345e-8.  (a+b)+c will compute as 1.2345e-8.
>  Joe> a+(b+c) will return zero, as will (a+c)+b. 
> 
> I don't have a C standard, but my copy of Harbison & Steele says what
> I expected about parentheses: "Parentheses do not necessarily force a
> particular evaluation order".  

This was true of K&R C, but I seem to recall that the standards committee
changed that.  Any standards gurus out there care to comment?

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 19:29 ` Joe Buck
  2004-03-24 19:43   ` Bradley Lucier
@ 2004-03-24 20:14   ` Paul Koning
  2004-03-24 21:00     ` Joe Buck
                       ` (2 more replies)
  1 sibling, 3 replies; 85+ messages in thread
From: Paul Koning @ 2004-03-24 20:14 UTC (permalink / raw)
  To: Joe.Buck; +Cc: lucier, gcc

>>>>> "Joe" == Joe Buck <Joe.Buck@synopsys.COM> writes:

 Joe> It's OK for -ffast-math to make the kind of transformation that
 Joe> might lose the last bit or two of an IEEE FP result.  However,
 Joe> disregarding parentheses will frequently throw away far more
 Joe> precision than that.

 Joe> Consider a, b, and c as single precision floating point values,
 Joe> and a=1, b=-1, c=1.2345e-8.  (a+b)+c will compute as 1.2345e-8.
 Joe> a+(b+c) will return zero, as will (a+c)+b. 

I don't have a C standard, but my copy of Harbison & Steele says what
I expected about parentheses: "Parentheses do not necessarily force a
particular evaluation order".  

So as far as I can tell, by the language rules, (a+b)+c and a+(b+c)
are the same -- they have the same ordering properties (or lack
thereof). 

Is the implication that if -fno-fast-math is in effect, parentheses
acquire an ADDITIONAL semantic (evaluation order) that goes beyond the
C language definition (forcing operand grouping)?

Curious that C++ (at least as described in Stroustrup) *does* say that
parentheses force evaluation order...

	    paul

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 19:43   ` Bradley Lucier
@ 2004-03-24 19:50     ` Joe Buck
  2004-03-24 23:48       ` Toon Moene
                         ` (3 more replies)
  0 siblings, 4 replies; 85+ messages in thread
From: Joe Buck @ 2004-03-24 19:50 UTC (permalink / raw)
  To: Bradley Lucier
  Cc: roger, gdr, gcc, abraham, gcc-patches, toon, laurent, fjahanian


On Mar 24, 2004, at 1:51 PM, Joe Buck wrote:
> > And no, I'm not being pedantic; in many scientific apps, the programmer
> > is aware of the expected range of values the variables will have, and
> > will deliberately arrange the operations so that variables of similar
> > magnitudes are combined.

On Wed, Mar 24, 2004 at 01:56:16PM -0500, Bradley Lucier wrote:
> I'm aware of the usual examples.  Do you think that we need yet another 
> fast-math flag to allow the compiler to reassociate values in 
> floating-point arithmetic? -freally-unsafe-math-optimizations perhaps?  
> Or maybe such a programmer as you describe will just turn off 
> -funsafe-math-optimizations.

No.  Why would we need such a thing?  If the user does not care about
order of evaluation, the user can write a+b+c .  As someone said, right
now we can't tell the difference between a+b+c; if we turn it into GIMPLE
and make t1 = a+b; t2 = t1+c; we can't tell if the user initially wrote
(a+b)+c or a+b+c.  I suspect that the main reason people want to go in
this direction is to speed up the second case, where the user has not
specified an order.  So, rather than breaking the first case, we need to be
able to distinguish the two cases.  I don't know how this should be done;
maybe flag the temporary to say that t1 is required to be computed as is
in the explicit parentheses case, and not required in the other case.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 19:29 ` Joe Buck
@ 2004-03-24 19:43   ` Bradley Lucier
  2004-03-24 19:50     ` Joe Buck
  2004-03-24 20:14   ` Paul Koning
  1 sibling, 1 reply; 85+ messages in thread
From: Bradley Lucier @ 2004-03-24 19:43 UTC (permalink / raw)
  To: Joe Buck
  Cc: roger, gdr, gcc, abraham, Bradley Lucier, gcc-patches, toon,
	laurent, fjahanian


On Mar 24, 2004, at 1:51 PM, Joe Buck wrote:

> And no, I'm not being pedantic; in many scientific apps, the programmer
> is aware of the expected range of values the variables will have, and
> will deliberately arrange the operations so that variables of similar
> magnitudes are combined.

I'm aware of the usual examples.  Do you think that we need yet another 
fast-math flag to allow the compiler to reassociate values in 
floating-point arithmetic? -freally-unsafe-math-optimizations perhaps?  
Or maybe such a programmer as you describe will just turn off 
-funsafe-math-optimizations.

Brad

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-24 19:04 Bradley Lucier
@ 2004-03-24 19:29 ` Joe Buck
  2004-03-24 19:43   ` Bradley Lucier
  2004-03-24 20:14   ` Paul Koning
  0 siblings, 2 replies; 85+ messages in thread
From: Joe Buck @ 2004-03-24 19:29 UTC (permalink / raw)
  To: Bradley Lucier
  Cc: roger, gdr, gcc, abraham, gcc-patches, toon, laurent, fjahanian

On Wed, Mar 24, 2004 at 01:21:32PM -0500, Bradley Lucier wrote:
> Re:
> 
> > On Tue, Mar 23, 2004 at 10:32:13PM -0500, Robert Dewar wrote:
> > > For example, if -ffast-math is so sloppy as to consider
> > > that (a+b)+c can be replaced by a+(b+c), then all bets are off.
> >
> > That's why -ffast-math doesn't do that; such a transformation would be
> > massively brain-damaged.
> 
> Well, -ffast-math (or more specifically, -funsafe-math-optimizations) 
> is about to do this, at least in some cases on tree-ssa, see
> 
> http://gcc.gnu.org/ml/gcc-patches/2004-03/msg01891.html
> 
> If you think that this transformation shouldn't be done, then I guess 
> now is the time to speak up.

It's OK for -ffast-math to make the kind of transformation that might
lose the last bit or two of an IEEE FP result.  However, disregarding 
parentheses will frequently throw away far more precision than that.

Consider a, b, and c as single precision floating point values, and
a=1, b=-1, c=1.2345e-8.  (a+b)+c will compute as 1.2345e-8.  a+(b+c)
will return zero, as will (a+c)+b.  *All* the precision of the result,
every single bit, is lost.  This is because FLT_EPSILON is 1.1920929e-07F,
and adding a value smaller than FLT_EPSILON/2 to -1.0F just gives back
-1.0F, so the final sum is exactly zero.  Now, on x86 you might not see
this because the value is computed to 80 bits, but
what if you spill the partial sum?

And no, I'm not being pedantic; in many scientific apps, the programmer
is aware of the expected range of values the variables will have, and
will deliberately arrange the operations so that variables of similar
magnitudes are combined.
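
A throwaway test program makes the loss visible (assuming a target that
really computes float arithmetic in single precision; as noted above,
x87 excess precision can hide it):

  #include <stdio.h>

  int main(void)
  {
    float a = 1.0f, b = -1.0f, c = 1.2345e-8f;
    printf("(a+b)+c = %g\n", (a + b) + c);   /* 1.2345e-08 */
    printf("a+(b+c) = %g\n", a + (b + c));   /* 0: c is absorbed into b */
    return 0;
  }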

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
@ 2004-03-24 19:04 Bradley Lucier
  2004-03-24 19:29 ` Joe Buck
  0 siblings, 1 reply; 85+ messages in thread
From: Bradley Lucier @ 2004-03-24 19:04 UTC (permalink / raw)
  To: joe.buck
  Cc: roger, gdr, gcc, abraham, Bradley Lucier, gcc-patches, toon,
	laurent, fjahanian

Re:

> On Tue, Mar 23, 2004 at 10:32:13PM -0500, Robert Dewar wrote:
> > For example, if -ffast-math is so sloppy as to consider
> > that (a+b)+c can be replaced by a+(b+c), then all bets are off.
>
> That's why -ffast-math doesn't do that; such a transformation would be
> massively brain-damaged.

Well, -ffast-math (or more specifically, -funsafe-math-optimizations) 
is about to do this, at least in some cases on tree-ssa, see

http://gcc.gnu.org/ml/gcc-patches/2004-03/msg01891.html

If you think that this transformation shouldn't be done, then I guess 
now is the time to speak up.

Brad

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15 14:18               ` Zdenek Dvorak
@ 2004-03-15 14:29                 ` Segher Boessenkool
  0 siblings, 0 replies; 85+ messages in thread
From: Segher Boessenkool @ 2004-03-15 14:29 UTC (permalink / raw)
  To: Zdenek Dvorak
  Cc: Scott Robert Ladd, gcc, Paolo Carlini, Joseph S. Myers, Andrew Pinski

> we do not do that currently.  Something similar is on my todo list --
> replacement of the final value of an eliminated induction variable,
> as in
>
> for (i = 0; i < 100; i++)
>   a = 10 * i;
> foo (a);
>
> to
>
> for (i = 0; i < 100; i++);
> a = 1000;
> foo (a);
>
> but I am not sure whether to do it for non-integer variables as
> well, due to reasons you mention below.

a = 990?  You don't need floats to do rounding errors ;-)


Segher

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15 14:13             ` Paolo Carlini
  2004-03-15 14:18               ` Zdenek Dvorak
@ 2004-03-15 14:28               ` Joseph S. Myers
  1 sibling, 0 replies; 85+ messages in thread
From: Joseph S. Myers @ 2004-03-15 14:28 UTC (permalink / raw)
  To: Paolo Carlini; +Cc: Zdenek Dvorak, Andrew Pinski, Scott Robert Ladd, gcc

On Mon, 15 Mar 2004, Paolo Carlini wrote:

> Now, I have another question: when -ffast-math is passed, should we even 
> collapse
> the loop to a single integer to be multiplied by the return value of doit?

I don't know whether this will occur in real code (and so whether it is
worth doing), but replacing repeated addition with multiplication seems
like the sort of thing -ffast-math can do.  (The default for the
FP_CONTRACT pragma is implementation-defined, but that would only allow
contraction within a single expression, e.g. x+x+x+x --> 4*x, not in loops
like this.)
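
Concretely, the collapse under discussion would look roughly like this
(a sketch only, not what any current pass emits; doit() stands in for
the visible, side-effect-free kernel from the benchmark elsewhere in
this thread):

  double doit(double x);

  double before(double a)
  {
    double r = 0.0;
    for (long i = 0; i < 100000000L; i++)
      r += doit(a);            /* loop-invariant call, repeated addition */
    return r;
  }

  double after(double a)       /* the -ffast-math style rewrite */
  {
    return doit(a) * 100000000.0;
  }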

-- 
Joseph S. Myers
jsm@polyomino.org.uk

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15 14:13             ` Paolo Carlini
@ 2004-03-15 14:18               ` Zdenek Dvorak
  2004-03-15 14:29                 ` Segher Boessenkool
  2004-03-15 14:28               ` Joseph S. Myers
  1 sibling, 1 reply; 85+ messages in thread
From: Zdenek Dvorak @ 2004-03-15 14:18 UTC (permalink / raw)
  To: Paolo Carlini; +Cc: Joseph S. Myers, Andrew Pinski, Scott Robert Ladd, gcc

Hello,

> >You need to allow for the possibility of doit trapping, but moving it
> >should still be OK in this case by virtue of C99 F.8.1#3: you don't need
> >to keep the same number of traps as implied by the source code.
> > 
> >
> Ah! Today I'm learning *too* much, thanks to everyone!
> 
> Now, I have another question: when -ffast-math is passed, should we even 
> collapse
> the loop to a single integer to be multiplied by the return value of doit?
> 
> Currently, on the gcc-lno branch we don't do that... or we don't *want* 
> to do that? ;)

we do not do that currently.  Something similar is on my todo list --
replacement of the final value of an eliminated induction variable,
as in

for (i = 0; i < 100; i++)
  a = 10 * i;
foo (a);

to

for (i = 0; i < 100; i++);
a = 1000;
foo (a);

but I am not sure whether to do it for non-integer variables as
well, due to reasons you mention below.

Zdenek

> Naively, it seems something really tricky to attempt because the sum 
> involving the loop
> index i may overflow in the process.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15 14:05           ` Joseph S. Myers
@ 2004-03-15 14:13             ` Paolo Carlini
  2004-03-15 14:18               ` Zdenek Dvorak
  2004-03-15 14:28               ` Joseph S. Myers
  0 siblings, 2 replies; 85+ messages in thread
From: Paolo Carlini @ 2004-03-15 14:13 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Zdenek Dvorak, Andrew Pinski, Scott Robert Ladd, gcc

Joseph S. Myers wrote:

>You need to allow for the possibility of doit trapping, but moving it
>should still be OK in this case by virtue of C99 F.8.1#3: you don't need
>to keep the same number of traps as implied by the source code.
>  
>
Ah! Today I'm learning *too* much, thanks to everyone!

Now, I have another question: when -ffast-math is passed, should we even 
collapse the loop to a single integer to be multiplied by the return 
value of doit?

Currently, on the gcc-lno branch we don't do that... or we don't *want* 
to do that? ;)

Naively, it seems something really tricky to attempt because the sum 
involving the loop index i may overflow in the process.

Paolo.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15 13:51         ` Zdenek Dvorak
  2004-03-15 13:55           ` Paolo Carlini
@ 2004-03-15 14:05           ` Joseph S. Myers
  2004-03-15 14:13             ` Paolo Carlini
  1 sibling, 1 reply; 85+ messages in thread
From: Joseph S. Myers @ 2004-03-15 14:05 UTC (permalink / raw)
  To: Zdenek Dvorak; +Cc: Paolo Carlini, Andrew Pinski, Scott Robert Ladd, gcc

On Mon, 15 Mar 2004, Zdenek Dvorak wrote:

> > Perhaps it's ok moving doit even when -ffast-math is not passed, I don't
> > know for sure, honestly...
> 
> it should be -- it is just an invariant motion (the value returned by
> doit obviously is always the same, since we invoke it with the same
> argument).

You need to allow for the possibility of doit trapping, but moving it
should still be OK in this case by virtue of C99 F.8.1#3: you don't need
to keep the same number of traps as implied by the source code.

-- 
Joseph S. Myers
jsm@polyomino.org.uk

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15 13:55           ` Paolo Carlini
@ 2004-03-15 14:00             ` Zdenek Dvorak
  0 siblings, 0 replies; 85+ messages in thread
From: Zdenek Dvorak @ 2004-03-15 14:00 UTC (permalink / raw)
  To: Paolo Carlini; +Cc: Andrew Pinski, Scott Robert Ladd, gcc

Hello,

> >>Perhaps it's ok moving doit even when -ffast-math is not passed, I don't
> >>know for sure, honestly...
> >>   
> >>
> >it should be -- it is just an invariant motion (the value returned by
> >doit obviously is always the same, since we invoke it with the same
> >argument).
> > 
> >
> Ah, ok! On the other hand, it wouldn't be ok collapsing in a second step 
> the whole
> loop to a constant, right?

right.

Zdenek

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15 13:51         ` Zdenek Dvorak
@ 2004-03-15 13:55           ` Paolo Carlini
  2004-03-15 14:00             ` Zdenek Dvorak
  2004-03-15 14:05           ` Joseph S. Myers
  1 sibling, 1 reply; 85+ messages in thread
From: Paolo Carlini @ 2004-03-15 13:55 UTC (permalink / raw)
  To: Zdenek Dvorak; +Cc: Andrew Pinski, Scott Robert Ladd, gcc

Zdenek Dvorak wrote:

>>Perhaps it's ok moving doit even when -ffast-math is not passed, I don't
>>know for sure, honestly...
>>    
>>
>it should be -- it is just an invariant motion (the value returned by
>doit obviously is always the same, since we invoke it with the same
>argument).
>  
>
Ah, ok! On the other hand, it wouldn't be ok collapsing in a second step 
the whole loop to a constant, right?

Paolo.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15 13:42       ` Paolo Carlini
@ 2004-03-15 13:51         ` Zdenek Dvorak
  2004-03-15 13:55           ` Paolo Carlini
  2004-03-15 14:05           ` Joseph S. Myers
  0 siblings, 2 replies; 85+ messages in thread
From: Zdenek Dvorak @ 2004-03-15 13:51 UTC (permalink / raw)
  To: Paolo Carlini; +Cc: Andrew Pinski, Scott Robert Ladd, gcc

Hello,

> >doing what? I do not see any loop related optimization here.
> >
> Hi Zdenek. What is doing "the trick" (sorry for my informal words) on the
> gcc-lno branch is -ftree-loop-optimize, *not* -funroll-loops.
> 
> When -ftree-loop-optimize is passed, the trigonometric computation (doit)
> is moved outside of the loop, this is the complete result:

[snip]

> Perhaps it's ok moving doit even when -ffast-math is not passed, I don't
> know for sure, honestly...

it should be -- it is just an invariant motion (the value returned by
doit obviously is always the same, since we invoke it with the same
argument).

Zdenek

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15 13:29     ` Zdenek Dvorak
@ 2004-03-15 13:42       ` Paolo Carlini
  2004-03-15 13:51         ` Zdenek Dvorak
  0 siblings, 1 reply; 85+ messages in thread
From: Paolo Carlini @ 2004-03-15 13:42 UTC (permalink / raw)
  To: Zdenek Dvorak; +Cc: Andrew Pinski, Scott Robert Ladd, gcc

Zdenek Dvorak wrote:

>doing what? I do not see any loop related optimization here.
>
Hi Zdenek. What is doing "the trick" (sorry for my informal words) on the
gcc-lno branch is -ftree-loop-optimize, *not* -funroll-loops.

When -ftree-loop-optimize is passed, the trigonometric computation (doit)
is moved outside of the loop, this is the complete result:

08048450 <main>:
 8048450:       55                      push   %ebp
 8048451:       d9 e8                   fld1
 8048453:       89 e5                   mov    %esp,%ebp
 8048455:       83 ec 18                sub    $0x18,%esp
 8048458:       83 e4 f0                and    $0xfffffff0,%esp
 804845b:       83 ec 10                sub    $0x10,%esp
 804845e:       dd 1c 24                fstpl  (%esp)
 8048461:       e8 aa ff ff ff          call   8048410 <doit>
 8048466:       d9 ee                   fldz
 8048468:       31 c0                   xor    %eax,%eax
 804846a:       8d b6 00 00 00 00       lea    0x0(%esi),%esi

 8048470:       40                      inc    %eax
 8048471:       d8 c1                   fadd   %st(1),%st
 8048473:       3d 00 e1 f5 05          cmp    $0x5f5e100,%eax
 8048478:       75 f6                   jne    8048470 <main+0x20>

 804847a:       dd d9                   fstp   %st(1)
 804847c:       dd 5c 24 04             fstpl  0x4(%esp)
 8048480:       c7 04 24 98 85 04 08    movl   $0x8048598,(%esp)
 8048487:       e8 a4 fe ff ff          call   8048330 <_init+0x48>
 804848c:       c9                      leave
 804848d:       31 c0                   xor    %eax,%eax
 804848f:       c3                      ret

Whereas, without -ftree-loop-optimize, we have:

08048450 <main>:
 8048450:       55                      push   %ebp
 8048451:       d9 ee                   fldz
 8048453:       89 e5                   mov    %esp,%ebp
 8048455:       53                      push   %ebx
 8048456:       83 ec 24                sub    $0x24,%esp
 8048459:       bb ff e0 f5 05          mov    $0x5f5e0ff,%ebx
 804845e:       dd 5d f0                fstpl  0xfffffff0(%ebp)
 8048461:       83 e4 f0                and    $0xfffffff0,%esp
 8048464:       83 ec 10                sub    $0x10,%esp
 8048467:       eb 09                   jmp    8048472 <main+0x22>
 8048469:       8d b4 26 00 00 00 00    lea    0x0(%esi),%esi

 8048470:       dd d8                   fstp   %st(0)
 8048472:       c7 04 24 00 00 00 00    movl   $0x0,(%esp)
 8048479:       b8 00 00 f0 3f          mov    $0x3ff00000,%eax
 804847e:       89 44 24 04             mov    %eax,0x4(%esp)
 8048482:       e8 89 ff ff ff          call   8048410 <doit>
 8048487:       dc 45 f0                faddl  0xfffffff0(%ebp)
 804848a:       4b                      dec    %ebx
 804848b:       dd 55 f0                fstl   0xfffffff0(%ebp)
 804848e:       79 e0                   jns    8048470 <main+0x20>

 8048490:       dd 5c 24 04             fstpl  0x4(%esp)
 8048494:       c7 04 24 b8 85 04 08    movl   $0x80485b8,(%esp)
 804849b:       e8 90 fe ff ff          call   8048330 <_init+0x48>
 80484a0:       8b 5d fc                mov    0xfffffffc(%ebp),%ebx
 80484a3:       31 c0                   xor    %eax,%eax
 80484a5:       c9                      leave
 80484a6:       c3                      ret

Perhaps it's ok moving doit even when -ffast-math is not passed, I don't
know for sure, honestly...

Thanks for your feedback,
Paolo.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15 11:25   ` Paolo Carlini
  2004-03-15 11:31     ` Paolo Carlini
@ 2004-03-15 13:29     ` Zdenek Dvorak
  2004-03-15 13:42       ` Paolo Carlini
  1 sibling, 1 reply; 85+ messages in thread
From: Zdenek Dvorak @ 2004-03-15 13:29 UTC (permalink / raw)
  To: Paolo Carlini; +Cc: Andrew Pinski, Scott Robert Ladd, gcc

Hello,

> >The reason why still is that ICC will just unroll the loop to be "r = 
> >doit(a)*100000000.0" so that is the reason why
> >ICC is better than GCC at doing this stupid trig test (note this 
> >transformation
> >is only valid if fast-math is on as you lose precision).
> 
> In the meantime we have learned that the real reason why Icc performs
> better than mainline gcc is the use of an iterative SSE instruction
> (see Dan Nicolaescu's message).
> 
> On the other hand, gcc-lno "appears" to perform as well as Icc because of
> the unrolling trick (just checked that gcc-lno transforms the loop to
> the trivial:
>
>   8048470:       40                      inc    %eax
>   8048471:       d8 c1                   fadd   %st(1),%st
>   8048473:       3d 00 e1 f5 05          cmp    $0x5f5e100,%eax
>   8048478:       75 f6                   jne    8048470 <main+0x20>)
> 
> Now, my question is: why is gcc-lno doing that even when -ffast-math is
> *not* passed???

doing what? I do not see any loop related optimization here.

Zdenek

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15 11:25   ` Paolo Carlini
@ 2004-03-15 11:31     ` Paolo Carlini
  2004-03-15 13:29     ` Zdenek Dvorak
  1 sibling, 0 replies; 85+ messages in thread
From: Paolo Carlini @ 2004-03-15 11:31 UTC (permalink / raw)
  To: Paolo Carlini; +Cc: Andrew Pinski, Scott Robert Ladd, gcc, Zdenek Dvorak

Paolo Carlini wrote:

> In the meantime we have learned that the real reason why Icc performs
> better than mainline gcc is the use of an iterative SSE instruction
> (see Dan Nicolaescu's message).

+ the interesting <math.h> issue, of course, but the real point of my 
message was the other one.

Paolo.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15  0:12 ` GCC beaten by ICC in stupid " Andrew Pinski
                     ` (2 preceding siblings ...)
  2004-03-15  2:36   ` Scott Robert Ladd
@ 2004-03-15 11:25   ` Paolo Carlini
  2004-03-15 11:31     ` Paolo Carlini
  2004-03-15 13:29     ` Zdenek Dvorak
  3 siblings, 2 replies; 85+ messages in thread
From: Paolo Carlini @ 2004-03-15 11:25 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: Scott Robert Ladd, gcc, Zdenek Dvorak

Andrew Pinski wrote:

> The reason ICC still wins is that it will just unroll the loop to be
> "r = doit(a)*100000000.0"; that is why ICC is better than GCC at this
> stupid trig test (note this transformation is only valid if fast-math
> is on, as you lose precision).

In the meantime we have learned that the real reason why Icc performs
better than mainline gcc is the use of an iterative SSE instruction
(see Dan Nicolaescu's message).

On the other hand, gcc-lno "appears" to perform as well as Icc because
of the unrolling trick (I just checked that gcc-lno transforms the loop
to the trivial:

   8048470:       40                      inc    %eax
   8048471:       d8 c1                   fadd   %st(1),%st
   8048473:       3d 00 e1 f5 05          cmp    $0x5f5e100,%eax
   8048478:       75 f6                   jne    8048470 <main+0x20>)
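
At the source level, that inner loop corresponds roughly to the sketch
below (inferred from the four instructions above, not actual compiler
output; whether the value being added comes from a single hoisted call
to doit(a) or from further folding is not visible here, and the helper
names are made up):

     #include <math.h>

     /* doit() as in the original test case. */
     double doit(double a)
     {
         double s = sin(a);
         double c = cos(a);
         return s * s + c * c;
     }

     /* Sketch: the loop-invariant value is computed once, and only the
        accumulation (the fadd per iteration above) remains in the loop. */
     double accumulate(double a)
     {
         double t = doit(a);              /* computed once, before the loop */
         double r = 0.0;
         for (int i = 0; i < 100000000; ++i)
             r += t;
         return r;
     }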

Now, my question is: why is gcc-lno doing that even when -ffast-math is
*not* passed???

Paolo.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15  0:12 ` GCC beaten by ICC in stupid " Andrew Pinski
  2004-03-15  0:32   ` Paolo Carlini
  2004-03-15  1:31   ` Scott Robert Ladd
@ 2004-03-15  2:36   ` Scott Robert Ladd
  2004-03-15 11:25   ` Paolo Carlini
  3 siblings, 0 replies; 85+ messages in thread
From: Scott Robert Ladd @ 2004-03-15  2:36 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc

Andrew Pinski wrote:
> Here is a much better benchmark to try. Notice that we are doing more
> work now, but the point is that ICC is not going to be able to do the
> transformation: it is not going to see that doit is constant, so it
> will not pull it out of the loop and it cannot unroll the loop into
> just being a constant.

Intel does *not* unroll the original code into a constant. To wit, the
code generated on my system:

         .globl main
main:
..B1.1:                         # Preds ..B1.0
         pushl     %ebx                                          #14.1
         movl      %esp, %ebx                                    #14.1
         andl      $-64, %esp                                    #14.1
         pushl     %edi                                          #14.1
         subl      $60, %esp                                     #14.1
         call      __intel_proc_init_N                           #14.1
         pushl     %eax                                          #14.1
         pushl     %eax                                          #14.1
         stmxcsr   (%esp)                                        #14.1
         popl      %eax                                          #14.1
         orl       $32768, %eax                                  #14.1
         pushl     %eax                                          #14.1
         ldmxcsr   (%esp)                                        #14.1
         popl      %eax                                          #14.1
         popl      %eax                                          #14.1
         movapd    _2il0floatpacket.1, %xmm0                     #18.14
         xorl      %edi, %edi                                    #17.5
         pxor      %xmm1, %xmm1                                  #
         movapd    %xmm1, 16(%esp)                               #
         call      vmldSin2                                      #18.14
                                 # LOE ebp esi edi xmm0
..B1.7:                         # Preds ..B1.1
         movapd    %xmm0, 32(%esp)                               #18.14
         movapd    _2il0floatpacket.1, %xmm0                     #18.14
         call      vmldCos2                                      #18.14
                                 # LOE ebp esi edi xmm0
..B1.8:                         # Preds ..B1.7
         mulpd     %xmm0, %xmm0                                  #18.14
         movapd    32(%esp), %xmm1                               #18.14
         mulpd     %xmm1, %xmm1                                  #18.14
         movapd    16(%esp), %xmm2                               #18.14
         movapd    %xmm1, 32(%esp)                               #18.14
         movapd    32(%esp), %xmm1                               #18.14
         .align    4,0x90
                                 # LOE ebp esi edi xmm0 xmm1 xmm2
..B1.2:                         # Preds ..B1.8 ..B1.2
         addpd     %xmm1, %xmm2                                  #18.9
         addpd     %xmm0, %xmm2                                  #18.14
         addl      $2, %edi                                      #17.5
         cmpl      $100000000, %edi                              #17.5
         jb        ..B1.2        # Prob 100%                     #17.5
                                 # LOE ebp esi edi xmm0 xmm1 xmm2
..B1.3:                         # Preds ..B1.2
         movapd    %xmm2, 16(%esp)                               #
         movapd    16(%esp), %xmm1                               #17.5
         movapd    %xmm1, %xmm0                                  #17.5
         unpckhpd  %xmm0, %xmm0                                  #17.5
         addsd     %xmm0, %xmm1                                  #
         movl      $__STRING.0, (%esp)                           #20.12
         movsd     %xmm1, 4(%esp)                                #20.23
         call      printf                                        #20.5
                                 # LOE ebp esi
..B1.4:                         # Preds ..B1.3
         xorl      %eax, %eax                                    #21.12
         addl      $60, %esp                                     #21.12
         popl      %edi                                          #21.12
         movl      %ebx, %esp                                    #21.12
         popl      %ebx                                          #21.12
         ret                                                     #21.12


I made some minor mention of this in the original post, but it was 
likely too vague for you.

-- 
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15  0:12 ` GCC beaten by ICC in stupid " Andrew Pinski
  2004-03-15  0:32   ` Paolo Carlini
@ 2004-03-15  1:31   ` Scott Robert Ladd
  2004-03-15  2:36   ` Scott Robert Ladd
  2004-03-15 11:25   ` Paolo Carlini
  3 siblings, 0 replies; 85+ messages in thread
From: Scott Robert Ladd @ 2004-03-15  1:31 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc

Andrew Pinski wrote:
> 
> On Mar 14, 2004, at 15:39, Scott Robert Ladd wrote:
> 
>> Hello,
>>
>> Consider the following program, compiled and run on a Pentium 4 
>> (Northwood) system:
>>
>>     #include <math.h>
>>     #include <stdio.h>
>>
>>     double doit(double a)
>>     {
>>         double s = sin(a);
>>         double c = cos(a);
>>
>>         // should always be 1
>>         return s * s + c * c;
>>     }
>>
>>     int main(void)
>>     {
>>         double a = 1.0, r = 0.0;
>>
>>         for (int i = 0; i < 100000000; ++i)
>>             r += doit(a);
>>
>>         printf("r = %f\n",r);
>>         return 0;
>>     }
>>
> 
> The point here is that if you know it is 1.0, then just return 1.0
> instead of trying to play tricks with trig functions.

Don't be so damned insulting.

This is a simple example program meant to demonstrate a problem in GCC 
code generation. You know -- providing a simple piece of code that 
focuses on the problem, rather than presenting a thousand-line program.

The reason doit returns 1 is so that the optimizers don't eliminate the
code entirely. If doit() merely computed sin and cos and returned
nothing, it would be compiled to nothing. Thus the return value forces
code generation, and using the sine/cosine relationship proves that the
function correctly computed the two values.
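
A minimal sketch of that point, with hypothetical names (not from the
original post): if the results are never used, the calls can become dead
code and the loop then measures nothing.

     #include <math.h>

     /* If nothing uses the results, a compiler that knows sin() and
        cos() have no side effects it must preserve is free to delete
        the calls, and the benchmark loop then times an empty function. */
     double doit_discarded(double a)
     {
         sin(a);              /* result discarded */
         cos(a);              /* result discarded */
         return 0.0;
     }

     /* Returning a value derived from both calls keeps them live, and
        the sin/cos identity makes the expected result easy to check. */
     double doit_live(double a)
     {
         double s = sin(a);
         double c = cos(a);
         return s * s + c * c;
     }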

-- 
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-15  0:12 ` GCC beaten by ICC in stupid " Andrew Pinski
@ 2004-03-15  0:32   ` Paolo Carlini
  2004-03-15  1:31   ` Scott Robert Ladd
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 85+ messages in thread
From: Paolo Carlini @ 2004-03-15  0:32 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: Scott Robert Ladd, gcc

Andrew Pinski wrote:

> Here is a much better benchmark to try. Notice that we are doing more
> work now, but the point is that ICC is not going to be able to do the
> transformation: it is not going to see that doit is constant, so it
> will not pull it out of the loop and it cannot unroll the loop into
> just being a constant.

Ah! Thanks, Andrew, for the explanation. Now some recent developments
are much clearer to me!

Indeed, your modified testcase runs slower everywhere... ;)

Paolo.

^ permalink raw reply	[flat|nested] 85+ messages in thread

* Re: GCC beaten by ICC in stupid trig test!
  2004-03-14 23:40 GCC viciously beaten by ICC in " Scott Robert Ladd
@ 2004-03-15  0:12 ` Andrew Pinski
  2004-03-15  0:32   ` Paolo Carlini
                     ` (3 more replies)
  0 siblings, 4 replies; 85+ messages in thread
From: Andrew Pinski @ 2004-03-15  0:12 UTC (permalink / raw)
  To: Scott Robert Ladd; +Cc: gcc, Andrew Pinski


On Mar 14, 2004, at 15:39, Scott Robert Ladd wrote:

> Hello,
>
> Consider the following program, compiled and run on a Pentium 4 
> (Northwood) system:
>
>     #include <math.h>
>     #include <stdio.h>
>
>     double doit(double a)
>     {
>         double s = sin(a);
>         double c = cos(a);
>
>         // should always be 1
>         return s * s + c * c;
>     }
>
>     int main(void)
>     {
>         double a = 1.0, r = 0.0;
>
>         for (int i = 0; i < 100000000; ++i)
>             r += doit(a);
>
>         printf("r = %f\n",r);
>         return 0;
>     }
>

The point here is that if you know it is 1.0, then just return 1.0
instead of trying to play tricks with trig functions.  Yes, GCC should
do better for trig functions, but in most cases the developer was just
doing something dumb like the above example, which by the way is not a
good benchmark anyway, because you know that the trig functions can be
reduced to just a load of a constant (ICC does this transformation
while GCC does not, but could).
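
Concretely, the fold being described would leave doit() as nothing more
than a constant return, roughly (a sketch of the intended
transformation, not ICC's actual output):

     /* What folding sin(a)*sin(a) + cos(a)*cos(a) to 1 would leave
        behind; permissible only under fast-math-style rules, since the
        computed value need not be exactly 1.0 in floating point. */
     double doit_folded(double a)
     {
         (void)a;          /* argument no longer needed */
         return 1.0;
     }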

Actually what is happening here is that the function doit is being
inlined, and the math in the inner loop is not being pulled out of the
loop even though it is constant, just like a is.  So doing the following
(which forces GCC not to inline) will at least get GCC into roughly the
same ball park as ICC (but still nowhere near it).  The reason ICC still
wins is that it will just unroll the loop to be "r = doit(a)*100000000.0";
that is why ICC is better than GCC at this stupid trig test (note this
transformation is only valid if fast-math is on, as you lose precision).
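
In source terms, that transformation is roughly the following (a sketch
only, not ICC's actual output; doit is the function from the quoted test
program):

     #include <math.h>

     double doit(double a)
     {
         double s = sin(a);
         double c = cos(a);
         return s * s + c * c;
     }

     /* Original: 100000000 separately rounded additions. */
     double summed(double a)
     {
         double r = 0.0;
         for (int i = 0; i < 100000000; ++i)
             r += doit(a);
         return r;
     }

     /* After the transformation described above: one call and one
        multiplication.  Only valid under fast-math rules, because the
        chain of additions and the single multiplication round
        differently. */
     double collapsed(double a)
     {
         return doit(a) * 100000000.0;
     }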

Here is a much better benchmark to try. Notice that we are doing more
work now, but the point is that ICC is not going to be able to do the
transformation: it is not going to see that doit is constant, so it will
not pull it out of the loop and it cannot unroll the loop into just
being a constant.


     #include <math.h>
     #include <stdio.h>

     double doit(double a)
     {
         double s = sin(a);
         double c = cos(a);

         return s * c;
     }

     int main(void)
     {
         double a = 1.0, r = 0.0;

         for (int i = 0; i < 100000000; ++i, a++)
             r += doit(a);

         printf("r = %f\n",r);
         return 0;
     }

^ permalink raw reply	[flat|nested] 85+ messages in thread

Thread overview: 85+ messages
2004-03-15 16:06 GCC beaten by ICC in stupid trig test! Robert Dewar
2004-03-16  8:45 ` Per Abrahamsen
2004-03-17  0:09   ` Robert Dewar
2004-03-17  0:36     ` Scott Robert Ladd
2004-03-17  5:53     ` Gabriel Dos Reis
2004-03-17  7:21       ` Robert Dewar
2004-03-17  9:10         ` Gabriel Dos Reis
2004-03-21 16:55           ` Robert Dewar
2004-03-23 19:38       ` Joe Buck
2004-03-23 19:58         ` Gabriel Dos Reis
2004-03-23 20:49           ` Laurent GUERBY
2004-03-24  8:17             ` Toon Moene
2004-03-24 13:50               ` Robert Dewar
2004-03-24 18:25                 ` Paul Koning
2004-03-24 18:51                   ` Robert Dewar
2004-03-25 18:18                     ` Per Abrahamsen
2004-03-27  1:26                       ` Robert Dewar
2004-03-24 18:56                 ` Joe Buck
2004-03-24 19:10                   ` Robert Dewar
2004-03-24 19:14                     ` Richard Guenther
2004-03-24 19:39                       ` Paul Brook
2004-03-24 19:45                         ` Dave Korn
2004-03-24 20:57                           ` Paul Brook
2004-03-25  6:14                           ` Robert Dewar
2004-03-25 18:32                             ` Scott Robert Ladd
2004-03-27  1:28                               ` Robert Dewar
2004-03-17 14:51     ` Per Abrahamsen
2004-03-17 15:18       ` Gabriel Dos Reis
2004-03-17 16:05         ` Per Abrahamsen
2004-03-16 12:14 ` Scott Robert Ladd
2004-03-17  0:19   ` Robert Dewar
  -- strict thread matches above, loose matches on Subject: below --
2004-03-25 16:51 Wolfgang Bangerth
2004-03-24 19:04 Bradley Lucier
2004-03-24 19:29 ` Joe Buck
2004-03-24 19:43   ` Bradley Lucier
2004-03-24 19:50     ` Joe Buck
2004-03-24 23:48       ` Toon Moene
2004-03-25  0:02         ` Toon Moene
2004-03-25  0:11           ` Roger Sayle
2004-03-25  5:56             ` Scott Robert Ladd
2004-03-25  6:07             ` Bradley Lucier
2004-03-25  8:18             ` Robert Dewar
2004-03-25 16:15               ` Roger Sayle
2004-03-25 16:36                 ` David Edelsohn
2004-03-25 17:09                   ` Scott Robert Ladd
2004-03-25 17:47                     ` David Edelsohn
2004-03-25 18:03                       ` Scott Robert Ladd
2004-03-26  1:29                 ` Toon Moene
2004-03-27  0:23                 ` Daniel Egger
2004-03-27  0:50                 ` Laurent GUERBY
2004-03-27  0:55                   ` Joe Buck
2004-03-27  0:58                     ` Laurent GUERBY
2004-03-27  1:16                       ` Joe Buck
2004-03-25  5:36         ` Gabriel Dos Reis
2004-03-25  8:46           ` Robert Dewar
2004-03-25  5:34       ` Gabriel Dos Reis
2004-03-25  7:24       ` Robert Dewar
2004-03-25  8:28         ` Gabriel Dos Reis
2004-03-25 18:19       ` Per Abrahamsen
2004-03-24 20:14   ` Paul Koning
2004-03-24 21:00     ` Joe Buck
2004-03-24 22:20       ` Dale Johannesen
2004-03-24 21:07     ` Joseph S. Myers
2004-03-24 21:38       ` Joe Buck
2004-03-24 21:44       ` Joe Buck
2004-03-24 22:49         ` Joseph S. Myers
2004-03-24 22:19       ` Richard Guenther
2004-03-24 22:21         ` Dale Johannesen
2004-03-25  7:29     ` Robert Dewar
2004-03-14 23:40 GCC viciously beaten by ICC in " Scott Robert Ladd
2004-03-15  0:12 ` GCC beaten by ICC in stupid " Andrew Pinski
2004-03-15  0:32   ` Paolo Carlini
2004-03-15  1:31   ` Scott Robert Ladd
2004-03-15  2:36   ` Scott Robert Ladd
2004-03-15 11:25   ` Paolo Carlini
2004-03-15 11:31     ` Paolo Carlini
2004-03-15 13:29     ` Zdenek Dvorak
2004-03-15 13:42       ` Paolo Carlini
2004-03-15 13:51         ` Zdenek Dvorak
2004-03-15 13:55           ` Paolo Carlini
2004-03-15 14:00             ` Zdenek Dvorak
2004-03-15 14:05           ` Joseph S. Myers
2004-03-15 14:13             ` Paolo Carlini
2004-03-15 14:18               ` Zdenek Dvorak
2004-03-15 14:29                 ` Segher Boessenkool
2004-03-15 14:28               ` Joseph S. Myers
