* Re: Third Draft "Unsafe fp optimizations" project description.
@ 2001-08-09 12:46 dewar
  2001-08-09 12:59 ` Brad Lucier
From: dewar @ 2001-08-09 12:46 UTC (permalink / raw)
  To: lucier, toon; +Cc: gcc

<<Finally, I suggest that the document say explicitly that gcc
reserves the right to apply any transformation that does not change
the value of an expression or the settings of the exception flags
under IEEE 754 arithmetic.
>>

I would state this differently as a note. The point here is that the
notion of transformation is confusing and misleading. The code generator
can always generate any sequence of instructions that has the same effect.
For example, it is just fine to replace a floating-point object with an
integer object if the semantics of the program is unchanged. 

Something like:

Note: this document is discussing transformations that potentially affect
the behavior of the program from a formal semantic point of view. The
code generator may always make transformations that have no formal semantic
effect. For example, if we write:

   x := x / 2.0;

then the code generator might generate a division instruction that divides
by two, or a halve instruction if one is present, or a multiplication by
0.5. All of these produce exactly the same result in x, and so the code
generator is free to pick whichever sequence it deems most efficient.
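
(Purely as an illustration, not part of the draft or of gcc: a small
standalone C check of that equivalence, with arbitrary sample values.
Under IEEE 754, dividing by 2.0 and multiplying by 0.5 round the same
exact quotient, so the two forms agree for every representable x,
denormals included.)

   #include <stdio.h>
   #include <float.h>

   int main (void)
   {
     /* Illustration: x / 2.0 and x * 0.5 round the same exact value,
        so the two forms agree for every sample, denormals included.  */
     double samples[] = { 3.14159, 1e-300, -DBL_MIN / 4.0, DBL_MAX, 0.0 };
     unsigned i;

     for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
       {
         double q = samples[i] / 2.0;
         double p = samples[i] * 0.5;
         printf ("%.17g: %s\n", samples[i], q == p ? "same" : "DIFFER");
       }
     return 0;
   }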


* Re: Third Draft "Unsafe fp optimizations" project description.
  2001-08-09 12:46 Third Draft "Unsafe fp optimizations" project description dewar
@ 2001-08-09 12:59 ` Brad Lucier
From: Brad Lucier @ 2001-08-09 12:59 UTC (permalink / raw)
  To: dewar; +Cc: lucier, toon, gcc

> 
> <<Finally, I suggest that the document say explicitly that gcc
> reserves the right to apply any transformation that does not change
> the value of an expression or the settings of the exception flags
> under IEEE 754 arithmetic.
> >>

My point is that gcc should feel free to apply transformations that 
do not change results in IEEE arithmetic, even if they
might change the results in some other arithmetic (old Cray
arithmetic, say, where x / 2.0 may not have been equivalent to x * 0.5).

Brad


* Re: Third Draft "Unsafe fp optimizations" project description.
  2001-08-09 14:29 ` Stephen L Moshier
@ 2001-08-10 12:25   ` Joe Buck
From: Joe Buck @ 2001-08-10 12:25 UTC (permalink / raw)
  To: moshier; +Cc: gcc

> We do not in general try to be bug compatible with all target
> arithmetics, but I'm sure no one would mind if you wanted to put
> in the effort to implement that.

Actually, I would mind.  Our biggest problem with gcc 3.x is bloat.
It keeps getting bigger and slower and produces worse code.  Adding
more code to gcc can produce benefits, but it also costs.  Too many
cute ideas have been added whose costs exceed their benefits and that
make the compiler less maintainable.

Now, if someone has ideas for a cleaner architecture that improves
the compiler overall and, as a side benefit, makes it easy to be
bug compatible with obscure and non-standard floating point arithmetic for
rare hosts, that's another matter.

* Re: Third Draft "Unsafe fp optimizations" project description.
       [not found] <Pine.LNX.4.20.0108091704430.27212-100000@moshier.ne.mediaone.net>
@ 2001-08-09 14:29 ` Stephen L Moshier
  2001-08-10 12:25   ` Joe Buck
From: Stephen L Moshier @ 2001-08-09 14:29 UTC (permalink / raw)
  To: gcc

> <<My point is that gcc should feel free to apply transformations that
> do not change results in IEEE arithmetic, even if they
> might change the results in some other arithmetic (old Cray
> arithmetic, say, where x / 2.0 may not have been equivalent to x *
> 0.5).
> >>
>
> I am not so sure that's acceptable, especially when we have quite a
> few pseudo-IEEE machines around.

Currently this particular transformation 1.0/2.0 -> 0.5 is
checked for exactness in gcc's idea of the target's arithmetic and
it is further filtered by a machine-dependent function check_float_value.

We do not in general try to be bug compatible with all target
arithmetics, but I'm sure no one would mind if you wanted to put
in the effort to implement that.
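
(Added for illustration only; this is not gcc's check_float_value or its
real.c arithmetic, and the function name below is made up. It is just a
sketch of the kind of exactness condition being described: rewriting
x / c as x * (1.0 / c) cannot change any result when c is a power of two
whose reciprocal stays in the normal range.)

   #include <float.h>
   #include <math.h>

   /* Sketch of an exactness test (illustrative, not gcc's code):
      if |c| is a power of two and 1.0 / c neither underflows nor
      overflows, then 1.0 / c is exact and x / c == x * (1.0 / c)
      for every x, so the rewrite cannot change any result.  */
   int
   division_by_constant_is_exact (double c)
   {
     int exp;
     double frac = frexp (fabs (c), &exp);   /* |c| = frac * 2^exp */
     double r = 1.0 / c;

     return frac == 0.5                      /* |c| is a power of two */
            && fabs (r) >= DBL_MIN           /* reciprocal did not underflow */
            && fabs (r) <= DBL_MAX;          /* ...and did not overflow */
   }

For c = 2.0 this returns 1; for c = 3.0, or for a c so large that 1/c is
denormal, it returns 0.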


* Re: Third Draft "Unsafe fp optimizations" project description.
@ 2001-08-09 13:13 dewar
From: dewar @ 2001-08-09 13:13 UTC (permalink / raw)
  To: dewar, lucier; +Cc: gcc, toon

<<My point is that gcc should feel free to apply transformations that
do not change results in IEEE arithmetic, even if they
might change the results in some other arithmetic (old Cray
arithmetic, say, where x / 2.0 may not have been equivalent to x * 0.5).
>>

I am not so sure that's acceptable, especially when we have quite a few
pseudo-IEEE machines around.

For sure, if we are talking about transformations that change the results
on a given port, they come under the auspices of this document as to what
are and are not acceptable transformations.


* Re: Third Draft "Unsafe fp optimizations" project description.
@ 2001-08-09 13:07 Stephen L Moshier
From: Stephen L Moshier @ 2001-08-09 13:07 UTC (permalink / raw)
  To: Toon Moene; +Cc: gcc

> A*B + A*C -> A*(B+C)

Currently you don't seem to have this available under fast math.
If you want it, here is where the transformation was turned off.

   * combine.c (apply_distributive_law):  Enable if fast-math.

*** combine.c	2001/08/06 12:46:27	1.1
--- combine.c	2001/08/09 19:57:10
*************** apply_distributive_law (x)
*** 7773,7779 ****
    /* Distributivity is not true for floating point.
       It can change the value.  So don't do it.
       -- rms and moshier@world.std.com.  */
!   if (FLOAT_MODE_P (GET_MODE (x)))
      return x;
  
    /* The outer operation can only be one of the following:  */
--- 7773,7779 ----
    /* Distributivity is not true for floating point.
       It can change the value.  So don't do it.
       -- rms and moshier@world.std.com.  */
!   if (FLOAT_MODE_P (GET_MODE (x)) && ! flag_fast_math)
      return x;
  
    /* The outer operation can only be one of the following:  */

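(Illustration added here, not part of the patch: a standalone example in
which A*B + A*C and A*(B+C) round differently in double precision. The
values are arbitrary; the volatile qualifiers only force each intermediate
result to be rounded to double, so that extended-precision registers don't
mask the difference.)

   #include <stdio.h>
   #include <float.h>

   int main (void)
   {
     volatile double a = 1.5;
     volatile double b = 1.0;
     volatile double c = DBL_EPSILON / 2.0;        /* 2^-53 */

     volatile double distributed = a * b + a * c;  /* rounds to 1.5 + 2^-52 */
     volatile double sum = b + c;                  /* ties-to-even: rounds to 1.0 */
     volatile double factored = a * sum;           /* exactly 1.5 */

     /* Prints 1.5000000000000002 and 1.5 respectively.  */
     printf ("A*B + A*C = %.17g\n", (double) distributed);
     printf ("A*(B+C)   = %.17g\n", (double) factored);
     return 0;
   }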


* Re: Third Draft "Unsafe fp optimizations" project description.
@ 2001-08-09 12:21 Brad Lucier
From: Brad Lucier @ 2001-08-09 12:21 UTC (permalink / raw)
  To: toon; +Cc: Brad Lucier, gcc

Just a few small suggestions:

   * Those who need full precision in order to guarantee the results.

"who" -> "that" and "the results" -> "acceptable results"

   * Those which are less sensible to occasional loss of accuracy.

"which" -> "that" and "sensible" -> "sensitive"

   * All temporaries generated for a single expression [should] always [be]

"[should] always [be]" -> "[should be]"

  2. Rearrangements whose only effect is a loss of accuracy.

     Rationale: Users might be able to bound the effect of this
     rearrangement.

     Example: A*A*...*A -> different order of evaluation (compare a*a*a*a
     with t=a*a; t=t*t). Savings: Potentially many multiplies, at the cost
     of some temporaries.

Although I cannot come up with an example where "((A*A)*A)*A" overflows
and "t=A*A; t*t" does not (or vice versa), I doubt that there are *any*
transformations that can change the value of an expression that do not,
for some inputs, cause the result to overflow or underflow to zero when
the original result did not (and vice versa).

Finally, I suggest that the document say explicitly that gcc
reserves the right to apply any transformation that does not change
the value of an expression or the settings of the exception flags
under IEEE 754 arithmetic.  

Brad Lucier


* Third Draft "Unsafe fp optimizations" project description.
@ 2001-08-08 13:55 Toon Moene
From: Toon Moene @ 2001-08-08 13:55 UTC (permalink / raw)
  To: gcc

I hope I dealt with all remarks on the Second Draft correctly in the
Third Draft, attached below.

If this draft doesn't provoke requests for major reorganisation, I
propose to have the Fourth Draft be the initial content of the Web Page.

After all, it's supposed to be an "Open Project", so we're allowed to
improve upon it while it is sitting there ...

Thanks for all input.

-- 
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
Join GNU Fortran 95: http://g95.sourceforge.net/ (under construction)

Transformations that change the meaning of floating point expressions

Introduction

The extent to which the compiler may rearrange floating point expressions
when optimizing is a recurring theme on GCC's mailing lists. On this page we
try to give some structure to this
discussion. It is understood that all of the rearrangements described here
are only performed with the express permission of the user (i.e., an
explicitly specified command line option).

Rationale

Why would someone forgo numerical accuracy for speed? Isn't a fast but
wrong answer useless?

In numerical problems, there are roughly two kinds of computations:

   * Those who need full precision in order to guarantee the results.
   * Those which are less sensible to occasional loss of accuracy.

Computational problems of the second kind have generally been beaten on,
simplified and approximated in a heroic attempt to fit them into the
limitations of present-day computers. The loss of accuracy due to these
approximations can easily overwhelm the loss that results from changing the
floating point arithmetic slightly. The most obvious example of this is the
first-person shooter game: while the physics of reflection, refraction and
scattering of electromagnetic radiation with wavelengths between 400 and 800
nm has been heavily approximated, what would make the game absolutely
useless is an image update rate that drops below 20 frames per second.
Rearranging the floating point arithmetic, with an associated loss of a few
units in the last place (ULPs), can compare favourably to further
approximation of the physics involved.

Caveat: The loss of accuracy will not be the only effect - see below.

Aim of the project

The project will provide the GCC community with a classification of
rearrangements of floating point expressions. Based on the classification,
recommendations will be made on how to let users instruct the compiler to
perform rearrangements from a particular class. The
classification will be based on the following criteria (courtesy of Robert
Dewar):

   * The transformation is well-understood.
   * It is definitely an optimization.
   * All of its numerical effects are well-documented (with an emphasis on
     the "special effects").

(actually, Robert wrote: "does not introduce surprises" as the last
criterion, but it's more useful to actually list the "special effects",
i.e., anything that's not simply a loss of accuracy).

Once the recommendations are agreed upon, a set of patches can be devised to
change the compiler to implement them; in parallel with this effort, the
documentation should be updated to reflect the new reality (i.e., names and
functions of new flags, changes w.r.t. previous behaviour, etc.).

As soon as this is done, we can close the project, indicate that status on
this web page, and keep the page for future reference as a record of our
consensus.

Preliminaries

Obviously, it is useless to talk about the ill effects of rearranging
floating point expressions without having a solid reference. To simplify the
analysis below, the discussion is in terms of the floating point model
supported by the ISO/IEC 9899 Standard (commonly known as C99), which refers
to the IEC 60559 Standard (the international version of the IEEE-754
Standard).

Another limitation we allow ourselves is to treat only rearrangements of
expressions using +, -, * and /. Other changes fall outside the domain of
the compiler proper.

Unfortunately, at present GCC doesn't guarantee IEC 60559 conformance by
default on all targets that can support it. A well-known exception is the
ix86; the following summary of the defects is courtesy of Brad Lucier:

   * All temporaries generated for a single expression [should] always [be]
     maintained in extended precision, even when spilled to the stack
   * Each assignment to a variable [should be] stored to memory. (And, if
     the value of that variable is used later by dereferencing its lvalue,
     the value is loaded from memory and the temporary that was stored to
     memory is not re-used.)

where the [should be]'s indicate how it isn't, at present.
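
(An illustrative sketch of the second point, added here; the values are
arbitrary. On the ix86, whether the surprising branch is taken depends on
the optimization level and on whether t is spilled to memory as a 64-bit
double while the re-evaluated a + b stays in an 80-bit register.)

   #include <stdio.h>

   int main (void)
   {
     volatile double a = 1.0, b = 1e-17;  /* volatile: keep the sums at run time */
     double t = a + b;                    /* may be rounded to 64 bits when stored */

     if (t != a + b)                      /* the re-evaluation may keep 80 bits */
       printf ("excess precision visible: t != a + b\n");
     else
       printf ("no difference observed with these flags on this target\n");
     return 0;
   }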

Language requirements

GCC presently supports five languages: C, C++, Objective C, Java and
Fortran. Of these, Fortran has the "loosest" requirements on floating point
operations (basically, one could say that floating point accuracy in Fortran
is a "quality of implementation" issue), while Java has the most
restrictive, because it requires implementations to supply the same answers
on all targets (this is definitely not a goal of the IEC 60559 Standard).

It is understood that users who will apply the outcome of this project do
know the extent to which they are violating the respective language
standard. We might consider issuing appropriate warning messages.

Classification

The classification below should enable us to offer users a choice of
allowable rearrangements based on the user's knowledge of the floating point
values present in and generated by the computations in his/her program;
hence the classification using subsets of the set of all floating point
values (in a particular format: single precision or double precision).

  1. Rearrangements that only affect the results for a small subset of all
     inputs.

     Rationale: Users might know the computational effects for those inputs.

     Example: Force underflow to zero. Savings may be large when denormal
     computation has to be emulated in the kernel. Special effects: values
     that have underflowed to zero must not be divided by.

  2. Rearrangements whose only effect is a loss of accuracy.

     Rationale: Users might be able to bound the effect of this
     rearrangement.

     Example: A*A*...*A -> different order of evaluation (compare a*a*a*a
     with t=a*a; t=t*t). Savings: Potentially many multiplies, at the cost
     of some temporaries.

  3. Rearrangements whose effect is a loss of accuracy on a large subset of
     the inputs and a complete loss on a small subset of the inputs.

     Rationale: Users might know that their computations always fall in the
     "loss of accuracy" subset and be able to bound the effect of this
     rearrangement.

     Example: A*B + A*C -> A*(B+C). Will overflow for a small number of
     choices for B and C for which the original didn't overflow. Savings:
     One multiply and one temporary (register pressure).

     Example: B/A + C/A -> (B+C)/A. Will overflow for a small number of
     choices for B and C for which the original didn't overflow. Savings:
     One divide and one temporary (register pressure).

     Example: A/B -> A*(1/B). Will overflow if B is a denormal, whereas the
     original might not. Savings: One divide changed to a multiply - might
     be large in case B is a loop invariant.

     Remark: On some targets, the intermediate calculations will be done in
     extended precision (with its extended range). In that case the problem
     indicated above does not exist; see however the restriction for ix86
     floating point arithmetic in chapter "Preliminaries".

  4. Rearrangements whose effect is a loss of accuracy on half of the inputs
     and a complete loss on the other half of the inputs.

     Rationale: Users might know that their computations always fall in the
     "loss of accuracy" subset and be able to bound the effect of this
     rearrangement.

     Example: A/B/C -> A/(B*C). B*C will overflow for a quarter of the
     floating point numbers, whereas it will return zero for slightly fewer
     than a quarter of the numbers B, C.

     Example: Evaluation of Z1/Z2 (Z1, Z2 complex; Z1 = A + B i, Z2 = C + D
     i) as ((A*C + B*D) + i (B*C - A*D)) / (C^2 + D^2). The denominator will
     overflow or underflow for slightly more than half of the values of C
     and D (a small numerical sketch of this case follows the
     classification).

     Remark: On some targets, the intermediate calculations will be done in
     extended precision (with its extended range). In that case the problem
     indicated above does not exist; see however the restriction for ix86
     floating point arithmetic in chapter "Preliminaries".
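
(A numerical sketch of the complex division example, added for illustration
with arbitrary values: with C = D = 1e200 the textbook formula overflows in
the denominator even though the true quotient is an ordinary number, while
the classical rescaling by the larger of |C| and |D| (Smith's algorithm)
recovers it.)

   #include <stdio.h>

   int main (void)
   {
     /* Z1 = 1 + 0i, Z2 = 1e200 + 1e200i; the true quotient is
        5e-201 - 5e-201i, well inside double range.  */
     double A = 1.0, B = 0.0, C = 1e200, D = 1e200;

     /* Textbook formula: C*C + D*D overflows to +inf, both parts become 0.  */
     double den = C * C + D * D;
     printf ("naive:  %g + (%g)i\n", (A * C + B * D) / den, (B * C - A * D) / den);

     /* Rescaled evaluation (for the case |C| >= |D|).  */
     {
       double r = D / C;
       double t = C + D * r;
       printf ("scaled: %g + (%g)i\n", (A + B * r) / t, (B - A * r) / t);
     }
     return 0;
   }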

Open issues

We should come up with a further classification for complex arithmetic.

Recommendations

None yet.

