public inbox for gcc@gcc.gnu.org
* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-19 13:00 N8TM
  0 siblings, 0 replies; 38+ messages in thread
From: N8TM @ 1998-12-19 13:00 UTC (permalink / raw)
  To: toon, egcs

In a message dated 12/19/98 12:39:34 PM Pacific Standard Time,
toon@moene.indiv.nluug.nl writes:

<< I tend to turn this remark around:  What we need in the g77 manual
 (despite the fact that it is not exclusively relevant to FORTRAN) is a
 section on the uses and pitfalls of floating point arithmetic.
 
 I'll set out to write this (this won't be easy, as I have to evade the
 obvious references for copyright reasons).>>

Excellent; if you are prepared for pre-publication suggestions or criticism,
let me know.
 
<< In the mean time, it would be useful for the compiler to warn about
 testing floating point variables for (in)equality.>>

I have used too many compilers which included such warnings, and find
them a hindrance.
 
<< HTH,
 
 -- 
 Toon Moene (toon@moene.indiv.nluug.nl)
  >>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-22 13:30 Toon Moene
  0 siblings, 0 replies; 38+ messages in thread
From: Toon Moene @ 1998-12-22 13:30 UTC (permalink / raw)
  To: d.love, egcs

> FWIW, the g77 manual has a reference (in 
> `Floating-point Errors') to a
> supplemented version but I don't remember details.  
> (Expert comments on the collection of references there 
> would be welcome.)

[ Well, I'm certainly not an expert on floating point
  arithmetic, but I have been working with it for 20 years
  now and have learned the hard way to be careful ]

The `Supplement' mentioned in the docs more than covers everything I
wanted to write on this subject.

Sigh - I probably slept while you added this to the documentation ...

For those not having the g77 info stuff handy: See
http://www.validgh.com/

The reason people do not jump to this information right away might be
that it is titled "Floating point errors".  Those who fall into the
various traps floating point arithmetic lays out for them invariably
think of it as a "compiler error", not a "floating point error" [not
least because the example I showed will simply "hang"] ;-)

Cheers,

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: fortran@gnu.org; egcs: egcs-bugs@cygnus.com


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-22 11:07 John Wehle
  0 siblings, 0 replies; 38+ messages in thread
From: John Wehle @ 1998-12-22 11:07 UTC (permalink / raw)
  To: N8TM; +Cc: egcs, pcg, rth, hjstein, toon, burley

> Contrary to an opinion put forth in this exchange, I see that alignment of
> 64-bit spills makes a measurable difference in performance on my Pentium 2.
> 
> I noticed, somewhat accidentally, that Livermore Fortran Kernel 8 runs 10%
> faster when linked with cygwin-b20.1 than with cygwin-b19/coolview.  I built
> the compiler today under cygwin-b19, and the performance of all the other
> kernels was unchanged from the previous version of egcs/g77.  Relinking the
> same .o with the different .dll made the difference, and it made no difference
> whether I ran under bash linked with one .dll or the other.

Just as another data point, the BRL-CAD raytracing benchmarks run about 5%
faster when the compiler properly aligns doubles.  The current state of the
patch for this is:

  1) It only affects leaf functions.

  2) It aligns all register spills as necessary and all simple uses
     of double / long double variables.

Open issues:

  1) The patch requires a frame pointer for those functions where the stack
     needs alignment.  I haven't run the BRL-CAD raytracing benchmarks with
     -fomit-frame-pointer to see if the proper alignment is worth requiring
     a frame.

  2) The patch currently doesn't provide alignment for variables such as:

     double a[10];

  3) The patch currently doesn't provide alignment in non-leaf functions.

  4) GDB will probably need updating due to the i386 prologue changes.

If I recall correctly, the main Pentium Pro / Pentium II performance hit
occurs when a double or long double crosses a cache-line boundary (which
can happen if it is not aligned correctly).
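
John's cache-line point can be illustrated with plain address arithmetic.
A minimal sketch, assuming the 32-byte cache-line size of the Pentium
Pro/II (the helper name is made up for illustration):

```python
def crosses_cache_line(addr, size=8, line=32):
    """True if an object of `size` bytes at byte address `addr`
    straddles a cache-line boundary of `line` bytes."""
    return addr // line != (addr + size - 1) // line

# An 8-byte double at an 8-aligned address stays within one 32-byte line:
assert not crosses_cache_line(24)            # bytes 24..31
# A misaligned double can span two lines, costing an extra line access:
assert crosses_cache_line(28)                # bytes 28..35
# A 10-byte (80-bit) spill can straddle a line even when 8-aligned:
assert crosses_cache_line(24, size=10)       # bytes 24..33
```

This is only the addressing arithmetic, of course; the actual cycle cost
of a straddling access is what the benchmarks in this thread measure.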

-- John
-------------------------------------------------------------------------
|   Feith Systems  |   Voice: 1-215-646-8000  |  Email: john@feith.com  |
|    John Wehle    |     Fax: 1-215-540-5495  |                         |
-------------------------------------------------------------------------


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-19 15:17 Geert Bosch
  1998-12-20  8:09 ` Toon Moene
@ 1998-12-22  4:17 ` Dave Love
  1 sibling, 0 replies; 38+ messages in thread
From: Dave Love @ 1998-12-22  4:17 UTC (permalink / raw)
  To: egcs

>>>>> "GB" == Geert Bosch <bosch@gnat.com> writes:

 GB> If you want to know why, I advise you to read "What Every
 GB> Computer Scientist Should Know About Floating-Point Arithmetic",
 GB> by David Goldberg,

FWIW, the g77 manual has a reference (in `Floating-point Errors') to a
supplemented version but I don't remember details.  (Expert comments
on the collection of references there would be welcome.)


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-21 23:30 N8TM
  0 siblings, 0 replies; 38+ messages in thread
From: N8TM @ 1998-12-21 23:30 UTC (permalink / raw)
  To: pcg, rth, hjstein, toon, burley; +Cc: egcs

Contrary to an opinion put forth in this exchange, I see that alignment of
64-bit spills makes a measurable difference in performance on my Pentium 2.

I noticed, somewhat accidentally, that Livermore Fortran Kernel 8 runs 10%
faster when linked with cygwin-b20.1 than with cygwin-b19/coolview.  I built
the compiler today under cygwin-b19, and the performance of all the other
kernels was unchanged from the previous version of egcs/g77.  Relinking the
same .o with the different .dll made the difference, and it made no difference
whether I ran under bash linked with one .dll or the other.

Examining the code, 9 loop invariant REAL*8 scalars are spilled outside the 2
innermost loops.  Each is restored once inside the inner loop.  There are 15
REAL*8 memory accesses directly to COMMON in the inner loop, and I believe 33
floating point operations.  In addition, 5 pointers are spilled and restored
in the inner loop.  The 10% increase in execution time for a mis-aligned stack
would indicate that the penalty for restoring a spilled REAL*8 is twice as
great when it is mis-aligned, even though it surely would stay in level 1
cache in the absence of cache mapping conflicts.

As I had mentioned several times earlier, I had noticed that the -O2 code was
running slower on W95 than -Os code, while this effect was not repeated on
linux-gnulibc1.  Today's finding confirms that effects like this stemmed from
mis-alignment of the stack, together with the smaller number of spills
generated with -Os.  With the up-to-date versions of both g77 and cygwin-b20,
there are no longer any Livermore Kernels that run slower with -O2 than -Os.

Not to say there are no challenges left!  I still find a few cases where the
commercial compiler lf90 4.50g runs 40% faster than g77 (as well as a smaller
number where g77 excels).  Apparently, there are no 80-bit spills or mis-
aligned COMMONs in that Lahey version, unlike the current l95.


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-20 13:51 ` Marc Lehmann
@ 1998-12-20 13:52   ` Marc Lehmann
  0 siblings, 0 replies; 38+ messages in thread
From: Marc Lehmann @ 1998-12-20 13:52 UTC (permalink / raw)
  To: Marc Lehmann, N8TM; +Cc: egcs

On Sun, Dec 20, 1998 at 10:51:23PM +0100, Marc Lehmann wrote:
> I was quite suprosed of it myself. But I only got difefrences for the first
          ^^^^^ surprised                         ^^^^^ differences

I've got the flu and don't find the right keys or so...

--
Happy New Year, I'll be away from 21. Dec to 7. Jan

      -----==-                                              |
      ----==-- _                                            |
      ---==---(_)__  __ ____  __       Marc Lehmann       +--
      --==---/ / _ \/ // /\ \/ /       pcg@goof.com       |e|
      -=====/_/_//_/\_,_/ /_/\_\                          --+
    The choice of a GNU generation                        |
                                                          |


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-19 14:23 N8TM
@ 1998-12-20 13:51 ` Marc Lehmann
  1998-12-20 13:52   ` Marc Lehmann
  0 siblings, 1 reply; 38+ messages in thread
From: Marc Lehmann @ 1998-12-20 13:51 UTC (permalink / raw)
  To: N8TM, pcg; +Cc: egcs

On Sat, Dec 19, 1998 at 05:22:59PM -0500, N8TM@aol.com wrote:
>  I have no idea how valid these results are (I'm probably not measuring the
>  fst), but xfmode spills seem to be expensive.
>   >>
> Thanks for this indication.  That would reinforce my opinion that double
> spills might be preferred where the syntax indicates single (float) precision,
> with xfmode reserved for those cases where the syntax indicates double.  I
> still would wish to be assured of a mechanism to align the spills, unless
> tests could show that is unnecessary.  I have to be skeptical when the
> compiler (lf95) which uses xfmode spills suffers so from mis-alignment of most
> declared double arrays.

I was quite suprosed of it myself. But I only got difefrences for the first
iteration (when the data is not in the cache)

--
Happy New Year, I'll be away from 21. Dec to 7. Jan

      -----==-                                              |
      ----==-- _                                            |
      ---==---(_)__  __ ____  __       Marc Lehmann       +--
      --==---/ / _ \/ // /\ \/ /       pcg@goof.com       |e|
      -=====/_/_//_/\_,_/ /_/\_\                          --+
    The choice of a GNU generation                        |
                                                          |


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-19 15:17 Geert Bosch
@ 1998-12-20  8:09 ` Toon Moene
  1998-12-22  4:17 ` Dave Love
  1 sibling, 0 replies; 38+ messages in thread
From: Toon Moene @ 1998-12-20  8:09 UTC (permalink / raw)
  To: Geert Bosch, egcs

Geert Bosch wrote:
> 
> On Sat, 19 Dec 1998 21:37:49 +0100, Toon Moene wrote:
> 
>   In the mean time, it would be useful for the compiler to warn about
>   testing floating point variables for (in)equality.
> 
> Testing for equality is perfectly fine on systems with IEEE arithmetic,
> and many algorithms would be impossible to write efficiently if one
> regarded floating point as a fuzzy kind of real value. Your statement
> would have been true in the pre-IEEE era, but fortunately floating-point
> arithmetic is well defined on the large majority of current systems.

Oh, but I wasn't suggesting that the floating point *instructions* are
not to be trusted,

> If you want to know why, I advise you to read "What Every Computer Scientist
> Should Know About Floating-Point Arithmetic", by David Goldberg, in ACM
> Computing Surveys, vol. 23 nr. 1, March 1991, available in PostScript at:
> http://swift.lanl.gov/Internal/Computing/SunOS_Compilers/common-tools/numerical_comp_guide/goldberg1.ps

[ Thanks for the reference - I have read through most of
  it now; you are probably referring to the section
  "Languages and compilers"? ]

What I am thinking of is what I presented earlier, namely that in the
following root-finding function:

      FUNCTION FINDROOT(GUESS)
  10  FINDROOT = [some-expression-involving-GUESS]
      IF (FINDROOT .EQ. GUESS) RETURN
      GUESS = FINDROOT
      GOTO 10
      END

the .EQ. is a mistake.  Even when using exact rounding (i.e. the IEEE
Standard), it is impossible to prove that, given some arbitrary GUESS
and an arbitrary some-expression-involving-GUESS, the "solution" FINDROOT
and GUESS will *not* oscillate indefinitely between two numbers one bit
apart.

[ Note that in Fortran, the name of a function is a
  local variable which will magically turn into the
  function result upon return ]

This is the case that always fails on the i386 [we get a bug report for
something like this about once a month on fortran@gnu.org], because
FINDROOT will end up in a register (being freshly computed) while GUESS
will be retrieved from memory, since some-expression will be so huge as
to drive all floating point values out of their registers.

In My Not So Humble Opinion, it is a good thing that this fails on the
i386, because it cannot be proven to always succeed on *any* floating
point implementation, so people better learn (or else ;-).
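
Toon's argument is language-independent, and the usual fix is a
tolerance-based stopping test with an iteration cap instead of exact
equality.  A sketch in Python (the `findroot` helper and the cosine
fixed point are illustrative, not taken from the thread):

```python
import math

def findroot(f, guess, rel_tol=1e-12, maxiter=1000):
    """Fixed-point iteration x = f(x).  Stops on a relative-tolerance
    test, so it terminates even if the iterates end up oscillating
    between two representable numbers one bit apart."""
    for _ in range(maxiter):
        new = f(guess)
        if math.isclose(new, guess, rel_tol=rel_tol):
            return new
        guess = new
    raise RuntimeError("no convergence within %d iterations" % maxiter)

root = findroot(math.cos, 1.0)   # fixed point of cos(x) = x, ~0.739085
```

The iteration cap also turns the "oscillates forever" failure mode into
a diagnosable error rather than a hang.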

Regards,

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: fortran@gnu.org; egcs: egcs-bugs@cygnus.com


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-19 15:17 Geert Bosch
  1998-12-20  8:09 ` Toon Moene
  1998-12-22  4:17 ` Dave Love
  0 siblings, 2 replies; 38+ messages in thread
From: Geert Bosch @ 1998-12-19 15:17 UTC (permalink / raw)
  To: egcs, N8TM, Toon Moene

On Sat, 19 Dec 1998 21:37:49 +0100, Toon Moene wrote:

  In the mean time, it would be useful for the compiler to warn about
  testing floating point variables for (in)equality.

Testing for equality is perfectly fine on systems with IEEE arithmetic,
and many algorithms would be impossible to write efficiently if one
regarded floating point as a fuzzy kind of real value. Your statement
would have been true in the pre-IEEE era, but fortunately floating-point
arithmetic is well defined on the large majority of current systems.
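
Both positions here are easy to check on any IEEE-754 system; a small
sketch in Python (whose floats are IEEE binary64):

```python
# Equality is exact and deterministic when operands and results are
# representable in binary64:
assert 0.5 + 0.25 == 0.75      # sums of powers of two are exact
assert 3.0 * 7.0 == 21.0       # small integers are exact

# ...but most decimal fractions are not representable, so algebraically
# equal expressions need not compare equal:
assert 0.1 + 0.2 != 0.3
print(0.1 + 0.2)               # prints 0.30000000000000004
```

The operations themselves are well defined; it is naive expectations
about decimal values that go wrong.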

If you want to know why, I advise you to read "What Every Computer Scientist 
Should Know About Floating-Point Arithmetic", by David Goldberg, in ACM 
Computing Surveys, vol. 23 nr. 1, March 1991, available in PostScript at:
http://swift.lanl.gov/Internal/Computing/SunOS_Compilers/common-tools/numerical_comp_guide/goldberg1.ps

Regards,
   Geert




* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-19 12:39 ` Toon Moene
@ 1998-12-19 14:42   ` Dave Love
  0 siblings, 0 replies; 38+ messages in thread
From: Dave Love @ 1998-12-19 14:42 UTC (permalink / raw)
  To: egcs

>>>>> "Toon" == Toon Moene <toon@moene.indiv.nluug.nl> writes:

 Toon> I tend to turn this remark around: What we need in the g77
 Toon> manual (despite the fact that it is not exclusively relevant to
 Toon> FORTRAN) is a section on the uses and pitfalls of floating
 Toon> point arithmetic.

We already have such a section, don't we? (not that I'd want to
discourage doc additions!).

 Toon> I'll set out to write this (this won't be easy, as I have to
 Toon> evade the obvious references for copyright reasons).

We already make what I thought were some obvious references, but no
matter.  What's the problem with such references?  (I don't think we
really need such stuff to be free per GNU docs, because it isn't
actually program doc.)

 Toon> In the mean time, it would be useful for the compiler to warn
 Toon> about testing floating point variables for (in)equality.

I think ftnchek (Toolpack?) will do that if it helps.


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-19 14:26 N8TM
  0 siblings, 0 replies; 38+ messages in thread
From: N8TM @ 1998-12-19 14:26 UTC (permalink / raw)
  To: pcg, rth, hjstein, toon; +Cc: egcs

In a message dated 12/19/98 1:41:05 PM Pacific Standard Time, pcg@goof.com
writes:

<< Maybe, but compared to what? Nobody so far has offered an
 alternative with the same (good) semantics. People wanting speed
 do not need to use xfmode spilling. >>
If the xfmode spilling is an option, and aligned storage is available, I
would see no objection.


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-19 14:23 N8TM
  1998-12-20 13:51 ` Marc Lehmann
  0 siblings, 1 reply; 38+ messages in thread
From: N8TM @ 1998-12-19 14:23 UTC (permalink / raw)
  To: pcg; +Cc: egcs

In a message dated 12/19/98 1:39:03 PM Pacific Standard Time, pcg@goof.com
writes:

<< Ok, _some_ data: if everything is in the cache, on my p-ii,
 
         fldl %0
         fxam
         fstpl %0
         fwait
 
 takes 3 cycles regardless of how the memory is aligned.
 
 The code sequence:
 
         fldt %0
         fxam
         fstpt %0
         fwait
 
 takes 6 cycles.
 
 I have no idea how valid these results are (I'm probably not measuring the
 fst), but xfmode spills seem to be expensive.
  >>
Thanks for this indication.  That would reinforce my opinion that double
spills might be preferred where the syntax indicates single (float) precision,
with xfmode reserved for those cases where the syntax indicates double.  I
still would wish to be assured of a mechanism to align the spills, unless
tests could show that is unnecessary.  I have to be skeptical when the
compiler (lf95) which uses xfmode spills suffers so from mis-alignment of most
declared double arrays.


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-18 18:37     ` Marc Lehmann
@ 1998-12-19 14:03       ` Dave Love
  0 siblings, 0 replies; 38+ messages in thread
From: Dave Love @ 1998-12-19 14:03 UTC (permalink / raw)
  To: egcs

>>>>> "Marc" == Marc Lehmann <pcg@goof.com> writes:

 >> What's the guarantee that crt0 sets up extended precision on
 >> whichever systems are of interest?

 Marc> I have no idea. Do _you_ have one?

No, or I wouldn't have asked.  AFAIR the startup settings of the FPU
changed at some stage in the past, though.

 >> Is it currently consistent across all the x86 platforms we run on?

 Marc> Definitely not. But does that mean we should break linux-libm
 Marc> (for example) because Solaris behaviour wasn't consistent
 Marc> before(?).

Of course not.  I need libm to work; that's why I keep asking the
question.  I presume that glibc would need to conform on another
platform that did define this.

 Marc> How about implementing the __setfpucw functionality as found on
 Marc> linux? Creating a library (-lrdble) should be trivial then.

I don't have the information, expertise or interest in that.  I called
__setfpucw, but I'm not even sure it works across Linuxes.  I'd do
newlib and cygwin32 as well, given appropriate doc and a way of
figuring out that they're in use; AFAIR they seem to have the facility
but don't say how to drive it.  I'd expect people with an interest to
contribute the (presumably trivial) additions.


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-18 14:25     ` Gerald Pfeifer
@ 1998-12-19 13:50       ` Dave Love
  0 siblings, 0 replies; 38+ messages in thread
From: Dave Love @ 1998-12-19 13:50 UTC (permalink / raw)
  To: egcs

>>>>> "GP" == Gerald Pfeifer <pfeifer@dbai.tuwien.ac.at> writes:

 GP> If you make code like that available, please also consider submitting an
 GP> update to that page.

I was hoping for somewhere from which to make it sensibly available.
If I put it up here it may not be very long-lived and I don't know
whether I can rely on ftp @gnu either.


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-18 22:36 ` Richard Henderson
@ 1998-12-19 13:41   ` Marc Lehmann
  0 siblings, 0 replies; 38+ messages in thread
From: Marc Lehmann @ 1998-12-19 13:41 UTC (permalink / raw)
  To: Richard Henderson, N8TM, hjstein, toon; +Cc: egcs

On Fri, Dec 18, 1998 at 10:36:08PM -0800, Richard Henderson wrote:
> And before I even did that, someone would have to do a much better
> job convincing me that it was even a good idea.  Cause from where
> I'm sitting now, I agree with Toon that the idea is losing all the
> way around. 

Maybe, but compared to what? Nobody so far has offered an
alternative with the same (good) semantics. People wanting speed
do not need to use xfmode spilling.

      -----==-                                              |
      ----==-- _                                            |
      ---==---(_)__  __ ____  __       Marc Lehmann       +--
      --==---/ / _ \/ // /\ \/ /       pcg@goof.com       |e|
      -=====/_/_//_/\_,_/ /_/\_\                          --+
    The choice of a GNU generation                        |
                                                          |


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-18 23:07 N8TM
@ 1998-12-19 13:39 ` Marc Lehmann
  0 siblings, 0 replies; 38+ messages in thread
From: Marc Lehmann @ 1998-12-19 13:39 UTC (permalink / raw)
  To: N8TM; +Cc: egcs

On Sat, Dec 19, 1998 at 02:07:02AM -0500, N8TM@aol.com wrote:
> In a message dated 12/18/98 10:36:15 PM Pacific Standard Time, rth@cygnus.com
> writes:
> 
> << I have not tried quantifying the change.  I would want to examine
>  things more closely, however, because 25% seems low to me. >>
> 
> Some proponents of the idea felt that spills were so rare that no difference
> would be seen regardless of the efficiency of an 80-bit spilling
> implementation. My 25% figure is for a complete execution of the application;
> certainly there must be sections of this application where the 80-bit spills
> are doubling the time spent.  That means the 80-bit spills, a majority of them
> mis-aligned, are taking several times as long as the 32-bit spills.

Ok, _some_ data: if everything is in the cache, on my p-ii,

        fldl %0
        fxam
        fstpl %0
        fwait

takes 3 cycles regardless of how the memory is aligned.

The code sequence:

        fldt %0
        fxam
        fstpt %0
        fwait

takes 6 cycles.

I have no idea how valid these results are (I'm probably not measuring the
fst), but xfmode spills seem to be expensive.

I still think it's a much better solution than -ffloat-store (slow) and
64-bit precision (changing too many things we have no control over).

PS: I'm not subscribed at the moment ;)

--
Happy New Year, I'll be away from 21. Dec to 7. Jan

      -----==-                                              |
      ----==-- _                                            |
      ---==---(_)__  __ ____  __       Marc Lehmann       +--
      --==---/ / _ \/ // /\ \/ /       pcg@goof.com       |e|
      -=====/_/_//_/\_,_/ /_/\_\                          --+
    The choice of a GNU generation                        |
                                                          |


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-19  9:05 N8TM
@ 1998-12-19 12:39 ` Toon Moene
  1998-12-19 14:42   ` Dave Love
  0 siblings, 1 reply; 38+ messages in thread
From: Toon Moene @ 1998-12-19 12:39 UTC (permalink / raw)
  To: N8TM, egcs

N8TM@aol.com wrote:

> In a message dated 12/19/98 6:46:15 AM Pacific Standard Time,
> emil@skatter.usask.ca writes:

> <<  I very much
>  appreciate your proposal AND I endorse it completely. I am more than willing
> to
>  pay a performance penalty in order to get numerically accurate results with
> less
>  programming on my part.  >>
> I would like to join in thanking Craig for raising this issue and offering to
> work on it.    My primary objection to it was that the performance penalty
> would be too large if the problem of mis-aligned spills were not solved.  With
> that qualification, I endorse it also.

I tend to turn this remark around:  What we need in the g77 manual
(despite the fact that it is not exclusively relevant to FORTRAN) is a
section on the uses and pitfalls of floating point arithmetic.

I'll set out to write this (this won't be easy, as I have to evade the
obvious references for copyright reasons).

In the mean time, it would be useful for the compiler to warn about
testing floating point variables for (in)equality.

HTH,

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: fortran@gnu.org; egcs: egcs-bugs@cygnus.com


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-19  9:05 N8TM
  1998-12-19 12:39 ` Toon Moene
  0 siblings, 1 reply; 38+ messages in thread
From: N8TM @ 1998-12-19  9:05 UTC (permalink / raw)
  To: emil, burley, egcs

In a message dated 12/19/98 6:46:15 AM Pacific Standard Time,
emil@skatter.usask.ca writes:

<<  I very much
 appreciate your proposal AND I endorse it completely. I am more than willing
to
 pay a performance penalty in order to get numerically accurate results with
less
 programming on my part.  >>
I would like to join in thanking Craig for raising this issue and offering to
work on it.    My primary objection to it was that the performance penalty
would be too large if the problem of mis-aligned spills were not solved.  With
that qualification, I endorse it also.


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-18 23:07 N8TM
  1998-12-19 13:39 ` Marc Lehmann
  0 siblings, 1 reply; 38+ messages in thread
From: N8TM @ 1998-12-18 23:07 UTC (permalink / raw)
  To: rth, hjstein, toon; +Cc: egcs

In a message dated 12/18/98 10:36:15 PM Pacific Standard Time, rth@cygnus.com
writes:

<< I have not tried quantifying the change.  I would want to examine
 things more closely, however, because 25% seems low to me. >>

Some proponents of the idea felt that spills were so rare that no difference
would be seen regardless of the efficiency of an 80-bit spilling
implementation. My 25% figure is for a complete execution of the application;
certainly there must be sections of this application where the 80-bit spills
are doubling the time spent.  That means the 80-bit spills, a majority of them
mis-aligned, are taking several times as long as the 32-bit spills.

As I'm seeing so many implementations where 64-bit spills are mis-aligned,
I'd like to see the alignment problem solved before I'm stuck with wider
spills.


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-18 21:58 N8TM
@ 1998-12-18 22:36 ` Richard Henderson
  1998-12-19 13:41   ` Marc Lehmann
  0 siblings, 1 reply; 38+ messages in thread
From: Richard Henderson @ 1998-12-18 22:36 UTC (permalink / raw)
  To: N8TM, rth, hjstein, toon; +Cc: egcs

On Sat, Dec 19, 1998 at 12:58:10AM -0500, N8TM@aol.com wrote:
> How much extra time?

One extra cycle on read; since we're committed to read-modify-write
anyway, probably one to three extra cycles on write depending on
whether we actually straddle a 16-byte boundary.

> Is it feasible to make the XFmode spills use aligned addresses,
> and would alignment be as much of an improvement as in DFmode?

If we were to spill in XFmode, then yes, alignment would be just 
as important as in DFmode.

> The only quantification I've seen is my test of one application
> indicating that changing spills from SFmode to XFmode appears to
> make that application run 25% longer on a PPro.

I have not tried quantifying the change.  I would want to examine
things more closely, however, because 25% seems low to me.

And before I even did that, someone would have to do a much better
job convincing me that it was even a good idea.  Cause from where
I'm sitting now, I agree with Toon that the idea is losing all the
way around. 


r~


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-18 21:58 N8TM
  1998-12-18 22:36 ` Richard Henderson
  0 siblings, 1 reply; 38+ messages in thread
From: N8TM @ 1998-12-18 21:58 UTC (permalink / raw)
  To: rth, hjstein, toon; +Cc: egcs

In a message dated 12/18/98 4:03:09 PM Pacific Standard Time, rth@cygnus.com
writes:

<< On the contrary.  If you work with SFmode values, they'll be spilled
 in SFmode.  And XFmode reads/writes to unaligned (mod 16) addresses
 take extra time.
  >>
How much extra time?  Is it feasible to make the XFmode spills use aligned
addresses, and would alignment be as much of an improvement as in DFmode?  The
only quantification I've seen is my test of one application indicating that
changing spills from SFmode to XFmode appears to make that application run 25%
longer on a PPro.


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-18 12:14   ` Dave Love
  1998-12-18 14:25     ` Gerald Pfeifer
@ 1998-12-18 18:37     ` Marc Lehmann
  1998-12-19 14:03       ` Dave Love
  1 sibling, 1 reply; 38+ messages in thread
From: Marc Lehmann @ 1998-12-18 18:37 UTC (permalink / raw)
  To: egcs

On Fri, Dec 18, 1998 at 08:14:22PM +0000, Dave Love wrote:
> >>>>> "Marc" == Marc Lehmann <pcg@goof.com> writes:
> 
>  Marc> This might break any third-party libraries and/or the system
>  Marc> libm if it uses extended precision to implement
>  Marc> double-precision operations.
> 
> What's the guarantee that crt0 sets up extended precision on whichever
> systems are of interest?

I have no idea. Do _you_ have one?

>  Marc> IAW, this functionality must be off by default, similar to
>  Marc> -malign-double.
> 
> Is it currently consistent across all the x86 platforms we run on?

Definitely not. But does that mean we should break linux-libm (for example)
because Solaris behaviour wasn't consistent before(?).

> (modulo f2000 intrinsics) is a trivial piece of runtime so that people
> can say `-lrdble' (or something) and (probably) be done.  Similarly
> for floating point traps/masks.

The question (for me) is whether libm functions will retain
double precision then.  Sure, the deviations that this could introduce
are pretty minor, but, after all, the deviations caused by using
extended precision are similarly "minor".

> I'd at least like to put the code somewhere people can retrieve it to
> use/extend if they like.  Could I put it in the contrib directory, for
> instance?

How about implementing the __setfpucw functionality as found
on linux? Creating a library (-lrdble) should be trivial then.

--
Happy New Year, I'll be away from 21. Dec to 7. Jan

      -----==-                                              |
      ----==-- _                                            |
      ---==---(_)__  __ ____  __       Marc Lehmann       +--
      --==---/ / _ \/ // /\ \/ /       pcg@goof.com       |e|
      -=====/_/_//_/\_,_/ /_/\_\                          --+
    The choice of a GNU generation                        |
                                                          |


* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-18 12:14   ` Dave Love
@ 1998-12-18 14:25     ` Gerald Pfeifer
  1998-12-19 13:50       ` Dave Love
  1998-12-18 18:37     ` Marc Lehmann
  1 sibling, 1 reply; 38+ messages in thread
From: Gerald Pfeifer @ 1998-12-18 14:25 UTC (permalink / raw)
  To: Dave Love; +Cc: egcs

On 18 Dec 1998, Dave Love wrote:
> I'd at least like to put the code somewhere people can retrieve it to
> use/extend if they like.  Could I put it in the contrib directory, for
> instance?

I believe that would make an excellent addition to our forthcoming
"egcstensions" page at http://egcs.cygnus.com/egcstensions.html , which
I'll announce shortly.

If you make code like that available, please also consider submitting an
update to that page.

Gerald
-- 
Gerald Pfeifer (Jerry)      Vienna University of Technology
pfeifer@dbai.tuwien.ac.at   http://www.dbai.tuwien.ac.at/~pfeifer/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-17 12:35 ` Marc Lehmann
@ 1998-12-18 12:14   ` Dave Love
  1998-12-18 14:25     ` Gerald Pfeifer
  1998-12-18 18:37     ` Marc Lehmann
  0 siblings, 2 replies; 38+ messages in thread
From: Dave Love @ 1998-12-18 12:14 UTC (permalink / raw)
  To: egcs

>>>>> "Marc" == Marc Lehmann <pcg@goof.com> writes:

 Marc> This might break any third-party libraries and/or the system
 Marc> libm if it uses extended precision to implement
 Marc> double-precision operations.

What's the guarantee that crt0 sets up extended precision on whichever
systems are of interest?

 Marc> IAW, this functionality must be off by default, similar to
 Marc> -malign-double.

Is it currently consistent across all the x86 platforms we run on?
I'm fairly sure not.  However, all _I_ particularly want to include
(modulo f2000 intrinsics) is a trivial piece of runtime so that people
can say `-lrdble' (or something) and (probably) be done.  Similarly
for floating point traps/masks.

I'd at least like to put the code somewhere people can retrieve it to
use/extend if they like.  Could I put it in the contrib directory, for
instance?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-17  1:43 N8TM
@ 1998-12-17 12:35 ` Marc Lehmann
  1998-12-18 12:14   ` Dave Love
  0 siblings, 1 reply; 38+ messages in thread
From: Marc Lehmann @ 1998-12-17 12:35 UTC (permalink / raw)
  To: N8TM; +Cc: egcs

On Thu, Dec 17, 1998 at 04:42:40AM -0500, N8TM@aol.com wrote:
>  IAW, how is 64 bit rounding mode going to be faster? For me, it seems this
>  creates a similar situation to the float->integer conversion, i.e. save and
>  restoring the control word with each assignment. >>

> Although I haven't seen anyone specify this, I assume they mean to leave
> 64-bit mode set throughout the program, or at least for the duration of any

This might break any third-party libraries and/or the system libm if it uses
extended precision to implement double-precision operations.

IAW, this functionality must be off by default, similar to -malign-double.

      -----==-                                              |
      ----==-- _                                            |
      ---==---(_)__  __ ____  __       Marc Lehmann       +--
      --==---/ / _ \/ // /\ \/ /       pcg@goof.com       |e|
      -=====/_/_//_/\_,_/ /_/\_\                          --+
    The choice of a GNU generation                        |
                                                          |

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-17  1:43 N8TM
  1998-12-17 12:35 ` Marc Lehmann
  0 siblings, 1 reply; 38+ messages in thread
From: N8TM @ 1998-12-17  1:43 UTC (permalink / raw)
  To: pcg, egcs

In a message dated 12/16/98 12:36:17 PM Pacific Standard Time, pcg@goof.com
writes:

<< I still don't see what the 64 bit precision idea gives us, in terms of
 performance. First, it doesn't give us full ieee, second, it kills
 performance, depending on where the rounding mode is set (before each
 assignment? resetting it to normal before each long double assignment?)
 
 IAW, how is 64 bit rounding mode going to be faster? For me, it seems this
 creates a similar situation to the float->integer conversion, i.e. save and
 restoring the control word with each assignment. >>
Although I haven't seen anyone specify this, I assume they mean to leave
64-bit mode set throughout the program, or at least for the duration of any
intensive computing.  I've tried running Livermore Fortran Kernels this way,
and it does speed up division and sqrt(), as it should.  It works reasonably
well as long as all arithmetic is intended to be ordinary single or double
precision.  

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-16  6:10 N8TM
  0 siblings, 0 replies; 38+ messages in thread
From: N8TM @ 1998-12-16  6:10 UTC (permalink / raw)
  To: hjstein, egcs; +Cc: ejr, jbuck, egcs

In a message dated 12/16/98 12:36:10 AM Pacific Standard Time,
hjstein@bfr.co.il writes:

<<  > Maybe that option could be implied by -ffast-math.
 
 I'd much rather have more precise control over it.  Doesn't
 -ffast-math imply various sorts of liberties to be taken? >>

Yes, there are too many unrelated liberties collected under -ffast-math
already.  I've never found a situation where changing the treatment of
comparisons gave any benefit, and I leave it off for that reason.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-15  0:05 N8TM
@ 1998-12-15 10:01 ` Joe Buck
  0 siblings, 0 replies; 38+ messages in thread
From: Joe Buck @ 1998-12-15 10:01 UTC (permalink / raw)
  To: N8TM; +Cc: burley, jbuck, ejr, hjstein, egcs

N8TM@aol.com writes:

> burley@gnu.org writes:

No, Craig didn't write what you quote, I did:

>  >... avoids transformations that can change numerical results,
>  >such as pre-evaluating expressions with 64-bit that would otherwise
>  >be evaluated using 80-bit precision at runtime >>

N8TM again:

> A good point.  Among other things, this will require a good strtold (?)
> conversion from decimal to binary, which I don't think is available.

No, it suffices to read the constants the user provided as double if they
are double precision literals.  And if the compiler
supports long double it already has the "strtold" equivalent.

> At least this needs some planning.

What it means is that the compiler would need to evaluate constant
expressions using long double, and store them in the executable as long
double, in any case where the unoptimized code would do this on the
same processor.

That is, if we have (yes, this is a contrived example)

double one_third(double z)
{
	double x, y;
	x = 1.0;
	y = 3.0;
	return z * (x / y);
}

if x, y, and z are in registers, we will compute an 80-bit approximation
to 1/3.  If the compiler changed this to

double one_third(double z)
{
	return z * 0.33333333333333333;
}

we have a 64-bit approximation to 1/3.  The compiler must instead produce

double one_third(double z)
{
	return z * 0.3333333333333333333333L;
}

to make the results invariant under optimization.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-15  0:05 N8TM
  1998-12-15 10:01 ` Joe Buck
  0 siblings, 1 reply; 38+ messages in thread
From: N8TM @ 1998-12-15  0:05 UTC (permalink / raw)
  To: burley, jbuck; +Cc: ejr, hjstein, egcs

In a message dated 12/14/98 11:01:15 PM Pacific Standard Time, burley@gnu.org
writes:

<< avoids transformations that can change numerical results,
 >such as pre-evaluating expressions with 64-bit that would otherwise
 >be evaluated using 80-bit precision at runtime >>

A good point.  Among other things, this will require a good strtold (?)
conversion from decimal to binary, which I don't think is available.  At least
this needs some planning.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-14  9:25       ` Joe Buck
  1998-12-14 14:30         ` Edward Jason Riedy
@ 1998-12-15  0:04         ` Craig Burley
  1 sibling, 0 replies; 38+ messages in thread
From: Craig Burley @ 1998-12-15  0:04 UTC (permalink / raw)
  To: jbuck; +Cc: burley

>What would the performance cost be if we spilled ix86 FP registers
>as 80 bits?  Unfortunately I'm not familiar enough with the ix86
>instruction set to answer this question for myself.  How difficult
>would it be to change gcc to do 80 bit spills, without changing
>the size of float and double?

I think the performance cost is likely to be pretty minor, mostly
not noticeable, since it doesn't affect any code that isn't already
spilling 80-bit values by chopping them down.  (Though I guess
any code that makes certain uses of functions returning FP values
are likely to do such spills...but a bit of extra cache usage
around a procedure call isn't usually a noticeable hit against
performance, right?)

It's the difficulty of changing gcc that I think is the biggest
problem.

>It seems to me that it should cost a lot less than -ffloat-store
>and should get rid of the unpredictable behavior.  (Yes, it still
>doesn't match IEEE unless more is done, but it stops weirdness like
>root-finding algorithms that fail to converge if you're unlucky).

What I've come to believe is that, without this change, -ffloat-store
is useless except for the lucky and the very, very careful (who
also accept comparatively poor performance).  With this change,
-ffloat-store adds reasonable stability on top of reasonable
stability.

I don't feel comfortable saying that even the combination of this
change and -ffloat-store results in complete stability/predictability,
however, because I just don't know enough to say that, and I tend
to be pessimistic about such things.

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-14  9:25       ` Joe Buck
@ 1998-12-14 14:30         ` Edward Jason Riedy
  1998-12-15  0:04         ` Craig Burley
  1 sibling, 0 replies; 38+ messages in thread
From: Edward Jason Riedy @ 1998-12-14 14:30 UTC (permalink / raw)
  To: Joe Buck; +Cc: egcs

Oh well.  And Joe Buck writes:
 - What would the performance cost be if we spilled ix86 FP registers
 - as 80 bits?

FYI, this is Dr. Kahan's suggested fix whenever any of us (us == grad 
students at Berkeley who ask him about it) mention the truncation
problem.

Dr. Kahan's original intent was to have the on-chip stack be just the
top few cells in the total FP stack.  It's outlined in a paper from 1989
(with a 1990 prefix (modified in 1994) and a 1998 addendum) titled
``How Intel 80x87 Stack Over/Underflow Should Have Been Handled.''
It's in the FP98 notes, on the off chance any of y'all have them.

Quick summary of points from that paper (which I've seen mentioned 
here, I think, but not with details):
	* He forecasts that only 1280 bytes (128 80-bit words) of memory 
	for stack extension would be ``almost always ample.''

	* Differences in the 80x87 family make engineering the intended
	behavior nasty.  It also involves OS help for the trap handlers.
		* The 80287 has two major variants.
		* Not all opcodes are recorded the same way through the
		80x87 family.
		* The 80387 has undocumented anomalies.
		* <80387 FPUs don't support many IEEE 754 operations,
		which would need to be emulated.

	* Other co-processors (namely the Weiteks) don't have 80-bit
	precision.  (Like I said, 1989.  Not so much an issue now.)

	* Some IEEE functions aren't in <80387 FPUs, making drivers
	more difficult to implement.

In this paper, part of the problem he mentions is that programs would
need to determine which FPU exists at run-time.  When he wrote the paper,
that wasn't a common operation.  The diversification of the 80x86 family 
(MMX, 3Dnow, etc) has made it quite common.

It would be _really, really, really_ nice if someone could make the 
whole thing work as intended.  It's quite possible with the free Unices 
and gcc, especially since Linux / *BSD don't bother too much about pre-
80386 chips.  I want to look at it, but I won't be able to start for 
a few months due to other commitments (and lack of understanding of the
relevant gcc / Linux / glibc code, but I'm working on it).

If there's interest, I'll try to convince Dr. Kahan to post this paper 
on-line.  It's a nice outline of the issues involved.

Jason

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-14  8:49     ` Craig Burley
@ 1998-12-14  9:25       ` Joe Buck
  1998-12-14 14:30         ` Edward Jason Riedy
  1998-12-15  0:04         ` Craig Burley
  0 siblings, 2 replies; 38+ messages in thread
From: Joe Buck @ 1998-12-14  9:25 UTC (permalink / raw)
  To: Craig Burley; +Cc: moshier, egcs, burley

What would the performance cost be if we spilled ix86 FP registers
as 80 bits?  Unfortunately I'm not familiar enough with the ix86
instruction set to answer this question for myself.  How difficult
would it be to change gcc to do 80 bit spills, without changing
the size of float and double?

It seems to me that it should cost a lot less than -ffloat-store
and should get rid of the unpredictable behavior.  (Yes, it still
doesn't match IEEE unless more is done, but it stops weirdness like
root-finding algorithms that fail to converge if you're unlucky).



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-13 15:18   ` Stephen L Moshier
@ 1998-12-14  8:49     ` Craig Burley
  1998-12-14  9:25       ` Joe Buck
  0 siblings, 1 reply; 38+ messages in thread
From: Craig Burley @ 1998-12-14  8:49 UTC (permalink / raw)
  To: moshier; +Cc: burley

>The Intel and 68k floating-point behavior has existed some fifteen
>years and has never stopped anyone I know from writing good-quality,
>portable programs.

That info doesn't help anyone AFAICT.  For example, you didn't specify
whether those programs made heavy use of floating-point, or whether they
met the price/performance expectations compared to other machines
that *don't* offer the 80-bit-FP behavior, or whether they were
compiled with a compiler that doesn't have the problems addressed
by my proposal.  (Not all compilers spill 80-bit FP values to 64 bits
like gcc does, apparently.)

>We have some real problems that ought to have more priority.  An example
>is that float complex does not work on alphas.  There is no workaround
>for that bug.  It is hard to fix properly, and it is something that
>needs fixing.

Yes, but that is also a longstanding bug, across more machines than
just Alpha, and AFAIK it has never stopped anyone from writing good-
quality, portable programs -- because, to do that, they either write
the operations out longhand (necessary for C if you say "portable"
anyway), thus avoiding the bug entirely, or just use Fortran (which has
had -femulate-complex as the default for some time now to work around the
bug).

In this case, -fno-emulate-complex increases performance at some
cost of possibly encountering the code-generation bug(s).  g77
might be the only gcc front end that supports a built-in complex
type and supports switching between use of the gcc back end's
built-in complex support; AFAIK, Ada (GNAT) does the former but
always uses emulation (the equivalent of g77's default,
-femulate-complex).  I don't know about C++ (g++) or Pascal (gpc),
or any of the others, offhand.  I think gcc might be the only
front end that provides a native complex type but doesn't offer
emulation (though using gcc's native complex type does not exactly
make for portable code, if by "portable" one means "compiled by
the compiler of your choice...").

If you'd take a moment to review the reason I originally submitted
my proposal here, you might notice that what triggered it was my
awareness that people were seriously talking about rewriting the x86
machine description from scratch, or nearly so.  (Maybe I wasn't
too clear about this motivation in my email, but it *was* why I
proposed it then, instead of waiting to study the issues further.
I was trying to make sure my proposal didn't miss a crucial window
of opportunity, a window in the early phase of a redesign.)

If we're going to rewrite substantial areas of the ix86 compiler,
such as the machine descriptions, we might as well make sure we
tell it the truth: that x86 FP registers are 80-bit, and that they should
therefore normally be spilled to 80-bit-containing temporaries,
not 64-bit ones.  That seems like a no-brainer to me, and it is,
essentially, all I'm proposing, or at least the bulk of it.  (E.g.
I'd like to see an option that gets the old behavior, and maybe there
are a few "unexpected" places that'd still chop while spilling unless
specifically fixed, outside of the machine-description and obviously
related areas.)

Calling 80-bit FP registers a "feature" is perhaps appropriate when
referring to the hardware, and, even so, debatable when getting
strict 64-bit behavior is so difficult that it is impossible to do
so without a substantial drop in performance (as is the case with the
x86 architecture and all implementations to date, AFAICT)...but let's
put that aside.

Calling 80-bit FP registers a "feature" in terms of *compiler* support
of an 80-bit-FP machine, when that compiler unpredictably
decides whether to make use of that feature, is IMO completely
ludicrous.

For example, I wouldn't mind a feature on my system whereby it automatically
backed up my files as I edited and wrote them.  However, if it randomly
decided to back up only 80% of a file depending on the precise timing
of my keyboard hits and mouse movements within the previous 5 minutes,
then it isn't a feature -- it's a bug, and I'd want that behavior fixed
to be more consistent, or the feature removed entirely.  Sure, you could
just yell at me about how I should be making my *own* backups, but
if you give me a feature that, when poked and prodded at, *pretends*
to reliably back up my files but, on occasion, decides on its own to
not bother doing so quite properly, then *you* are at fault, not me,
when I get burnt by relying on this behavior.  (That is: if you call
such automatic back-up a "feature", then that means it should be
relied upon.  If it's *not* a feature, then, as a behavior that
affects performance and/or consistency of results, it's a bug.  The
proponents of the x86's 80-bit FP call it a "feature".  Therefore,
it must be made as *predictable* in normal use as possible, or it
is, in fact, a bug.)

The fact is, gcc does not, and never has, properly supported its
implicit use of extended precision (as mandated by performance
concerns on the underlying hardware), and this is becoming more
and more noticable as more and more people use gcc, g77, g++, and
so on on x86 machines.  That few people in the past have noticed
this is irrelevant: more and more people notice this every day,
in trivial examples, and we don't have *any* clue as to how many
people *should* have noticed this in more complicated code that's
in production, with latent bugs, thanks to gcc's misbehavior, but
haven't managed to notice the bug (or didn't bother to report it).

Therefore, on the whole, 80-bit FP support in gcc on machines like the
x86 is not a feature, it's a bug.  We can either fix the bug, or,
as some would recommend, eliminate the feature by not using the
extra precision (by storing/reloading every single computation to a
64-bit value).  The former hardly affects performance, while preserving
the underlying hardware feature, while the latter greatly affects
performance.

I think most gcc/g77/g++ users would prefer the former.  Personally,
I don't have a strong opinion: I tend to prefer consistent behavior
across *all* GNU-supported machines (e.g. I'd like -mieee to be the
default on Alphas), but it seems that most of the industry prefers
the underlying machinery, and its performance capabilities, be *more*
exposed for FP, as compared to other things that GNU (and UNIX in general)
rightly hides behind a consistent, portable interface.  When it comes
to FP behavior, I tend to discount my own opinions in favor of what
others, with more experience, think, and the FP experts seem to have
already discounted different FP behavior across systems, but not
(yet) different FP behavior across function calls (which gcc gives them
today).

And, setting FP modes to 32 or 64 bit on the x86 is simply not a
feasible solution for gcc to offer as a default at this point in time.

Someday, when it can dynamically recognize the expected default
mode of any assembler code (including object files created before
this support, meaning default to extended, 80-bit mode), tag every
function, even every assembly-code snippet, with the mode requirements
(including "don't care" or "use caller" or whatever), emit those tags
for each snippet emitted by gcc, collect all the info on those tags at
link time, and optimize away re-settings of the FPU to the values
it will already have...*then* we can consider that as being a
worthwhile default, because it might actually not have miserable
performance.  (I'd like to see such linker optimizations include not
just FPU settings, but better decisions regarding where to allocate
temporaries, e.g. on the stack vs. on the heap, for example.)

But, for now, setting the FP mode is something we can really recommend
only to specific users to do on specific programs in specific cases
with, probably, some combination of a lot of work and finger-crossing,
since there's lots of underlying library code (libg2c, libm, who-
knows-what) that might or might not assume that the FP mode is in
the default, 80-bit state.  Any easy way out seems to be a *slow* way
out, and people don't generally want gcc to produce slow code.  Even
studying all the existing library code doesn't help, because there's
currently, AFAIK, no way to mark up the code (whether in assembler,
C, etc.) as "studied", "assumes 80-bit", "works in prevailing mode",
and so on, in a way a linker, and other tools, would recognize.

That's also why the store/reload-every-computation approach isn't
feasible.  Though it might make for more consistent IEEE behavior
across gcc targets, it still won't complete the job for at least
two reasons:

  -  Store/reload, in addition to being slow, doesn't really produce
     consistent IEEE behavior on x86 (same with setting the FP mode
     to 64 bits, apparently).  For example, it rounds twice, instead
     of once (not a problem for setting the FP mode, I think).

  -  Other gcc targets don't provide full IEEE behavior by default
     anyway (even those supporting the format, like Alphas, don't
     default to supporting the full range of the format, unless
     options like -mieee are used).

As Tim Hollebeek pointed out to me in private email, if my proposal
is adopted, it would reduce the number of cases where a change in
optimization level makes a substantial difference in the behavior
of FP code.  That is one of several reasonable conclusions resulting
from my general point that the current behavior of this so-called
"feature" is, for all intends and purposes, random: sometimes you get
it, sometimes you don't, and those time differentials can even
occur (in theory, at least) across different invocations of the
same function by the exact same code while running a single executable.

And, these reductions would take place in the most unpredictable "space"
of where gcc currently chops 80-bit results into 64-bit ones.  There'd
still be potential for such chopping, but they'd generally involve
constructs more visible to the programmer, in the source code (though
I have some reservations about just how acceptable even *this* is going
to be in the long run, which is why I haven't painted my proposal
as a rosy cure-all for peoples' FP problems).

I don't really mind if my proposal isn't adopted ASAP.  If I get to
where I think it has to be done ASAP, I'll try and do it myself (something
I considered long ago re the float-complex bugs, and essentially did
within g77 by implementing -femulate-complex).

But, it will be really sad if the consensus on my proposal *now*
becomes sufficiently negative that the pending rewrite of the ix86
machine description is done without any regard for my proposal, so
that the new description itself requires *another* rewrite down the
road if my proposal is ever to be adopted.

And that is really all I want to do *now* -- prevent the rewrite of
the x86 machine description from assuming the current behavior is,
and always will be, acceptable, because it almost certainly isn't,
and, at least, it is entirely reasonable to someday offer an option
to get the behavior I propose, just as we're already offering
-ffloat-store (which is almost useless, except for lucky people or
people accepting slow performance, on the x86, given what we now
know about gcc's handling of FP code).

(I don't think we'd be arguing about any of this if it were discovered
that gcc sometimes spilled 32-bit integers to 29-bit temporaries,
discarding ones in bits 28-30, even though it could be claimed that
programmers shouldn't *assume* they always get the full 32-bit range
from `int', since the language standard doesn't define `int' as 32
bits.  I bet lots of code would continue running just fine if we
made this change in gcc.  I bet *some* code wouldn't run just fine,
and would be very painful to fix -- especially code written and
debugged using other compilers -- and it'd be pretty silly to claim
there's no reason to make gcc accommodate that code because chopping
32-bit integers to 29-bit ones on a random basis is just another
thing with which good programmers are able to contend.  So I don't
understand why this reasoning is applied to FP results.  To make this
example more pertinent, assume gcc on some machines did 32-bit-int
computations in 64 bits, thus not overflowing/underflowing in cases
where it might normally, but that it randomly chopped the 64-bit
results back down to 32 bits.  That'd be even more nightmarish, even
if it affected less code overall.  Would we refuse to even provide
an *option* to get no chopping down of 64-bit intermediate results to
32 bits?)

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-13 10:49 ` Craig Burley
@ 1998-12-13 15:18   ` Stephen L Moshier
  1998-12-14  8:49     ` Craig Burley
  0 siblings, 1 reply; 38+ messages in thread
From: Stephen L Moshier @ 1998-12-13 15:18 UTC (permalink / raw)
  To: Craig Burley; +Cc: egcs

The Intel and 68k floating-point behavior has existed some fifteen
years and has never stopped anyone I know from writing good-quality,
portable programs.

We have some real problems that ought to have more priority.  An example
is that float complex does not work on alphas.  There is no workaround
for that bug.  It is hard to fix properly, and it is something that
needs fixing.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-13  6:19 Stephen L Moshier
@ 1998-12-13 10:49 ` Craig Burley
  1998-12-13 15:18   ` Stephen L Moshier
  0 siblings, 1 reply; 38+ messages in thread
From: Craig Burley @ 1998-12-13 10:49 UTC (permalink / raw)
  To: moshier; +Cc: burley

>Spilling of fp registers was very rare before the fforce-mem flag was
>turned by default.  In fact there was a compiler bug that would
>overflow the x87 register stack before any fp register actually got spilled.
>Running with -fno-force-mem will tend to relieve any actual pressure on fp
>register allocations.

That makes sense.  Note that my proposal attempts to address this
particular problem *completely* -- to make the spilling of FP
registers *never* change the values.  Reducing the problem to the point
where we could say "spilling tends to happen less" would probably
not be worthwhile to anyone in the numerical programming community.

For example, -fno-force-mem does not make any of the actual examples
we've been discussing start working.  My proposal does, because it
fixes the spills themselves, not the likelihood of whether they occur
in the first place.

>Compiler-generated temporaries are not the same thing as spilling.

Indeed.

>The ffloat-store switch usually will not work on them, as you can
>see by stepping through some compilations.

I wonder if we should come to some agreement on terminology when
discussing this issue.  I'd suggest:

  user-named variable: a variable named in the program being compiled,
  its data type being assigned either explicitly or implicitly.

  compiler-generated temporary: a variable invented by the compiler to
  represent an intermediate computation, such as the `a * b' in
  `d = a * b + c'.

  spill: relocation of a variable from one physical location (such as
  a hardware register) to another (such as memory) while the code is
  running, usually to make room in the first location for another
  variable.

-ffloat-store affects only user-named floating-point variables, by
making sure they aren't permitted to carry any more precision than
is normal for their data type.  The "store" and related terminology
(used in the documentation), misleads some people into thinking this
relates to registers and thus, somehow, compiler-generated temporaries
and/or spills.

-fforce-mem affects where the compiler physically places variables
(user-named and compiler-generated), and can thus affect spills.
Since -fno-force-mem does not force all such variables into memory,
it is not really strongly related to -ffloat-store, or to spilling.
That is, -fforce-mem forces memory operands (whatever that means!)
into *pseudo* registers, so the compiler can perform more optimizations
on them, but -fno-force-mem does not force operands *out* of pseudo
registers.

My proposal affects what happens to values that are spilled.  Ideally,
it would make spills never cause a change in value.

If my proposal is adopted, I think it'd render another proposed
option actually useful -- one that is like -ffloat-store, but applies
to *all* variables, compiler-generated as well as user-named.

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-13  6:19 Stephen L Moshier
  1998-12-13 10:49 ` Craig Burley
  0 siblings, 1 reply; 38+ messages in thread
From: Stephen L Moshier @ 1998-12-13  6:19 UTC (permalink / raw)
  To: burley, egcs

Spilling of fp registers was very rare before the -fforce-mem flag was
turned on by default.  In fact there was a compiler bug that would
overflow the x87 register stack before any fp register actually got spilled.
Running with -fno-force-mem will tend to relieve any actual pressure on fp
register allocations.

Compiler-generated temporaries are not the same thing as spilling.
The ffloat-store switch usually will not work on them, as you can
see by stepping through some compilations.




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
  1998-12-03  6:34 N8TM
@ 1998-12-04 15:23 ` Craig Burley
  0 siblings, 0 replies; 38+ messages in thread
From: Craig Burley @ 1998-12-04 15:23 UTC (permalink / raw)
  To: N8TM; +Cc: burley

>T: For a single-precision calculation, performing the register spills in
>double would provide enough extra precision, without significant impact on
>performance, if aligned storage can be used.

Perhaps, though I'm still concerned about *any* "visible" effects
on the results of computations resulting from internal compiler
decisions about whether and when to spill values.  So if spilling
80-bit values to 64-bit values cannot possibly change the resulting
calculations (of a reasonably working program of course!), fine,
otherwise, I'd rather have it played safe as a default.

>T: There's some uncertainty here, where the desire to maintain performance
>causes us to keep the extra precision, although the programmer might
>conceivably not want it.  In order to turn it off in a "fine-grained" manner,
>the programmer must program in a "float-store" which I do by invoking an
>external function which returns the rounded-off value (can't be in-lined).

Yup, and it'd be nice to offer more formal, documented facilities for
this sort of thing someday.

>T: I think what you are getting at is that it's usually acceptable for the
>results to be calculated in the declared precision; extra precision is usually
>desirable, but unpredictable combinations of extra precision and no extra
>precision may be disastrous.  See Kahan's writings about the quadratic
>formula.  Your proposal would make an improvement here.

That feedback is helpful, and does seem to reflect what I was trying to
say originally.  (I haven't seen Kahan's writings, or at least very little
of them, at this point.)

>>C: REAL*16 seems to be asked for fairly often.)
>
>T:  Probably by people who don't recognize how much of a performance hit the
>Intel processors will take going from REAL*10 to REAL*16.  If the
>Lahey/Fujitsu f95 compiler gets the alignment problems fixed so that
>REAL(kind=8) returns to good performance, I think this will become more
>evident.

I think a substantial portion of the audience asking for REAL*16 is
*non-Intel*.  SPARC and Alpha people come to mind.  I agree that those
who want enough extra precision to more reliably compute 64-bit results
from 64-bit inputs would likely prefer the faster, native support
provided by REAL*10 on Intel, and ideally "we" (g77/egcs/whatever) would
be able to provide REAL*10 somewhat faster than REAL*16 on other machines
as well, even though, unlike on Intels, the REAL*10 would be emulated.

>>C:  Probably.  But we're not even at 64-bit aligned storage for stack
> variables (which is where spills must happen, for the most part) yet,
> and IMO code that requires FP spills, on the x86 anyway, is probably
> not going to notice the lack of alignment due to its complexity.
>
>T:  I believe that i686-pc-linux-gnulibc1 is trying with some success to do
>aligned spills, and that that's the reason why -O2 is often faster running
>than -Os on that target, while -O2 is slower than -Os on the same code on the
>targets which don't have double alignments on the stack.

I don't think aligned spills happen reliably at all on any *released*
version of egcs or gcc yet (well, except maybe for old versions of
gcc patched with those big g77 patches that *seemed* to do most of the
aligned-double thing).  But it looks like egcs 1.2 or 1.3 will align
doubles on the stack, covering spills, at or near a rock-solid level
of reliability.

>T: The improvement in accuracy depends on getting extended precision results
>from built-in math functions, so it would require a math-inline option as well
>as the 80-bit register spills.  I don't know whether it can be done
>effectively, say, by taking care to make the math-inline headers of libc6
>more reliable.

That's definitely off my radar at the moment, but, certainly, if the
compiler decides to call library (or inline) functions for constructs
not explicitly, in the code, involving such calls, and those functions
are not 80-bit, the result might indeed be similar to spilling to 64-bit
values in that the programmer doesn't expect a sudden loss of precision
there.

I'm thinking, for example, of complex divides, which g77 implements
by avoiding the back end's version and going straight for c_div (or
whatever) in libF77, to support a larger domain of inputs with
greater accuracy.

Though, in this example, the loss of precision is a bit easier to
predict: it currently happens for complex divides.  Someday, though,
we might decide to have it apply to complex multiplies, and/or it
might be desirable to have the compiler choose, based on data less
visible than the source code, to make a call rather than in-line the code.
It's important to preserve the precision in such cases.
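[ Editor's note: the reason the library route wins for complex division
can be seen in a small sketch (mine; libF77's c_div uses a scaling of
this general kind, though this is not its literal source): ]

```c
/* Why a library routine beats the textbook formula for complex
   division: (a+bi)/(c+di) = ((ac+bd) + (bc-ad)i) / (c^2 + d^2)
   overflows whenever c^2 + d^2 does, even though the quotient itself
   is perfectly ordinary.  Scaling by the larger of |c|, |d| (Smith's
   method) keeps every intermediate in range. */
#include <math.h>

typedef struct { double re, im; } cplx;

cplx cdiv_naive(cplx x, cplx y) {
    double den = y.re * y.re + y.im * y.im;       /* can overflow */
    cplx r = { (x.re * y.re + x.im * y.im) / den,
               (x.im * y.re - x.re * y.im) / den };
    return r;
}

cplx cdiv_smith(cplx x, cplx y) {
    cplx r;
    if (fabs(y.re) >= fabs(y.im)) {
        double t = y.im / y.re, den = y.re + y.im * t;
        r.re = (x.re + x.im * t) / den;
        r.im = (x.im - x.re * t) / den;
    } else {
        double t = y.re / y.im, den = y.im + y.re * t;
        r.re = (x.re * t + x.im) / den;
        r.im = (x.im * t - x.re) / den;
    }
    return r;
}
```

For x = 1e200 and y = 1e200 + 1e200i, the naive denominator overflows
to infinity and the quotient degenerates to NaN, while the scaled form
returns the correct 0.5 - 0.5i.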

(I think most of the above was hand-waved by me, originally, when
I said something like "There's probably *lots* of things not quite
right with how the Intel does floating point", but if I didn't include
egcs/gcc/g77 along with the Intel as at least *possible* culprits, I
should have.)

>T:  That might be too much to expect.  It's true that there could be
>situations where adding code might cause a named variable to be spilled to its
>declared precision where a simpler version used extended precision, but I
>doubt it's feasible to prevent that.  I'll suggest a less ambitious goal:
>that the recognition of common sub-expressions should not lead to reduced
>precision:
>
>	a = b*c + d*e
>	f = d*e*g + h
>
>If the compiler decides to treat d*e as a common sub-expression, in order to
>save an operation, but then finds that this expression needs to spill, that
>spill and restore should be full precision.  Otherwise, we get back to the
>unpredictable situations.

Nothing about the above sounds wrong to me, but I don't really know enough
to say for sure whether I think it all makes sense, I'm afraid.

> >C: P.S. Most, if not all of this, is the result of widespread disagreement
> over what a simple type declaration like `REAL*8 A' or `double a;' really
> means.  The simple view is "it means that the variable must be capable
> of holding the specified precision", but so many people really expect
> it to mean so much more, in terms of whether operations on the variable
> may, might, or must involve more precision, etc.  And, since the
> predominant languages give those people no straightforward way to express
> what they *do* really want, how surprising is it that they "overload" the
> "simple" view of what a type definition really means?
>  >>
>
>T: This is getting off-topic.  I might think that f90 declarations like
>
>	real(selected_real_kind(15)) :: a
>	real(selected_real_kind(18)) :: b
>
>could allow the programmer to express intent in more detail while retaining
>portability, but I don't think any existing compilers implement this in a
>useful way.

It is a bit off-topic in this particular thread, but I was pointing out
a general rule about how we design languages and features.  If we don't
give users an *explicit* way to say *exactly* what they mean, they will
tend to discover ways to *effect* that meaning and, not only that, will
grow to define those ways as *meaning* what we didn't give them in
the first place.  That's usually a problem, because those ways usually
mean something *else* as well, and separating those meanings, once
established in the minds of the user base, becomes very difficult.

We should remember this when designing new features, new option names,
and so on.  Either focus on the *intent* consistently, or the
*implementation* consistently, but don't mush them together (as
was incorrectly done when naming the `-ffloat-store' option, IMO),
for example.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86
@ 1998-12-03  6:34 N8TM
  1998-12-04 15:23 ` Craig Burley
  0 siblings, 1 reply; 38+ messages in thread
From: N8TM @ 1998-12-03  6:34 UTC (permalink / raw)
  To: tprince, burley, egcs

In a message dated 12/2/98 burley@gnu.org writes:
C: Craig Burley
T: Tim
<<C: the loads/stores involving the variables themselves would be
 single-precision, but the operations are done in, or produce results
 in, extended (80-bit) precision.  These should, according to my
 proposal, be *spilled* as 80-bit, not 64-bit or 32-bit, values,
 though when written to destinations (user-named variables), they'd
 then (normally) be chopped down to size, per -ffloat-store and
 what-not.
 
T: For a single-precision calculation, performing the register spills in
double would provide enough extra precision, without significant impact on
performance, if aligned storage can be used. Certainly, 80-bit spills would be
fine if they didn't impact performance.  This is like going back to the old
days of the GE600/Honeywell6000 architecture, where the floating point
register was 80 bits wide (only 8 bits for the exponent!) but there was no
efficient way to spill the full register width, nor would there have been much
use for it, considering how much of the extra precision was lost due to
underflows.

 >>C:In other words, the default for x86 code generation should
 >>apparently be that, when the compiler generates an intermediate result,
 >>it *always* uses maximum available precision for that result, even
 >>if it has to spill the result to memory.  (I *think* it can do this while
 >>obeying the current FP mode, but don't have time to check right
 >>now.)
 >>[...]
 >
 >T: In the case where e is used in a subsequent calculation, we
 >don't want to force a store and reload unless -ffloat-store is
 >invoked.
 
 >C: Correct, AFAIK.

T: There's some uncertainty here, where the desire to maintain performance
causes us to keep the extra precision, although the programmer might
conceivably not want it.  In order to turn it off in a "fine-grained" manner,
the programmer must program in a "float-store" which I do by invoking an
external function which returns the rounded-off value (can't be in-lined).
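[ Editor's note: Tim's trick of forcing a value through an external,
non-inlinable function so that it is rounded back to declared precision
can be sketched as follows; the `volatile` store is my stand-in for his
out-of-line function, not his actual code. ]

```c
/* A "manual float-store": passing one value through a memory location
   rounds it to its declared precision even if the surrounding
   expression is being evaluated in 80-bit registers.  The volatile
   qualifier keeps the optimizer from eliding the store and reload,
   mimicking a call the compiler cannot inline. */
double round_to_declared(double x) {
    volatile double tmp = x;   /* store chops any excess precision */
    return tmp;
}
```

In an expression like `e = round_to_declared(a*b) + c`, the product is
then guaranteed to enter the addition at exactly double precision: the
fine-grained control that -ffloat-store provides only wholesale.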
 
 >T: But I'm not sure you can always apply the same rules to
 >storage to a named variable (it might be stored in a structure or
 >COMMON block) as to register spills, which aren't visible in the
 >source code.
 
>C:  No, I don't think you can, and that's what my proposal and email
 were trying to clarify (less than successfully, I gather!).
 
>C: That is, I was trying to focus my proposal on only the compiler-
 generated temporaries that get spilled and chopped down to "size"
 at the same time.
 
 >T: This is a more
 >difficult question to solve and I'm confused about what
 >connection you are making between that and the spilled
 >temporaries.
 
>C:  In my proposal, essentially none, except that it used to confuse me,
 and I believe it still confuses others, that there are pretty bright-
 line distinctions between compiler-generated temporaries and user-named
 variables, in terms of precisions the compiler is, or should be,
 permitted to employ for each class.  (But not all the distinctions
 are so clear, it seems.)
 
 
>C:  With compiler-generated temporaries, it is, again, helpful or hurtful,
 and normally permitted, for the compiler to employ *more* than the
 implicit precision of the operation, but the problem with the gcc
 back end, on the x86 at least, is that it (apparently) sometimes
 employs *less*, specifically, when spilling those temporaries.  (That
 is, when the temporary needs to be copied from the register in which
 it "lives" to a memory location, the gcc back end apparently is
 happy to chop the temporary down to fit into a smaller memory location.)
 
 >C: My proposal deals only with this latter deficiency (as I now think it
 is), that is, it recommends that precision *reduction* of compiler-
 generated temporaries no longer happen (at least not by default).
 
  
>C:  -  The compiler provides no way to "force" available excess precision
      to be reliably used for programmer-named variables anyplace that
      is possible (say, within a module).  Some compilers offer explicit
      extended type declarations (REAL*10 in Fortran; `long double' in C?),
      but g77 doesn't yet.  So, whether a named variable carries the
      (possible) excess precision of its computed value into subsequent
      calculations is at the whim of the compiler's optimization phases.
 
T: I think what you are getting at is that it's usually acceptable for the
results to be calculated in the declared precision; extra precision is usually
desirable, but unpredictable combinations of extra precision and no extra
precision may be disastrous.  See Kahan's writings about the quadratic
formula.  Your proposal would make an improvement here.
  
>C: REAL*16 seems to be asked for fairly often.)

T:  Probably by people who don't recognize how much of a performance hit the
Intel processors will take going from REAL*10 to REAL*16.  If the
Lahey/Fujitsu f95 compiler gets the alignment problems fixed so that
REAL(kind=8) returns to good performance, I think this will become more
evident.
 

 >T: I suspect the 96 bits must be written to a 128-bit aligned storage
 >location to minimize the performance hit.
 
>C:  Probably.  But we're not even at 64-bit aligned storage for stack
 variables (which is where spills must happen, for the most part) yet,
 and IMO code that requires FP spills, on the x86 anyway, is probably
 not going to notice the lack of alignment due to its complexity.

T:  I believe that i686-pc-linux-gnulibc1 is trying with some success to do
aligned spills, and that that's the reason why -O2 is often faster running
than -Os on that target, while -O2 is slower than -Os on the same code on the
targets which don't have double alignments on the stack.
 
 
 >T: If someone does manage to implement this, I would like to study
 >the effect on the complex math functions of libF77, using Cody's
 >CELEFUNT test suite.  I have demonstrated already that the
 >extended double facility shows to good advantage in the double
 >complex functions.  The single complex functions already
 >accomplish what we are talking about by using double
 >declarations for all locals, and that gives them a big advantage
 >over certain vendors' libraries.
 
>C:  Right now, my impression is that the effect would be nil *unless*
 these codes are complicated enough to cause spills of temporaries
 in the first place.

T: The improvement in accuracy depends on getting extended precision results
from built-in math functions, so it would require a math-inline option as well
as the 80-bit register spills.  I don't know whether it can be done
effectively, say, by taking care to make the math-inline headers of libc6
more reliable.
 
 
>C:  First, the main goal of my proposal is to reduce unpredictable loss
 of precision on machines like x86, where programmers should be
 aware their code will often employ extended precision (and thus might
 depend on it).
 
>C:  However, if -ffloat-store is not used, then perhaps this reduction
 would not be complete, and would lead to rarer, yet even more obscure and
 hard-to-find, bugs, unless we indeed make sure that even spills of
 named variables never chop the values of those variables (which
 might be in extended precision).

T:  That might be too much to expect.  It's true that there could be
situations where adding code might cause a named variable to be spilled to its
declared precision where a simpler version used extended precision, but I
doubt it's feasible to prevent that.  I'll suggest a less ambitious goal:
that the recognition of common sub-expressions should not lead to reduced
precision:

	a = b*c + d*e
	f = d*e*g + h

If the compiler decides to treat d*e as a common sub-expression, in order to
save an operation, but then finds that this expression needs to spill, that
spill and restore should be full precision.  Otherwise, we get back to the
unpredictable situations.
 
 
         tq vm, (burley)
 
 >C: P.S. Most, if not all of this, is the result of widespread disagreement
 over what a simple type declaration like `REAL*8 A' or `double a;' really
 means.  The simple view is "it means that the variable must be capable
 of holding the specified precision", but so many people really expect
 it to mean so much more, in terms of whether operations on the variable
 may, might, or must involve more precision, etc.  And, since the
 predominant languages give those people no straightforward way to express
 what they *do* really want, how surprising is it that they "overload" the
 "simple" view of what a type definition really means?
  >>

T: This is getting off-topic.  I might think that f90 declarations like

	real(selected_real_kind(15)) :: a
	real(selected_real_kind(18)) :: b

could allow the programmer to express intent in more detail while retaining
portability, but I don't think any existing compilers implement this in a
useful way.

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~1998-12-22 13:30 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1998-12-19 13:00 FLOATING-POINT CONSISTENCY, -FFLOAT-STORE, AND X86 N8TM
  -- strict thread matches above, loose matches on Subject: below --
1998-12-22 13:30 Toon Moene
1998-12-22 11:07 John Wehle
1998-12-21 23:30 N8TM
1998-12-19 15:17 Geert Bosch
1998-12-20  8:09 ` Toon Moene
1998-12-22  4:17 ` Dave Love
1998-12-19 14:26 N8TM
1998-12-19 14:23 N8TM
1998-12-20 13:51 ` Marc Lehmann
1998-12-20 13:52   ` Marc Lehmann
1998-12-19  9:05 N8TM
1998-12-19 12:39 ` Toon Moene
1998-12-19 14:42   ` Dave Love
1998-12-18 23:07 N8TM
1998-12-19 13:39 ` Marc Lehmann
1998-12-18 21:58 N8TM
1998-12-18 22:36 ` Richard Henderson
1998-12-19 13:41   ` Marc Lehmann
1998-12-17  1:43 N8TM
1998-12-17 12:35 ` Marc Lehmann
1998-12-18 12:14   ` Dave Love
1998-12-18 14:25     ` Gerald Pfeifer
1998-12-19 13:50       ` Dave Love
1998-12-18 18:37     ` Marc Lehmann
1998-12-19 14:03       ` Dave Love
1998-12-16  6:10 N8TM
1998-12-15  0:05 N8TM
1998-12-15 10:01 ` Joe Buck
1998-12-13  6:19 Stephen L Moshier
1998-12-13 10:49 ` Craig Burley
1998-12-13 15:18   ` Stephen L Moshier
1998-12-14  8:49     ` Craig Burley
1998-12-14  9:25       ` Joe Buck
1998-12-14 14:30         ` Edward Jason Riedy
1998-12-15  0:04         ` Craig Burley
1998-12-03  6:34 N8TM
1998-12-04 15:23 ` Craig Burley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).