From mboxrd@z Thu Jan 1 00:00:00 1970
From: tprince@cat.e-mail.com
To: egcs@cygnus.com, tprince@computer.org
Subject: /internet
Date: Fri, 04 Dec 1998 17:41:00 -0000
Message-id: <4.19981204.20.41.11.410443@cat.e-mail.com>
X-SW-Source: 1998-12/msg00139.html

>T: I think what you are getting at is that it's usually acceptable for
>the results to be calculated in the declared precision; extra precision
>is usually desirable, but unpredictable combinations of extra precision
>and no extra precision may be disastrous.  See Kahan's writings about
>the quadratic formula.  Your proposal would make an improvement here.

>C: That feedback is helpful, and does seem to reflect what I was trying
>to say originally.  (I haven't seen Kahan's writings, or at least very
>little of them, at this point.)

T: Look at http://http.cs.berkeley.edu/~wkahan/ieee754status/ieee754.ps

I figured out how to do his quadratic algorithm in C with volatiles on
the R8K and PowerPC (neither of which I use any more), but it needed
function calls to do it in g77.  If someone gets the fused MACs going
for hppa2.0, those issues will come up there.  It really is possible to
find these fused MACs introducing NaNs in a program which works
correctly otherwise, and I've not been able to track them down in any
program bigger than that quadratic formula thing.  The R10K has a
gentler form of fused MAC, whose behavior has been made generally the
same as that of individual IEEE-754 compliant operations.

Kahan didn't analyze what might happen with unfavorable combinations of
two rounding modes on an Intel-like processor; I suspect it would not be
good news.  Has anyone looked into this?

>>C: I think a substantial portion of the audience asking for REAL*16 is
>>*non-Intel*.  SPARC and Alpha people come to mind.
>>I agree that those who want enough extra precision to more reliably
>>compute 64-bit results from 64-bit inputs would likely prefer the
>>faster, native support provided by REAL*10 on Intel, and ideally "we"
>>(g77/egcs/whatever) would be able to provide REAL*10 somewhat faster
>>than REAL*16 on other machines as well, even though, unlike on Intels,
>>the REAL*10 would be emulated.

T: There are 2 major varieties of REAL*16.  The one which HP (and, I
believe, Sun, Lahey, and DEC) use is the more accurate and slower one,
which conforms nominally to IEEE P854 and has roughly the same exponent
range as the Intel REAL*10.  SGI and IBM use a faster version, which is
facilitated by the fused multiply-accumulate instructions; it has
roughly 6 fewer bits of precision, a range less than that of double
precision, and doesn't conform to IEEE P854.

T: In both the HP and SGI libraries, the math functions give up accuracy
so as not to lose as much speed, so it is possible in either case to
wind up with little more accuracy than you would get with a carefully
implemented REAL*10.  I don't know about the other vendors' libraries.
On the Pentiums, some of the math functions inherently take advantage of
the full precision (log() but not log10(), sqrt(), sin()/cos(), tan(),
atan()), while a few require more of the style of programming found in
non-Intel math libraries, but with asm() mixed in, putting the proper
usage of clobbers to the test.

>>C: I don't think aligned spills happen reliably at all on any
>>*released* version of egcs or gcc yet (well, except maybe for old
>>versions of gcc patched with those big g77 patches that *seemed* to do
>>most of the aligned-double thing).  But it looks like egcs 1.2 or 1.3
>>will align doubles on the stack, covering spills, at or near a
>>rock-solid level of reliability.

T: Treatment of spills in general seems to be one area where gnu has
some room for improvement, in comparison to commercial compilers,
particularly for Intel.
I'm amazed that Lahey lost track of their alignments for lf95, but they
otherwise seem able to avoid spill performance problems.

>>C: if the compiler decides to call library (or inline) functions for
>>constructs not explicitly, in the code, involving such calls, and
>>those functions are not 80-bit, the result might indeed be similar to
>>spilling to 64-bit values in that the programmer doesn't expect a
>>sudden loss of precision there.

>>C: I'm thinking, for example, of complex divides, which g77 implements
>>by avoiding the back end's version and going straight for c_div (or
>>whatever) in libF77, to support a larger domain of inputs with greater
>>accuracy.

T: There, of course, the straightforward use of extended precision takes
care of the situation more effectively, where special-case coding is
needed otherwise.  But that can be done by using conditional compilation
inside c_div, according to whether the target architecture has a long
double of greater precision and range than double.

>>C: Though, in this example, the loss of precision is a bit easier to
>>predict: it currently happens for complex divides.  Someday, though,
>>we might decide to have it apply to complex multiplies, and/or it
>>might be desirable to have the compiler choose, based on less visible
>>data (than the source code), to do a call rather than in-line the
>>code.  It's important to preserve the precision in such cases.

T: It's more a question of avoiding unexpected exceptions.  The overhead
of the function call is not a serious matter for c_div, but it could be
for multiplication.  I looked up some of the implementations when you
brought this up over a year ago, and the only one I found which takes
special precautions on complex multiplication was VAX/VMS.  It's needed
more on VAX floating point, as even with the precautions, the range of
working operands is less than with IEEE floating point and no special
precautions.

Dr. Timothy C.
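The conditional-compilation idea for c_div might look like the sketch
below (the function name and layout are mine, not libF77's; the only
assumptions are the standard <float.h> macros).  Where long double has
greater range and precision than double, the naive quotient formula
computed in long double avoids the premature overflow/underflow of
|b|^2; otherwise the Smith-style scaled form is the special-case coding
referred to above.

```c
#include <float.h>
#include <math.h>

/* Complex divide (ar + i*ai) / (br + i*bi), result in *cr, *ci. */
static void c_div_sketch(double ar, double ai, double br, double bi,
                         double *cr, double *ci)
{
#if LDBL_MAX_EXP > DBL_MAX_EXP && LDBL_MANT_DIG > DBL_MANT_DIG
    /* Extended range and precision available: the straightforward
     * formula, with |b|^2 held in long double so it cannot overflow
     * or underflow prematurely for double operands. */
    long double d = (long double)br * br + (long double)bi * bi;
    *cr = (double)(((long double)ar * br + (long double)ai * bi) / d);
    *ci = (double)(((long double)ai * br - (long double)ar * bi) / d);
#else
    /* Smith's scaling: divide through by the larger of |br|, |bi|
     * so intermediate magnitudes stay near the result's. */
    if (fabs(br) >= fabs(bi)) {
        double t = bi / br, d = br + bi * t;
        *cr = (ar + ai * t) / d;
        *ci = (ai - ar * t) / d;
    } else {
        double t = br / bi, d = bi + br * t;
        *cr = (ar * t + ai) / d;
        *ci = (ai * t - ar) / d;
    }
#endif
}
```

Either branch gives the same answers on benign inputs; the difference
shows up only near the overflow and underflow thresholds, which is where
the unexpected exceptions mentioned above would otherwise occur.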
Prince
Consulting Engineer
Solar Turbines, a Caterpillar Company
alternate e-mail: tprince@computer.org
To: INTERNET - IBMMAIL N3356140 - IBMMAIL