public inbox for gcc-bugs@sourceware.org
* [Bug c/32180]  New: Paranoia UCB GSL TestFloat libm tests fail - accuracy of recent gcc math poor
@ 2007-06-01 15:56 rob1weld at aol dot com
From: rob1weld at aol dot com @ 2007-06-01 15:56 UTC
  To: gcc-bugs

GCC 4.3.0 compiled on Linux does NOT pass as many tests as GCC 3.4.4 for
Cygwin.

How seriously will people take _newer_ versions of gcc if they can't pass the
same tests that older versions did? Something has slipped over the years.

Now that I have a great compiler, I decided to run some tests to see how well
it works. I've done the usual "make -i check" tests and submitted the results.

I decided to try some math library tests available on the internet.

These tests are designed to catch flaws, not to check trivial math operations.
They were written by people whose life's work is mathematics.


My Cygwin gcc 3.4.4 passed "almost" all these tests; my Linux compilers did
not. I did not build _all_ the Linux versions of gcc myself; I only built
the 4.2.0 and 4.3.0 versions. Please obtain and build these tests yourselves.

I did not run every single test on every possible version, but I did run the
shortest test on every gcc I have. I am satisfied that gcc 4.3.0 does not pass
all the tests that I tried and that Cygwin gcc 3.4.4 passed almost
every test I tried.


Here are some of my notes:


Platform           GCC Version                    Output File Name
i686-pc-cygwin     gcc 3.4.4 release              paranoia_3.4.4_release-cygwin.txt
i686-pc-linux-gnu  gcc 4.2.0 20070501             paranoia_4.2.0_20070501-linux.txt
i686-pc-linux-gnu  gcc 4.3.0 20070529             paranoia_4.3.0_20070529-linux.txt
i486-linux         gcc 3.3.5 (Debian 1:3.3.5-13)  paranoia_3.3.5_Debian-1:3.3.5-13-linux.txt
i486-linux-gnu     gcc 3.4.6 (Debian 3.4.6-5)     paranoia_3.4.6_Debian-3.4.6-5-linux.txt
i486-linux-gnu     gcc 4.1.2 (Debian 4.1.1-21)    paranoia_4.1.2_Debian-4.1.1-21-linux.txt


All of these diffs produce no output; the Linux results are all identical.

# diff -Naur paranoia_4.2.0_20070501-linux.txt paranoia_4.3.0_20070529-linux.txt
# diff -Naur paranoia_4.2.0_20070501-linux.txt paranoia_3.3.5_Debian-1:3.3.5-13-linux.txt
# diff -Naur paranoia_4.2.0_20070501-linux.txt paranoia_3.4.6_Debian-3.4.6-5-linux.txt
# diff -Naur paranoia_4.2.0_20070501-linux.txt paranoia_4.1.2_Debian-4.1.1-21-linux.txt
#


Here is the diff for Cygwin vs. Linux:

# diff -Naur paranoia_3.4.4_release-cygwin.txt paranoia_4.3.0_20070529-linux.txt
--- paranoia_3.4.4_release-cygwin.txt   2007-05-31 11:02:18.000000000 -0700
+++ paranoia_4.3.0_20070529-linux.txt   2007-05-31 11:04:38.000000000 -0700
@@ -127,7 +127,8 @@
 Test for sqrt monotonicity.
 sqrt has passed a test for Monotonicity.
 Testing whether sqrt is rounded or chopped.
-Square root appears to be correctly rounded.
+Square root is neither chopped nor correctly rounded.
+Observed errors run from -5.0000000e-01 to 5.0000000e-01 ulps.

 To continue, press RETURN
 Diagnosis resumes after milestone Number 90          Page: 7
@@ -152,7 +153,11 @@
 This computed value is O.K.

 Testing X^((X + 1) / (X - 1)) vs. exp(2) = 7.38905609893065218e+00 as X -> 1.
-Accuracy seems adequate.
+DEFECT:  Calculated 7.38905609548934539e+00 for
+       (1 + (-1.11022302462515654e-16) ^ (-1.80143985094819840e+16);
+       differs from correct value by -3.44130679508225512e-09 .
+       This much error may spoil financial
+       calculations involving tiny interest rates.
 Testing powers Z^Q at four nearly extreme values.
  ... no discrepancies found.

@@ -188,7 +193,9 @@
 Diagnosis resumes after milestone Number 220          Page: 10


+The number of  DEFECTs  discovered =         1.
 The number of  FLAWs  discovered =           1.

-The arithmetic diagnosed seems Satisfactory though flawed.
+The arithmetic diagnosed may be Acceptable
+despite inconvenient Defects.
 END OF TEST.


Both Cygwin's gcc 3.4.4 and Linux gcc 3.3.5, 3.4.6, 4.1.2, 4.2.0 and 4.3.0
lack a sticky bit, which the test counts as a "flaw"; the overall result
may still be "Satisfactory".


Checking rounding on multiply, divide and add/subtract.
* is neither chopped nor correctly rounded.
/ is neither chopped nor correctly rounded.
Addition/Subtraction neither rounds nor chops.
Sticky bit used incorrectly or not at all.
FLAW:  lack(s) of guard digits or failure(s) to correctly round or chop
(noted above) count as one flaw in the final tally below.


Only the Linux gcc compilers, and not Cygwin's 3.4.4 (release) version, are off
on one of the calculations, by -3.44130679508225512e-09. That is not much to
some people; for others it is a lot. This led me to more testing.



Here is another math test - http://www.netlib.org/fp/ucbtest.tgz :

# ucbREADME/linux.sh

Total   60 tests:  pass   59,  flags err    0,  value err   1,     acosd
Total  352 tests:  pass  352,  flags err    0,  value err   0,     addd
Total   77 tests:  pass   77,  flags err    0,  value err   0,     asind
Total  104 tests:  pass  104,  flags err    0,  value err   0,     atan2d
Total   57 tests:  pass   57,  flags err    0,  value err   0,     atand
Total  126 tests:  pass  126,  flags err    0,  value err   0,     cabsd
Total   99 tests:  pass   99,  flags err    0,  value err   0,     ceild
Total   53 tests:  pass   53,  flags err    0,  value err   0,     cosd
Total   68 tests:  pass   56,  flags err    0,  value err  12,     coshd
Total  383 tests:  pass  383,  flags err    0,  value err   0,     divd
Total   97 tests:  pass   86,  flags err    0,  value err  11,     expd
Total   37 tests:  pass   37,  flags err    0,  value err   0,     fabsd
Total  103 tests:  pass  103,  flags err    0,  value err   0,     floord
Total  352 tests:  pass  352,  flags err    0,  value err   0,     fmodd
Total  126 tests:  pass  126,  flags err    0,  value err   0,     hypotd
Total   89 tests:  pass   89,  flags err    0,  value err   0,     log10d
Total   83 tests:  pass   83,  flags err    0,  value err   0,     logd
Total  340 tests:  pass  340,  flags err    0,  value err   0,     muld
Total 1543 tests:  pass 1505,  flags err    0,  value err  38,     powd
Total   52 tests:  pass   52,  flags err    0,  value err   0,     sind
Total   72 tests:  pass   66,  flags err    0,  value err   6,     sinhd
Total  102 tests:  pass  102,  flags err    0,  value err   0,     sqrtd
Total  321 tests:  pass  321,  flags err    0,  value err   0,     subd
Total   54 tests:  pass   54,  flags err    0,  value err   0,     tand
Total   72 tests:  pass   68,  flags err    0,  value err   4,     tanhd


That doesn't happen with Cygwin's gcc 3.4.4.



Here are the TestFloat / SoftFloat tests:

TestFloat is a program for testing whether a computer's floating-point conforms
to the IEC/IEEE Standard for Binary Floating-point Arithmetic. TestFloat works
by comparing the behavior of the machine's floating-point with that of the
SoftFloat software implementation of floating-point. Any differences found are
reported as probable errors in the machine's floating-point. 

http://www.jhauser.us/arithmetic/TestFloat.html

TestFloat and SoftFloat source files
compress'ed tar archive, TestFloat-2a.tar.Z [150 kB].  
http://www.jhauser.us/arithmetic/TestFloat-2a.tar.Z
compress'ed tar archive, SoftFloat-2b.tar.Z [165 kB]. 
http://www.jhauser.us/arithmetic/SoftFloat-2b.tar.Z



Here are the results of only two of the tests:

# ./testsoftfloat -level 1 -errors 2000000 int32_to_float32 float32_add > /dev/null
Testing float32_add, rounding nearest_even.
46464 tests total.
46464 tests performed; 39069 errors found.
Testing float32_add, rounding to_zero.
46464 tests total.
46464 tests performed; 39099 errors found.
Testing float32_add, rounding down.
46464 tests total.
46464 tests performed; 39213 errors found.
Testing float32_add, rounding up.
46464 tests total.
46464 tests performed; 39150 errors found.

# ./testsoftfloat -level 1 -errors 2000000 int32_to_float32 float32_eq > /dev/null
Testing float32_eq.
46464 tests total.
46464 tests performed; 1321 errors found.



Finally I tested GSL - GNU Scientific Library
http://www.gnu.org/software/gsl

The GNU Scientific Library (GSL) is a numerical library for C and C++
programmers. The current version is GSL-1.9. It was released on 21 February
2007. This is a stable release.
ftp://ftp.gnu.org/gnu/gsl/
02/20/2007 03:36PM      2,574,939 gsl-1.9.tar.gz
ftp://ftp.gnu.org/gnu/gsl/gsl-1.9.tar.gz

When compiled and checked, GSL works flawlessly under Cygwin gcc 3.4.4 but
fails both the interpolation and sort tests under gcc 4.3.0 20070529.


WinXP - i686-pc-cygwin:
$ cd /cygdrive/c/gsl-1.9
$ make -i -k check 2>&1 | tee check_1_log.txt
$ grep -B 2 -A 2 fail check_1_log.txt
$ (Prints NOTHING)


Debian - i686-pc-linux-gnu:
# cd /root/downloads/gsl-1.9
# make -i -k check 2>&1 | tee check_1_log.txt
# grep -B 2 -A 2 fail check_1_log.txt
FAIL: test
===================
1 of 1 tests failed
===================
make[2]: [check-TESTS] Error 1 (ignored)
--
FAIL: test
===================
1 of 1 tests failed
===================
make[2]: [check-TESTS] Error 1 (ignored)



Cygwin:
make[2]: Entering directory `/cygdrive/c/gsl-1.9/interpolation'
Completed [1100/1100]
PASS: test.exe
==================
All 1 tests passed
==================
make[2]: Leaving directory `/cygdrive/c/gsl-1.9/interpolation'


Cygwin:
make[2]: Entering directory `/cygdrive/c/gsl-1.9/sort'
Completed [21600/21600]
PASS: test.exe
==================
All 1 tests passed
==================
make[2]: Leaving directory `/cygdrive/c/gsl-1.9/sort'


Linux:
make[2]: Entering directory `/root/downloads/gsl-1.9/interpolation'
FAIL: gsl_interp_eval_e linear [7]
FAIL: gsl_interp_eval_deriv_e linear [8]
FAIL: linear deriv 0 (0 observed vs 5.30544087554984718e-315 expected) [test uses subnormal value] [11]
FAIL: linear integ 0 (0 observed vs 2.19361877441406383 expected) [12]
(Over 900 lines of FAIL:)
FAIL: cspline-periodic 60 (4.99961591105934785e-270 observed vs 6.8941659943411544 expected) [1072]
FAIL: cspline-periodic deriv 60 (0 observed vs 7.16157787531728253e-313 expected) [test uses subnormal value] [1073]
FAIL: cspline-periodic integ 60 (0 observed vs 6.89057922363291553 expected) [1074]
FAIL: cspline periodic 3pt interpolation [1075]
FAIL: test
===================
1 of 1 tests failed
===================
make[2]: [check-TESTS] Error 1 (ignored)
make[2]: Leaving directory `/root/downloads/gsl-1.9/interpolation'


Linux:
make[2]: Entering directory `/root/downloads/gsl-1.9/sort'
FAIL: indexing gsl_vector_char, n = 128, stride = 1, ordered [19999]
FAIL: sorting, gsl_vector_char, n = 128, stride = 1, ordered [20000]
FAIL: smallest, gsl_vector_char, n = 128, stride = 1, ordered [20001]
FAIL: largest, gsl_vector_char, n = 128, stride = 1, ordered [20002]
(Over 120 lines of FAIL:)
FAIL: sorting, gsl_vector_char, n = 512, stride = 3, randomized [21596]
FAIL: smallest, gsl_vector_char, n = 512, stride = 3, randomized [21597]
FAIL: largest, gsl_vector_char, n = 512, stride = 3, randomized [21598]
FAIL: test
===================
1 of 1 tests failed
===================
make[2]: [check-TESTS] Error 1 (ignored)
make[2]: Leaving directory `/root/downloads/gsl-1.9/sort'



GCC needs a Ph.D. in math to give it a once-over. If an old version of gcc can
pass these tests, there is an error preventing newer versions from passing.


DRAFT Standard for Floating-Point Arithmetic P754 - Draft 1.3.0
Modified at 17:15 GMT on February 23, 2007
http://www.validlab.com/754R/drafts/archive/2007-02-23.pdf


-- 
           Summary: Paranoia UCB GSL TestFloat libm tests fail - accuracy of
                    recent gcc math poor
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rob1weld at aol dot com
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32180



end of thread, other threads:[~2010-04-12  1:54 UTC | newest]

Thread overview: 27+ messages
2007-06-01 15:56 [Bug c/32180] New: Paranoia UCB GSL TestFloat libm tests fail - accuracy of recent gcc math poor rob1weld at aol dot com
2007-06-01 16:16 ` [Bug target/32180] " pinskia at gcc dot gnu dot org
2007-06-03 13:16 ` rob1weld at aol dot com
2007-06-03 15:15 ` rob1weld at aol dot com
2007-06-03 16:05 ` rob1weld at aol dot com
2007-06-03 16:35 ` kargl at gcc dot gnu dot org
2007-06-03 21:12 ` jvdelisle at gcc dot gnu dot org
2007-06-04  8:58 ` rob1weld at aol dot com
2007-06-04  9:07 ` rob1weld at aol dot com
2007-06-04  9:16 ` rob1weld at aol dot com
2007-06-04 17:32 ` kargl at gcc dot gnu dot org
2007-06-05 17:22 ` rob1weld at aol dot com
2007-06-07 13:42 ` rob1weld at aol dot com
2007-06-07 13:49 ` rob1weld at aol dot com
2007-06-08  0:23 ` rob1weld at aol dot com
2007-06-12 17:50 ` rob1weld at aol dot com
2007-06-13  7:53 ` rob1weld at aol dot com
2007-06-13 11:30 ` joseph at codesourcery dot com
2007-06-13 17:53 ` kargl at gcc dot gnu dot org
2007-06-14  8:14 ` rob1weld at aol dot com
2007-06-15 21:23 ` rob1weld at aol dot com
2007-06-15 22:23 ` kargl at gcc dot gnu dot org
2007-06-15 22:43 ` joseph at codesourcery dot com
2007-06-16 18:12 ` rob1weld at aol dot com
2007-06-17 20:52 ` rob1weld at aol dot com
2010-02-20 23:43 ` manu at gcc dot gnu dot org
2010-04-12  1:54 ` rob1weld at aol dot com
