From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 8317 invoked by alias); 1 Jun 2007 15:56:48 -0000
Received: (qmail 8298 invoked by uid 48); 1 Jun 2007 15:56:37 -0000
Date: Fri, 01 Jun 2007 15:56:00 -0000
Subject: [Bug c/32180] New: Paranoia UCB GSL TestFloat libm tests fail - accuracy of recent gcc math poor
X-Bugzilla-Reason: CC
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "rob1weld at aol dot com"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2007-06/txt/msg00045.txt.bz2

GCC 4.3.0 compiled on Linux does NOT pass as many tests as GCC 3.4.4 for
Cygwin.

How seriously will people take _newer_ versions of gcc if they can't pass the
same tests that older versions did? Something has slipped over the years.

Now that I have a great compiler, I decided to run some tests to see how well
it works. I've done the usual "make -i check" tests and submitted the results.
I then decided to try some math library tests available on the internet. These
tests are designed to catch flaws, not to check trivial math operations. They
were written by people whose life's work is mathematics.

My Cygwin gcc 3.4.4 passed "almost" all of these tests; my Linux compilers did
not. I did not build _all_ of the Linux versions of gcc myself; I built only
the 4.2.0 and 4.3.0 versions.

Please obtain and build these tests yourselves. I did not run every single
test on every possible version, but I did run the shortest test on every gcc I
have. I am satisfied that gcc 4.3.0 does not pass all the tests that I tried
and that Cygwin gcc 3.4.4 passed almost every test I tried.

Here are some of my notes:

Platform           GCC Version                    Output File Name
i686-pc-cygwin     gcc 3.4.4 release              paranoia_3.4.4_release-cygwin.txt
i686-pc-linux-gnu  gcc 4.2.0 20070501             paranoia_4.2.0_20070501-linux.txt
i686-pc-linux-gnu  gcc 4.3.0 20070529             paranoia_4.3.0_20070529-linux.txt
i486-linux         gcc 3.3.5 (Debian 1:3.3.5-13)  paranoia_3.3.5_Debian-1:3.3.5-13-linux.txt
i486-linux-gnu     gcc 3.4.6 (Debian 3.4.6-5)     paranoia_3.4.6_Debian-3.4.6-5-linux.txt
i486-linux-gnu     gcc 4.1.2 (Debian 4.1.1-21)    paranoia_4.1.2_Debian-4.1.1-21-linux.txt

All of these diffs produce no output - the Linux tests all give the same
result.

# diff -Naur paranoia_4.2.0_20070501-linux.txt paranoia_4.3.0_20070529-linux.txt
# diff -Naur paranoia_4.2.0_20070501-linux.txt paranoia_3.3.5_Debian-1:3.3.5-13-linux.txt
# diff -Naur paranoia_4.2.0_20070501-linux.txt paranoia_3.4.6_Debian-3.4.6-5-linux.txt
# diff -Naur paranoia_4.2.0_20070501-linux.txt paranoia_4.1.2_Debian-4.1.1-21-linux.txt
#

Here is the diff for Cygwin vs. Linux:

# diff -Naur paranoia_3.4.4_release-cygwin.txt paranoia_4.3.0_20070529-linux.txt
--- paranoia_3.4.4_release-cygwin.txt   2007-05-31 11:02:18.000000000 -0700
+++ paranoia_4.3.0_20070529-linux.txt   2007-05-31 11:04:38.000000000 -0700
@@ -127,7 +127,8 @@
 Test for sqrt monotonicity.
 sqrt has passed a test for Monotonicity.
 Testing whether sqrt is rounded or chopped.
-Square root appears to be correctly rounded.
+Square root is neither chopped nor correctly rounded.
+Observed errors run from -5.0000000e-01 to 5.0000000e-01 ulps.
 
 To continue, press RETURN
 Diagnosis resumes after milestone Number 90          Page: 7
@@ -152,7 +153,11 @@
 This computed value is O.K.
 
 Testing X^((X + 1) / (X - 1)) vs. exp(2) = 7.38905609893065218e+00 as X -> 1.
-Accuracy seems adequate.
+DEFECT:  Calculated 7.38905609548934539e+00 for
+        (1 + (-1.11022302462515654e-16) ^ (-1.80143985094819840e+16);
+        differs from correct value by -3.44130679508225512e-09 .
+        This much error may spoil financial
+        calculations involving tiny interest rates.
 Testing powers Z^Q at four nearly extreme values.
  ... no discrepancies found.
 
@@ -188,7 +193,9 @@
 
 Diagnosis resumes after milestone Number 220          Page: 10
 
+The number of  DEFECTs  discovered = 1.
 The number of  FLAWs  discovered = 1.
 
-The arithmetic diagnosed seems Satisfactory though flawed.
+The arithmetic diagnosed may be Acceptable
+despite inconvenient Defects.
 END OF TEST.

Both Cygwin's gcc 3.4.4 and the Linux gcc 3.3.5, 3.4.6, 4.1.2, 4.2.0 and 4.3.0
builds lack a sticky bit, which the test counts as a "flaw"; the result may
still be "Satisfactory".

Checking rounding on multiply, divide and add/subtract.
* is neither chopped nor correctly rounded.
/ is neither chopped nor correctly rounded.
Addition/Subtraction neither rounds nor chops.
Sticky bit used incorrectly or not at all.
FLAW:  lack(s) of guard digits or failure(s) to correctly round or chop
(noted above) count as one flaw in the final tally below.

Only the Linux gcc compilers, and not Cygwin's 3.4.4 (release) version, are
off on one of the calculations, by -3.44130679508225512e-09 . To some people
that is not much; to others it is a lot.

This led me to more testing.

Here is another math test - http://www.netlib.org/fp/ucbtest.tgz :

# ucbREADME/linux.sh
Total 60 tests: pass 59, flags err 0, value err 1, acosd
Total 352 tests: pass 352, flags err 0, value err 0, addd
Total 77 tests: pass 77, flags err 0, value err 0, asind
Total 104 tests: pass 104, flags err 0, value err 0, atan2d
Total 57 tests: pass 57, flags err 0, value err 0, atand
Total 126 tests: pass 126, flags err 0, value err 0, cabsd
Total 99 tests: pass 99, flags err 0, value err 0, ceild
Total 53 tests: pass 53, flags err 0, value err 0, cosd
Total 68 tests: pass 56, flags err 0, value err 12, coshd
Total 383 tests: pass 383, flags err 0, value err 0, divd
Total 97 tests: pass 86, flags err 0, value err 11, expd
Total 37 tests: pass 37, flags err 0, value err 0, fabsd
Total 103 tests: pass 103, flags err 0, value err 0, floord
Total 352 tests: pass 352, flags err 0, value err 0, fmodd
Total 126 tests: pass 126, flags err 0, value err 0, hypotd
Total 89 tests: pass 89, flags err 0, value err 0, log10d
Total 83 tests: pass 83, flags err 0, value err 0, logd
Total 340 tests: pass 340, flags err 0, value err 0, muld
Total 1543 tests: pass 1505, flags err 0, value err 38, powd
Total 52 tests: pass 52, flags err 0, value err 0, sind
Total 72 tests: pass 66, flags err 0, value err 6, sinhd
Total 102 tests: pass 102, flags err 0, value err 0, sqrtd
Total 321 tests: pass 321, flags err 0, value err 0, subd
Total 54 tests: pass 54, flags err 0, value err 0, tand
Total 72 tests: pass 68, flags err 0, value err 4, tanhd

That doesn't happen with Cygwin's gcc 3.4.4.

Here are the TestFloat / SoftFloat tests:

TestFloat is a program for testing whether a computer's floating-point
conforms to the IEC/IEEE Standard for Binary Floating-point Arithmetic.
TestFloat works by comparing the behavior of the machine's floating-point
with that of the SoftFloat software implementation of floating-point. Any
differences found are reported as probable errors in the machine's
floating-point.

http://www.jhauser.us/arithmetic/TestFloat.html

TestFloat and SoftFloat source files:

compress'ed tar archive, TestFloat-2a.tar.Z [150 kB].
http://www.jhauser.us/arithmetic/TestFloat-2a.tar.Z

compress'ed tar archive, SoftFloat-2b.tar.Z [165 kB].
http://www.jhauser.us/arithmetic/SoftFloat-2b.tar.Z

Here are the results of only two of the tests:

# ./testsoftfloat -level 1 -errors 2000000 int32_to_float32 float32_add > /dev/null
Testing float32_add, rounding nearest_even.
46464 tests total.
46464 tests performed; 39069 errors found.
Testing float32_add, rounding to_zero.
46464 tests total.
46464 tests performed; 39099 errors found.
Testing float32_add, rounding down.
46464 tests total.
46464 tests performed; 39213 errors found.
Testing float32_add, rounding up.
46464 tests total.
46464 tests performed; 39150 errors found.

# ./testsoftfloat -level 1 -errors 2000000 int32_to_float32 float32_eq > /dev/null
Testing float32_eq.
46464 tests total.
46464 tests performed; 1321 errors found.

Finally I tested GSL - GNU Scientific Library
http://www.gnu.org/software/gsl

The GNU Scientific Library (GSL) is a numerical library for C and C++
programmers. The current version is GSL-1.9. It was released on 21 February
2007. This is a stable release.

ftp://ftp.gnu.org/gnu/gsl/
02/20/2007 03:36PM 2,574,939 gsl-1.9.tar.gz
ftp://ftp.gnu.org/gnu/gsl/gsl-1.9.tar.gz

When compiled and checked, GSL works flawlessly under Cygwin gcc 3.4.4 but
fails both the interpolation and sort tests under gcc 4.3.0 20070529 .

WinXP - i686-pc-cygwin:
$ cd /cygdrive/c/gsl-1.9
$ make -i -k check 2>&1 | tee check_1_log.txt
$ grep -B 2 -A 2 fail check_1_log.txt
$
(Prints NOTHING)

Debian - i686-pc-linux-gnu:
# cd /root/downloads/gsl-1.9
# make -i -k check 2>&1 | tee check_1_log.txt
# grep -B 2 -A 2 fail check_1_log.txt
FAIL: test
===================
1 of 1 tests failed
===================
make[2]: [check-TESTS] Error 1 (ignored)
--
FAIL: test
===================
1 of 1 tests failed
===================
make[2]: [check-TESTS] Error 1 (ignored)

Cygwin:
make[2]: Entering directory `/cygdrive/c/gsl-1.9/interpolation'
Completed [1100/1100]
PASS: test.exe
==================
All 1 tests passed
==================
make[2]: Leaving directory `/cygdrive/c/gsl-1.9/interpolation'

Cygwin:
make[2]: Entering directory `/cygdrive/c/gsl-1.9/sort'
Completed [21600/21600]
PASS: test.exe
==================
All 1 tests passed
==================
make[2]: Leaving directory `/cygdrive/c/gsl-1.9/sort'

Linux:
make[2]: Entering directory `/root/downloads/gsl-1.9/interpolation'
FAIL: gsl_interp_eval_e linear [7]
FAIL: gsl_interp_eval_deriv_e linear [8]
FAIL: linear deriv 0 (0 observed vs 5.30544087554984718e-315 expected) [test uses subnormal value] [11]
FAIL: linear integ 0 (0 observed vs 2.19361877441406383 expected) [12]
(Over 900 lines of FAIL:)
FAIL: cspline-periodic 60 (4.99961591105934785e-270 observed vs 6.8941659943411544 expected) [1072]
FAIL: cspline-periodic deriv 60 (0 observed vs 7.16157787531728253e-313 expected) [test uses subnormal value] [1073]
FAIL: cspline-periodic integ 60 (0 observed vs 6.89057922363291553 expected) [1074]
FAIL: cspline periodic 3pt interpolation [1075]
FAIL: test
===================
1 of 1 tests failed
===================
make[2]: [check-TESTS] Error 1 (ignored)
make[2]: Leaving directory `/root/downloads/gsl-1.9/interpolation'

Linux:
make[2]: Entering directory `/root/downloads/gsl-1.9/sort'
FAIL: indexing gsl_vector_char, n = 128, stride = 1, ordered [19999]
FAIL: sorting, gsl_vector_char, n = 128, stride = 1, ordered [20000]
FAIL: smallest, gsl_vector_char, n = 128, stride = 1, ordered [20001]
FAIL: largest, gsl_vector_char, n = 128, stride = 1, ordered [20002]
(Over 120 lines of FAIL:)
FAIL: sorting, gsl_vector_char, n = 512, stride = 3, randomized [21596]
FAIL: smallest, gsl_vector_char, n = 512, stride = 3, randomized [21597]
FAIL: largest, gsl_vector_char, n = 512, stride = 3, randomized [21598]
FAIL: test
===================
1 of 1 tests failed
===================
make[2]: [check-TESTS] Error 1 (ignored)
make[2]: Leaving directory `/root/downloads/gsl-1.9/sort'

GCC needs a Ph.D. in math to give it a once-over. If an old version of gcc can
pass these tests, then some error is preventing newer versions from passing.

DRAFT Standard for Floating-Point Arithmetic P754 - Draft 1.3.0
Modified at 17:15 GMT on February 23, 2007
http://www.validlab.com/754R/drafts/archive/2007-02-23.pdf

-- 
           Summary: Paranoia UCB GSL TestFloat libm tests fail - accuracy of
                    recent gcc math poor
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rob1weld at aol dot com
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32180