Examining bug #21838

public inbox for gsl-discuss@sourceware.org
 help / color / mirror / Atom feed

* Examining bug #21838
@ 2008-02-14 23:01 Frank Reininghaus
  2008-02-15  9:07 ` Jason Stover
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Frank Reininghaus @ 2008-02-14 23:01 UTC (permalink / raw)
  To: gsl-discuss

[-- Attachment #1: Type: text/plain, Size: 2107 bytes --]

Hi everyone,

I just tried to examine bug #21838 in gsl_cdf_fdist_P () (source file 
cdf/fdist.c), reported in

http://lists.gnu.org/archive/html/bug-gsl/2007-09/msg00023.html

with the attached testcase program (testcases, expected values and 
tolerance taken from cdf/test.c).

I examined the first testcase: It seems that the origin of the NaN is in 
beta_cont_frac () in the source file cdf/beta_inc.c. The loop aborts 
after the maximum number of iterations because fabs (delta_frac - 1.0) 
does not get small enough, and NaN is returned.

The funny thing is that I could influence this behaviour in a way I 
didn't expect:

1. Linking to the current git version of GSL which was compiled with the 
default setting -O2 yielded:

(P - eP) [0] = nan, tolerance = 2.3283064365386963e-10
(P - eP) [1] = nan, tolerance = 2.3283064365386963e-10
(P - eP) [2] = nan, tolerance = 2.3283064365386963e-10
(P - eP) [3] = 4.1520314247533996e-05, tolerance = 2.3283064365386963e-10

2. Linking to GSL which was compiled without optimisation (-O0):

(P - eP) [0] = nan, tolerance = 2.3283064365386963e-10
(P - eP) [1] = nan, tolerance = 2.3283064365386963e-10
(P - eP) [2] = 1.8060553053089734e-11, tolerance = 2.3283064365386963e-10
(P - eP) [3] = nan, tolerance = 2.3283064365386963e-10

3. Running it with Valgrind (independent of the optimisation settings):

(P - eP) [0] = 6.9538819147396680e-12, tolerance = 2.3283064365386963e-10
(P - eP) [1] = -7.3852035598065413e-12, tolerance = 2.3283064365386963e-10
(P - eP) [2] = -1.1277645484142340e-12, tolerance = 2.3283064365386963e-10
(P - eP) [3] = 4.1520320904209207e-05, tolerance = 2.3283064365386963e-10

Maybe these (at least for me) surprising observations could be used to 
isolate the cause of the bug. Does anyone have a good idea?

Regards,
Frank

P.S.: I compiled and ran the program on two systems I have access to and 
got exactly the same results:

1.
* Athlon 64 X2 in 32 bit mode
* Kubuntu 7.10
* gcc (GCC) 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2))

2.
* Pentium III
* Opensuse 10.2
* gcc (GCC) 4.1.2 20061115 (prerelease) (SUSE Linux)

[-- Attachment #2: testcase-bug-21838.c --]
[-- Type: text/x-csrc, Size: 843 bytes --]

#include <gsl/gsl_cdf.h>
#include <stdio.h>

#include <gsl/gsl_math.h>

/* tolerance taken from cdf/test.c */
#define TEST_TOL6  (1048576.0*GSL_DBL_EPSILON)

int main () {
  double P [4];  /* P values calculated using gsl_cdf_fdist_P () */
  double eP [4]; /* expected values taken from cdf/test.c */
  
  P [0] = gsl_cdf_fdist_P (3.479082213465832574, 1, 4040712);
  eP [0] = 0.93785072763723411967;

  P [1] = gsl_cdf_fdist_P (3.002774644786533109, 1, 4040712);
  eP [1] = 0.91687787379476055771;

  P [2] = gsl_cdf_fdist_P (3.000854441173130827, 1, 4040712);
  eP [2] = 0.91677930719813578619;

  P [3] = gsl_cdf_fdist_P (3.000064021622133037, 1, 4040712);
  eP [3] = 0.91678021757407962678;

  int i;
  
  for (i = 0; i < 4; i++)
    printf ("(P - eP) [%i] = %.16e, tolerance = %.16e\n",
	    i, eP [i] - P [i], TEST_TOL6);

  return 0;
}

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Examining bug #21838
  2008-02-14 23:01 Examining bug #21838 Frank Reininghaus
@ 2008-02-15  9:07 ` Jason Stover
  2008-02-20 18:04   ` Brian Gough
  2008-02-16 18:22 ` Brian Gough
  2008-02-18  9:56 ` Brian Gough
  2 siblings, 1 reply; 5+ messages in thread
From: Jason Stover @ 2008-02-15  9:07 UTC (permalink / raw)
  To: Frank Reininghaus; +Cc: gsl-discuss

On Fri, Feb 15, 2008 at 12:00:55AM +0100, Frank Reininghaus wrote:
> I just tried to examine bug #21838 in gsl_cdf_fdist_P () (source file 
> cdf/fdist.c), reported in
> 
> http://lists.gnu.org/archive/html/bug-gsl/2007-09/msg00023.html
> 
> with the attached testcase program (testcases, expected values and 
> tolerance taken from cdf/test.c).

The test cases have huge numbers of denominator degrees of freedom
(\nu_2 in the manual).  As \nu_2 --> infinity, the F density tends to
a gamma density. So it might be worthwhile to fix this by calling
gsl_cdf_gamma_[PQ] instead of beta_inc_AXPY() in such cases.

> 
> I examined the first testcase: It seems that the origin of the NaN is in 
> beta_cont_frac () in the source file cdf/beta_inc.c. The loop aborts 
> after the maximum number of iterations because fabs (delta_frac - 1.0) 
> does not get small enough, and NaN is returned.
> 
> The funny thing is that I could influence this behaviour in a way I 
> didn't expect:
> 
> 1. Linking to the current git version of GSL which was compiled with the 
> default setting -O2 yielded:
> 
> (P - eP) [0] = nan, tolerance = 2.3283064365386963e-10
> (P - eP) [1] = nan, tolerance = 2.3283064365386963e-10
> (P - eP) [2] = nan, tolerance = 2.3283064365386963e-10
> (P - eP) [3] = 4.1520314247533996e-05, tolerance = 2.3283064365386963e-10
> 
> 2. Linking to GSL which was compiled without optimisation (-O0):
> 
> (P - eP) [0] = nan, tolerance = 2.3283064365386963e-10
> (P - eP) [1] = nan, tolerance = 2.3283064365386963e-10
> (P - eP) [2] = 1.8060553053089734e-11, tolerance = 2.3283064365386963e-10
> (P - eP) [3] = nan, tolerance = 2.3283064365386963e-10
> 
> 3. Running it with Valgrind (independent of the optimisation settings):
> 
> (P - eP) [0] = 6.9538819147396680e-12, tolerance = 2.3283064365386963e-10
> (P - eP) [1] = -7.3852035598065413e-12, tolerance = 2.3283064365386963e-10
> (P - eP) [2] = -1.1277645484142340e-12, tolerance = 2.3283064365386963e-10
> (P - eP) [3] = 4.1520320904209207e-05, tolerance = 2.3283064365386963e-10
> 
> Maybe these (at least for me) surprising observations could be used to 
> isolate the cause of the bug. Does anyone have a good idea?
> 
> Regards,
> Frank
> 
> P.S.: I compiled and ran the program on two systems I have access to and 
> got exactly the same results:
> 
> 1.
> * Athlon 64 X2 in 32 bit mode
> * Kubuntu 7.10
> * gcc (GCC) 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2))
> 
> 2.
> * Pentium III
> * Opensuse 10.2
> * gcc (GCC) 4.1.2 20061115 (prerelease) (SUSE Linux)



-- 
jstover@sdf.lonestar.org
SDF Public Access UNIX System - http://sdf.lonestar.org

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Examining bug #21838
  2008-02-14 23:01 Examining bug #21838 Frank Reininghaus
  2008-02-15  9:07 ` Jason Stover
@ 2008-02-16 18:22 ` Brian Gough
  2008-02-18  9:56 ` Brian Gough
  2 siblings, 0 replies; 5+ messages in thread
From: Brian Gough @ 2008-02-16 18:22 UTC (permalink / raw)
  To: Frank Reininghaus; +Cc: gsl-discuss

At Fri, 15 Feb 2008 00:00:55 +0100,
Frank Reininghaus wrote:
> I examined the first testcase: It seems that the origin of the NaN is in 
> beta_cont_frac () in the source file cdf/beta_inc.c. The loop aborts 
> after the maximum number of iterations because fabs (delta_frac - 1.0) 
> does not get small enough, and NaN is returned.
> 
> The funny thing is that I could influence this behaviour in a way I 
> didn't expect:

That's interesting -- it looks like there is a problem with extended
precision making the result worse.
> 
> (P - eP) [0] = nan, tolerance = 2.3283064365386963e-10
> (P - eP) [1] = nan, tolerance = 2.3283064365386963e-10
> (P - eP) [2] = nan, tolerance = 2.3283064365386963e-10
> (P - eP) [3] = 4.1520314247533996e-05, tolerance = 2.3283064365386963e-10

Here is a corrrection for the final value, it should be:

   eP [3] = 0.9167386972447996480399

The results should make more sense with that.  I'll commit that in the
test suite.

-- 
Brian Gough

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Examining bug #21838
  2008-02-14 23:01 Examining bug #21838 Frank Reininghaus
  2008-02-15  9:07 ` Jason Stover
  2008-02-16 18:22 ` Brian Gough
@ 2008-02-18  9:56 ` Brian Gough
  2 siblings, 0 replies; 5+ messages in thread
From: Brian Gough @ 2008-02-18  9:56 UTC (permalink / raw)
  To: Frank Reininghaus; +Cc: gsl-discuss

At Fri, 15 Feb 2008 00:00:55 +0100,
Frank Reininghaus wrote:
> Maybe these (at least for me) surprising observations could be used to 
> isolate the cause of the bug. Does anyone have a good idea?

It seems that in double precision the error term (delta_frac - 1) is
exactly zero at some point, so the continued fraction terminates.  It
looks like this happens before the desired accuracy is actually
reached though, so it seems to be undesirable.  In extended precision
that doesn't happen, the loop continues up to its maximum iteration
limit and NAN is returned to signal that.

If you add #include <gsl/gsl_ieee_utils.h>
and gsl_ieee_env_setup();
then run with GSL_IEEE_MODE=double-precision ./a.out 
you can see the difference.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Examining bug #21838
  2008-02-15  9:07 ` Jason Stover
@ 2008-02-20 18:04   ` Brian Gough
  0 siblings, 0 replies; 5+ messages in thread
From: Brian Gough @ 2008-02-20 18:04 UTC (permalink / raw)
  To: gsl-discuss

At Fri, 15 Feb 2008 00:30:43 +0000,
Jason Stover wrote:
> The test cases have huge numbers of denominator degrees of freedom
> (\nu_2 in the manual).  As \nu_2 --> infinity, the F density tends to
> a gamma density. So it might be worthwhile to fix this by calling
> gsl_cdf_gamma_[PQ] instead of beta_inc_AXPY() in such cases.

Thanks for the suggestion, I've added a patch for that.

diff --git a/cdf/beta_inc.c b/cdf/beta_inc.c
index 8c086d0..ac1d340 100644
--- a/cdf/beta_inc.c
+++ b/cdf/beta_inc.c
@@ -20,6 +20,8 @@
 /* Author:  G. Jungman */
 /* Modified for cdfs by Brian Gough, June 2003 */
 
+#include <gsl/gsl_sf_gamma.h>
+
 static double
 beta_cont_frac (const double a, const double b, const double x,
                 const double epsabs)
@@ -100,6 +102,8 @@ beta_cont_frac (const double a, const double b, const double x,
    absolute error when A*beta_inc is added to Y. (e.g. if Y >>
    A*beta_inc then the accuracy of beta_inc can be reduced) */
 
+
+
 static double
 beta_inc_AXPY (const double A, const double Y,
                const double a, const double b, const double x)
@@ -112,6 +116,18 @@ beta_inc_AXPY (const double A, const double Y,
     {
       return A * 1 + Y;
     }
+  else if (a > 1e5 && b < 10 && x > a / (a + b))
+    {
+      /* Handle asymptotic regime, large a, small b, x > peak [AS 26.5.17] */
+      double N = a + (b - 1.0) / 2.0;
+      return A * gsl_sf_gamma_inc_Q (b, -N * log (x)) + Y;
+    }
+  else if (b > 1e5 && a < 10 && x < b / (a + b))
+    {
+      /* Handle asymptotic regime, small a, large b, x < peak [AS 26.5.17] */
+      double N = b + (a - 1.0) / 2.0;
+      return A * gsl_sf_gamma_inc_P (a, -N * log1p (-x)) + Y;
+    }
   else
     {
       double ln_beta = gsl_sf_lnbeta (a, b);
-- 
1.5.3.4


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2008-02-20 18:04 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-02-14 23:01 Examining bug #21838 Frank Reininghaus
2008-02-15  9:07 ` Jason Stover
2008-02-20 18:04   ` Brian Gough
2008-02-16 18:22 ` Brian Gough
2008-02-18  9:56 ` Brian Gough

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).