CDF diff

public inbox for gsl-discuss@sourceware.org
 help / color / mirror / Atom feed

* CDF diff
@ 2003-01-10 18:43 Martin Jansche
  2003-01-14 22:02 ` Brian Gough
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Martin Jansche @ 2003-01-10 18:43 UTC (permalink / raw)
  To: gsl-discuss

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1873 bytes --]

Hi,

I'm attaching a diff against the anoncvs version that adds some
cumulative distribution functions for the following distributions:
gamma family (gamma, chisq, exponential), geometric, binomial, beta,
tdist, cauchy, logistic, pareto.  It includes modifications to test.c
to check whether a pdf integrates/sums up to the corresponding cdf
(the test messages are perhaps a bit too verbose; and some tests fail
for the pareto_cdf in the two rare cases where the simple numerical
integration of the pdf gives poor results).

A few related questions and suggestions: could somebody explain the
rationale for the implementation of the Erlang distribution?  I
thought it was a discrete distribution, but maybe I'm confused.

Second, it might be useful to have a functions that compute a definite
integral over a pdf with lower and upper bounds supplied by arguments.
This could be in addition to or in place of the _cdf functions.  The
reason for this is (discrete) distributions with no closed-form
formula for the cumulative density/mass.  So the design questions is,
should one have functions of the form foo_cdf(x,...) where x is the
upper bound of the sum/integral, or rather foo_cdf(a,b,...) that
return \sum_{x=a}^b foo_pdf(x,...) (resp. \int)?

Finally, would it be worth the added overhead to have separate types
for the parameters of each distribution?  For example, the current
gamma_pdf takes a shape parameter a and a scale parameter b; one could
put them in a struct and have several constructors (one in terms of
shape and scale, one in terms of shape in inverse scale, one in terms
of sample moments, etc.) that return such a struct, which could then
be passed to the _pdf(), _cdf() and variate functions.  I'd shy away
from using macros for this, since the macro could only be invoked
inside the argument list of a function call expression.

Thanks,
- martin

[-- Attachment #2: Type: TEXT/PLAIN, Size: 29292 bytes --]

Index: beta.c
===================================================================
RCS file: /cvs/gsl/gsl/randist/beta.c,v
retrieving revision 1.10
diff -c -r1.10 beta.c
*** beta.c	23 Apr 2001 09:38:28 -0000	1.10
--- beta.c	10 Jan 2003 18:09:47 -0000
***************
*** 58,60 ****
--- 58,77 ----
        return p;
      }
  }
+ 
+ double
+ gsl_ran_beta_cdf (const double x, const double a, const double b)
+ {
+   if (x <= 0)
+     {
+       return 0;
+     }
+   else if (x >= 1)
+     {
+       return 1;
+     }
+   else
+     {
+       return gsl_sf_beta_inc (a, b, x);
+     }
+ }
Index: binomial.c
===================================================================
RCS file: /cvs/gsl/gsl/randist/binomial.c,v
retrieving revision 1.13
diff -c -r1.13 binomial.c
*** binomial.c	24 Apr 2001 17:26:42 -0000	1.13
--- binomial.c	10 Jan 2003 18:09:47 -0000
***************
*** 85,87 ****
--- 85,101 ----
        return P;
      }
  }
+ 
+ double
+ gsl_ran_binomial_cdf (const unsigned int k, const double p, 
+ 		      const unsigned int n)
+ {
+   if (k >= n)
+     {
+       return 1;
+     }
+   else 
+     {
+       return 1 - gsl_sf_beta_inc (k+1, n-k, p);
+     }
+ }
Index: cauchy.c
===================================================================
RCS file: /cvs/gsl/gsl/randist/cauchy.c,v
retrieving revision 1.10
diff -c -r1.10 cauchy.c
*** cauchy.c	17 Apr 2001 19:35:56 -0000	1.10
--- cauchy.c	10 Jan 2003 18:09:47 -0000
***************
*** 49,51 ****
--- 49,57 ----
    double p = (1 / (M_PI * a)) / (1 + u * u);
    return p;
  }
+ 
+ double
+ gsl_ran_cauchy_cdf (const double x, const double a)
+ {
+   return 0.5 + M_1_PI * atan (x/a);
+ }
Index: chisq.c
===================================================================
RCS file: /cvs/gsl/gsl/randist/chisq.c,v
retrieving revision 1.11
diff -c -r1.11 chisq.c
*** chisq.c	23 Apr 2001 09:38:28 -0000	1.11
--- chisq.c	10 Jan 2003 18:09:47 -0000
***************
*** 19,24 ****
--- 19,25 ----
  
  #include <config.h>
  #include <math.h>
+ #include <gsl/gsl_sf_erf.h>
  #include <gsl/gsl_sf_gamma.h>
  #include <gsl/gsl_rng.h>
  #include <gsl/gsl_randist.h>
***************
*** 50,54 ****
--- 51,74 ----
        
        p = exp ((nu / 2 - 1) * log (x/2) - x/2 - lngamma) / 2;
        return p;
+     }
+ }
+ 
+ double
+ gsl_ran_chisq_cdf (const double x, const double nu)
+ {
+   if (x <= 0)
+     {
+       return 0;
+     }
+   else if (0 && nu == 1)
+     {
+       /* It might make sense to treat the case of a single Gaussian
+ 	 specially. */
+       return gsl_sf_erf ( sqrt (x/2) );
+     }
+   else
+     {
+       return gsl_sf_gamma_inc_P (nu/2, x/2);
      }
  }
Index: exponential.c
===================================================================
RCS file: /cvs/gsl/gsl/randist/exponential.c,v
retrieving revision 1.11
diff -c -r1.11 exponential.c
*** exponential.c	4 May 2000 11:25:06 -0000	1.11
--- exponential.c	10 Jan 2003 18:09:47 -0000
***************
*** 19,24 ****
--- 19,25 ----
  
  #include <config.h>
  #include <math.h>
+ #include <gsl/gsl_math.h>
  #include <gsl/gsl_rng.h>
  #include <gsl/gsl_randist.h>
  
***************
*** 45,52 ****
      }
    else
      {
!       double p = exp (-x/mu)/mu;
        
        return p;
      }
  }
--- 46,67 ----
      }
    else
      {
!       double p = exp (-x/mu) / mu;
        
        return p;
+     }
+ }
+ 
+ double
+ gsl_ran_exponential_cdf (const double x, const double mu)
+ {
+   if (x <= 0)
+     {
+       return 0 ;
+     }
+   else
+     {
+       /* return 1 - exp (-x/mu); */
+       return fabs ( gsl_expm1 (-x/mu) );
      }
  }
Index: gamma.c
===================================================================
RCS file: /cvs/gsl/gsl/randist/gamma.c,v
retrieving revision 1.20
diff -c -r1.20 gamma.c
*** gamma.c	23 Apr 2001 09:38:28 -0000	1.20
--- gamma.c	10 Jan 2003 18:09:47 -0000
***************
*** 138,143 ****
--- 138,146 ----
    return x;
  }
  
+ #define GSL_RAN_GAMMA_SHAPE_SCALE(shape,scale) (shape), 1/(scale)
+ #define GSL_RAN_GAMMA_SHAPE_INVSCALE(shape,invscale) (shape), (invscale)
+ 
  double
  gsl_ran_gamma_pdf (const double x, const double a, const double b)
  {
***************
*** 156,166 ****
      {
        return exp(-x/b)/b ;
      }
!   else 
      {
        double p;
        double lngamma = gsl_sf_lngamma (a);
        p = exp ((a - 1) * log (x/b) - x/b - lngamma)/b;
        return p;
      }
  }
--- 159,187 ----
      {
        return exp(-x/b)/b ;
      }
!   else
      {
        double p;
        double lngamma = gsl_sf_lngamma (a);
        p = exp ((a - 1) * log (x/b) - x/b - lngamma)/b;
        return p;
+     }
+ }
+ 
+ double
+ gsl_ran_gamma_cdf (const double x, const double a, const double b)
+ {
+   if (x <= 0)
+     {
+       return 0 ;
+     }
+   else if (a == 1)
+     /* exponential_cdf, see exponential.c */
+     {
+       return fabs ( gsl_expm1 (-x/b) );
+     }
+   else
+     {
+       return gsl_sf_gamma_inc_P (a, x/b);
      }
  }
Index: gauss.c
===================================================================
RCS file: /cvs/gsl/gsl/randist/gauss.c,v
retrieving revision 1.19
diff -c -r1.19 gauss.c
*** gauss.c	4 Jun 2002 22:04:43 -0000	1.19
--- gauss.c	10 Jan 2003 18:09:47 -0000
***************
*** 22,27 ****
--- 22,28 ----
  #include <gsl/gsl_math.h>
  #include <gsl/gsl_rng.h>
  #include <gsl/gsl_randist.h>
+ #include <gsl/gsl_sf_erf.h>
  
  /* Of the two methods provided below, I think the Polar method is more
   * efficient, but only when you are actually producing two random
***************
*** 89,94 ****
--- 90,102 ----
    double u = x / fabs (sigma);
    double p = (1 / (sqrt (2 * M_PI) * fabs (sigma))) * exp (-u * u / 2);
    return p;
+ }
+ 
+ double
+ gsl_ran_gaussian_cdf (const double x, const double sigma)
+ {
+   double u = x / fabs (sigma);
+   return (1 + gsl_sf_erf (M_SQRT1_2 * u)) / 2;
  }
  
  double
Index: geometric.c
===================================================================
RCS file: /cvs/gsl/gsl/randist/geometric.c,v
retrieving revision 1.9
diff -c -r1.9 geometric.c
*** geometric.c	4 May 2000 11:25:06 -0000	1.9
--- geometric.c	10 Jan 2003 18:09:47 -0000
***************
*** 19,24 ****
--- 19,25 ----
  
  #include <config.h>
  #include <math.h>
+ #include <gsl/gsl_math.h>
  #include <gsl/gsl_rng.h>
  #include <gsl/gsl_randist.h>
  
***************
*** 52,67 ****
  gsl_ran_geometric_pdf (const unsigned int k, const double p)
  {
    if (k == 0)
!     {
!       return 0 ;
!     }
!   else if (k == 1)
!     {
!       return p ;
!     }
    else
!     {
!       double P = p * pow (1 - p, k - 1.0);
!       return P;
!     }
  }
--- 53,68 ----
  gsl_ran_geometric_pdf (const unsigned int k, const double p)
  {
    if (k == 0)
!       return 0;
    else
!     return p * gsl_pow_int (1-p, k-1);
! }
! 
! double
! gsl_ran_geometric_cdf (const unsigned int k, const double p)
! {
!   if (k == 0)
!     return 0;
!   else
!     return 1 - gsl_pow_int (1-p, k);
  }
Index: gsl_randist.h
===================================================================
RCS file: /cvs/gsl/gsl/randist/gsl_randist.h,v
retrieving revision 1.39
diff -c -r1.39 gsl_randist.h
*** gsl_randist.h	10 Dec 2002 19:06:57 -0000	1.39
--- gsl_randist.h	10 Jan 2003 18:09:48 -0000
***************
*** 38,58 ****
--- 38,63 ----
  
  double gsl_ran_beta (const gsl_rng * r, const double a, const double b);
  double gsl_ran_beta_pdf (const double x, const double a, const double b);
+ double gsl_ran_beta_cdf (const double x, const double a, const double b);
  
  unsigned int gsl_ran_binomial (const gsl_rng * r, double p, unsigned int n);
  double gsl_ran_binomial_pdf (const unsigned int k, const double p, const unsigned int n);
+ double gsl_ran_binomial_cdf (const unsigned int k, const double p, const unsigned int n);
  
  double gsl_ran_exponential (const gsl_rng * r, const double mu);
  double gsl_ran_exponential_pdf (const double x, const double mu);
+ double gsl_ran_exponential_cdf (const double x, const double mu);
  
  double gsl_ran_exppow (const gsl_rng * r, const double a, const double b);
  double gsl_ran_exppow_pdf (const double x, const double a, const double b);
  
  double gsl_ran_cauchy (const gsl_rng * r, const double a);
  double gsl_ran_cauchy_pdf (const double x, const double a);
+ double gsl_ran_cauchy_cdf (const double x, const double a);
  
  double gsl_ran_chisq (const gsl_rng * r, const double nu);
  double gsl_ran_chisq_pdf (const double x, const double nu);
+ double gsl_ran_chisq_cdf (const double x, const double nu);
  
  void gsl_ran_dirichlet (const gsl_rng * r, const size_t K, const double alpha[], double theta[]);
  double gsl_ran_dirichlet_pdf (const size_t K, const double alpha[], const double theta[]);
***************
*** 70,79 ****
--- 75,86 ----
  double gsl_ran_gamma (const gsl_rng * r, const double a, const double b);
  double gsl_ran_gamma_int (const gsl_rng * r, const unsigned int a);
  double gsl_ran_gamma_pdf (const double x, const double a, const double b);
+ double gsl_ran_gamma_cdf (const double x, const double a, const double b);
  
  double gsl_ran_gaussian (const gsl_rng * r, const double sigma);
  double gsl_ran_gaussian_ratio_method (const gsl_rng * r, const double sigma);
  double gsl_ran_gaussian_pdf (const double x, const double sigma);
+ double gsl_ran_gaussian_cdf (const double x, const double sigma);
  
  double gsl_ran_ugaussian (const gsl_rng * r);
  double gsl_ran_ugaussian_ratio_method (const gsl_rng * r);
***************
*** 93,98 ****
--- 100,106 ----
  
  unsigned int gsl_ran_geometric (const gsl_rng * r, const double p);
  double gsl_ran_geometric_pdf (const unsigned int k, const double p);
+ double gsl_ran_geometric_cdf (const unsigned int k, const double p);
  
  unsigned int gsl_ran_hypergeometric (const gsl_rng * r, unsigned int n1, unsigned int n2, unsigned int t);
  double gsl_ran_hypergeometric_pdf (const unsigned int k, const unsigned int n1, const unsigned int n2, unsigned int t);
***************
*** 105,110 ****
--- 113,119 ----
  
  double gsl_ran_logistic (const gsl_rng * r, const double a);
  double gsl_ran_logistic_pdf (const double x, const double a);
+ double gsl_ran_logistic_cdf (const double x, const double a);
  
  double gsl_ran_lognormal (const gsl_rng * r, const double zeta, const double sigma);
  double gsl_ran_lognormal_pdf (const double x, const double zeta, const double sigma);
***************
*** 129,134 ****
--- 138,144 ----
  
  double gsl_ran_pareto (const gsl_rng * r, double a, const double b);
  double gsl_ran_pareto_pdf (const double x, const double a, const double b);
+ double gsl_ran_pareto_cdf (const double x, const double a, const double b);
  
  unsigned int gsl_ran_poisson (const gsl_rng * r, double mu);
  void gsl_ran_poisson_array (const gsl_rng * r, size_t n, unsigned int array[],
***************
*** 143,148 ****
--- 153,159 ----
  
  double gsl_ran_tdist (const gsl_rng * r, const double nu);
  double gsl_ran_tdist_pdf (const double x, const double nu);
+ double gsl_ran_tdist_cdf (const double x, const double nu);
  
  double gsl_ran_laplace (const gsl_rng * r, const double a);
  double gsl_ran_laplace_pdf (const double x, const double a);
Index: logistic.c
===================================================================
RCS file: /cvs/gsl/gsl/randist/logistic.c,v
retrieving revision 1.11
diff -c -r1.11 logistic.c
*** logistic.c	17 Apr 2001 19:35:56 -0000	1.11
--- logistic.c	10 Jan 2003 18:09:48 -0000
***************
*** 51,53 ****
--- 51,59 ----
    double p = u / (fabs(a) * (1 + u) * (1 + u));
    return p;
  }
+ 
+ double
+ gsl_ran_logistic_cdf (const double x, const double a)
+ {
+   return 1 / (1 + exp(-x/fabs(a)));
+ }
Index: pareto.c
===================================================================
RCS file: /cvs/gsl/gsl/randist/pareto.c,v
retrieving revision 1.9
diff -c -r1.9 pareto.c
*** pareto.c	21 Sep 2000 19:19:21 -0000	1.9
--- pareto.c	10 Jan 2003 18:09:48 -0000
***************
*** 52,54 ****
--- 52,59 ----
      }
  }
  
+ double
+ gsl_ran_pareto_cdf (const double x, const double a, const double b)
+ {
+   return (x >= b)? 1 - pow (b/x, a) : 0;
+ }
Index: tdist.c
===================================================================
RCS file: /cvs/gsl/gsl/randist/tdist.c,v
retrieving revision 1.12
diff -c -r1.12 tdist.c
*** tdist.c	23 Apr 2001 09:38:28 -0000	1.12
--- tdist.c	10 Jan 2003 18:09:48 -0000
***************
*** 77,80 ****
    return p;
  }
  
! 
--- 77,93 ----
    return p;
  }
  
! double
! gsl_ran_tdist_cdf (const double x, const double nu)
! {
!   if (x == 0)
!     {
!       return 0.5;
!     }
!   else {
!     double d = 0.5 * gsl_sf_beta_inc (0.5*nu, 0.5, nu/(nu + x*x));
!     /* d as a function of x is symmetric about zero, but has a maximum
!        there; adjust as follows(?): */
!     return (x < 0)? d : 1 - d;
!   }
! }
Index: test.c
===================================================================
RCS file: /cvs/gsl/gsl/randist/test.c,v
retrieving revision 1.37
diff -c -r1.37 test.c
*** test.c	10 Dec 2002 19:06:58 -0000	1.37
--- test.c	10 Jan 2003 18:09:48 -0000
***************
*** 35,58 ****
  
  void testMoments (double (*f) (void), const char *name,
  		  double a, double b, double p);
! void testPDF (double (*f) (void), double (*pdf) (double), const char *name);
  void testDiscretePDF (double (*f) (void), double (*pdf) (unsigned int),
! 		      const char *name);
  
  void test_shuffle (void);
  void test_choose (void);
  double test_beta (void);
  double test_beta_pdf (double x);
  double test_bernoulli (void);
  double test_bernoulli_pdf (unsigned int n);
  double test_binomial (void);
  double test_binomial_pdf (unsigned int n);
  double test_binomial_large (void);
  double test_binomial_large_pdf (unsigned int n);
  double test_cauchy (void);
  double test_cauchy_pdf (double x);
  double test_chisq (void);
  double test_chisq_pdf (double x);
  double test_dirichlet (void);
  double test_dirichlet_pdf (double x);
  void test_dirichlet_moments (void);
--- 35,64 ----
  
  void testMoments (double (*f) (void), const char *name,
  		  double a, double b, double p);
! void testPDF (double (*f) (void), double (*pdf) (double),
! 	      double (*cdf) (double), const char *name);
  void testDiscretePDF (double (*f) (void), double (*pdf) (unsigned int),
! 		      double (*cdf) (unsigned int), const char *name);
  
  void test_shuffle (void);
  void test_choose (void);
  double test_beta (void);
  double test_beta_pdf (double x);
+ double test_beta_cdf (double x);
  double test_bernoulli (void);
  double test_bernoulli_pdf (unsigned int n);
  double test_binomial (void);
  double test_binomial_pdf (unsigned int n);
+ double test_binomial_cdf (unsigned int n);
  double test_binomial_large (void);
  double test_binomial_large_pdf (unsigned int n);
+ double test_binomial_large_cdf (unsigned int n);
  double test_cauchy (void);
  double test_cauchy_pdf (double x);
+ double test_cauchy_cdf (double x);
  double test_chisq (void);
  double test_chisq_pdf (double x);
+ double test_chisq_cdf (double x);
  double test_dirichlet (void);
  double test_dirichlet_pdf (double x);
  void test_dirichlet_moments (void);
***************
*** 64,69 ****
--- 70,76 ----
  double test_erlang_pdf (double x);
  double test_exponential (void);
  double test_exponential_pdf (double x);
+ double test_exponential_cdf (double x);
  double test_exppow0 (void);
  double test_exppow0_pdf (double x);
  double test_exppow1 (void);
***************
*** 80,93 ****
--- 87,103 ----
  double test_flat_pdf (double x);
  double test_gamma (void);
  double test_gamma_pdf (double x);
+ double test_gamma_cdf (double x);
  double test_gamma1 (void);
  double test_gamma1_pdf (double x);
+ double test_gamma1_cdf (double x);
  double test_gamma_int (void);
  double test_gamma_int_pdf (double x);
  double test_gamma_large (void);
  double test_gamma_large_pdf (double x);
  double test_gaussian (void);
  double test_gaussian_pdf (double x);
+ double test_gaussian_cdf (double x);
  double test_gaussian_ratio_method (void);
  double test_gaussian_ratio_method_pdf (double x);
  double test_gaussian_tail (void);
***************
*** 116,123 ****
--- 126,135 ----
  double test_gumbel2_pdf (double x);
  double test_geometric (void);
  double test_geometric_pdf (unsigned int x);
+ double test_geometric_cdf (unsigned int x);
  double test_geometric1 (void);
  double test_geometric1_pdf (unsigned int x);
+ double test_geometric1_cdf (unsigned int x);
  double test_hypergeometric1 (void);
  double test_hypergeometric1_pdf (unsigned int x);
  double test_hypergeometric2 (void);
***************
*** 154,159 ****
--- 166,172 ----
  double test_levy_skew2b_pdf (double x);
  double test_logistic (void);
  double test_logistic_pdf (double x);
+ double test_logistic_cdf (double x);
  double test_lognormal (void);
  double test_lognormal_pdf (double x);
  double test_logarithmic (void);
***************
*** 169,174 ****
--- 182,188 ----
  double test_pascal_pdf (unsigned int n);
  double test_pareto (void);
  double test_pareto_pdf (double x);
+ double test_pareto_cdf (double x);
  double test_poisson (void);
  double test_poisson_pdf (unsigned int x);
  double test_poisson_large (void);
***************
*** 189,196 ****
--- 203,212 ----
  double test_rayleigh_tail_pdf (double x);
  double test_tdist1 (void);
  double test_tdist1_pdf (double x);
+ double test_tdist1_cdf (double x);
  double test_tdist2 (void);
  double test_tdist2_pdf (double x);
+ double test_tdist2_cdf (double x);
  double test_laplace (void);
  double test_laplace_pdf (double x);
  double test_weibull (void);
***************
*** 209,215 ****
    r_global = gsl_rng_alloc (gsl_rng_default);
  
  #define FUNC(x)  test_ ## x,                     "test gsl_ran_" #x
! #define FUNC2(x) test_ ## x, test_ ## x ## _pdf, "test gsl_ran_" #x
  
    test_shuffle ();
    test_choose ();
--- 225,233 ----
    r_global = gsl_rng_alloc (gsl_rng_default);
  
  #define FUNC(x)  test_ ## x,                     "test gsl_ran_" #x
! #define FUNC2(x) test_ ## x, test_ ## x ## _pdf, NULL, "test gsl_ran_" #x
! #define FUNC3(x) test_ ## x, test_ ## x ## _pdf, test_ ## x ## _cdf, \
! 		 "test gsl_ran_" #x
  
    test_shuffle ();
    test_choose ();
***************
*** 228,239 ****
    test_dirichlet_moments ();
    test_multinomial_moments ();
  
!   testPDF (FUNC2 (beta));
!   testPDF (FUNC2 (cauchy));
!   testPDF (FUNC2 (chisq));
    testPDF (FUNC2 (dirichlet));
    testPDF (FUNC2 (erlang));
!   testPDF (FUNC2 (exponential));
  
    testPDF (FUNC2 (exppow0));
    testPDF (FUNC2 (exppow1));
--- 246,257 ----
    test_dirichlet_moments ();
    test_multinomial_moments ();
  
!   testPDF (FUNC3 (beta));	/* CDF available */
!   testPDF (FUNC3 (cauchy));	/* CDF available */
!   testPDF (FUNC3 (chisq));	/* CDF available */
    testPDF (FUNC2 (dirichlet));
    testPDF (FUNC2 (erlang));
!   testPDF (FUNC3 (exponential));/* CDF available */
  
    testPDF (FUNC2 (exppow0));
    testPDF (FUNC2 (exppow1));
***************
*** 243,253 ****
  
    testPDF (FUNC2 (fdist));
    testPDF (FUNC2 (flat));
!   testPDF (FUNC2 (gamma));
!   testPDF (FUNC2 (gamma1));
    testPDF (FUNC2 (gamma_int));
    testPDF (FUNC2 (gamma_large));
!   testPDF (FUNC2 (gaussian));
    testPDF (FUNC2 (gaussian_ratio_method));
    testPDF (FUNC2 (ugaussian));
    testPDF (FUNC2 (ugaussian_ratio_method));
--- 261,271 ----
  
    testPDF (FUNC2 (fdist));
    testPDF (FUNC2 (flat));
!   testPDF (FUNC3 (gamma));	/* CDF available */
!   testPDF (FUNC3 (gamma1));	/* CDF available */
    testPDF (FUNC2 (gamma_int));
    testPDF (FUNC2 (gamma_large));
!   testPDF (FUNC3 (gaussian));  /* CDF available */
    testPDF (FUNC2 (gaussian_ratio_method));
    testPDF (FUNC2 (ugaussian));
    testPDF (FUNC2 (ugaussian_ratio_method));
***************
*** 274,286 ****
    testPDF (FUNC2 (levy_skew2a));
    testPDF (FUNC2 (levy_skew1b));
    testPDF (FUNC2 (levy_skew2b));
!   testPDF (FUNC2 (logistic));
    testPDF (FUNC2 (lognormal));
!   testPDF (FUNC2 (pareto));
    testPDF (FUNC2 (rayleigh));
    testPDF (FUNC2 (rayleigh_tail));
!   testPDF (FUNC2 (tdist1));
!   testPDF (FUNC2 (tdist2));
    testPDF (FUNC2 (laplace));
    testPDF (FUNC2 (weibull));
    testPDF (FUNC2 (weibull1));
--- 292,304 ----
    testPDF (FUNC2 (levy_skew2a));
    testPDF (FUNC2 (levy_skew1b));
    testPDF (FUNC2 (levy_skew2b));
!   testPDF (FUNC3 (logistic));	/* CDF available */
    testPDF (FUNC2 (lognormal));
!   testPDF (FUNC3 (pareto));	/* CDF available */
    testPDF (FUNC2 (rayleigh));
    testPDF (FUNC2 (rayleigh_tail));
!   testPDF (FUNC3 (tdist1));	/* CDF available */
!   testPDF (FUNC3 (tdist2));	/* CDF available */
    testPDF (FUNC2 (laplace));
    testPDF (FUNC2 (weibull));
    testPDF (FUNC2 (weibull1));
***************
*** 296,305 ****
    testDiscretePDF (FUNC2 (poisson));
    testDiscretePDF (FUNC2 (poisson_large));
    testDiscretePDF (FUNC2 (bernoulli));
!   testDiscretePDF (FUNC2 (binomial));
!   testDiscretePDF (FUNC2 (binomial_large));
!   testDiscretePDF (FUNC2 (geometric));
!   testDiscretePDF (FUNC2 (geometric1));
    testDiscretePDF (FUNC2 (hypergeometric1));
    testDiscretePDF (FUNC2 (hypergeometric2));
    testDiscretePDF (FUNC2 (hypergeometric3));
--- 314,323 ----
    testDiscretePDF (FUNC2 (poisson));
    testDiscretePDF (FUNC2 (poisson_large));
    testDiscretePDF (FUNC2 (bernoulli));
!   testDiscretePDF (FUNC3 (binomial));  /* CDF available */
!   testDiscretePDF (FUNC3 (binomial_large));  /* CDF available */
!   testDiscretePDF (FUNC3 (geometric));  /* CDF available */
!   testDiscretePDF (FUNC3 (geometric1));  /* CDF available */
    testDiscretePDF (FUNC2 (hypergeometric1));
    testDiscretePDF (FUNC2 (hypergeometric2));
    testDiscretePDF (FUNC2 (hypergeometric3));
***************
*** 434,444 ****
  #define BINS 100
  
  void
! testPDF (double (*f) (void), double (*pdf) (double), const char *name)
  {
    double count[BINS], p[BINS];
!   double a = -5.0, b = +5.0;
!   double dx = (b - a) / BINS;
    int i, j, status = 0, status_i = 0;
  
    for (i = 0; i < BINS; i++)
--- 452,466 ----
  #define BINS 100
  
  void
! testPDF (double (*f) (void),
! 	 double (*pdf) (double),
! 	 double (*cdf) (double),
! 	 const char *name)
  {
    double count[BINS], p[BINS];
!   const double a = -5.0, b = +5.0;
!   const double dx = (b - a) / BINS;
!   double integral = 0;
    int i, j, status = 0, status_i = 0;
  
    for (i = 0; i < BINS; i++)
***************
*** 470,475 ****
--- 492,515 ----
  	sum += pdf (x + j * dx / STEPS);
  
        p[i] = 0.5 * (pdf (x) + 2 * sum + pdf (x + dx - 1e-7)) * dx / STEPS;
+       integral += p[i];
+ 
+       if (cdf != NULL)
+ 	/* leave this test in place as long as some _cdf() functions
+ 	   are missing */
+ 	{
+ 	  /* check whether the pdf integrates up to the cdf */
+ 	  double exact_integral = cdf (x + dx) - cdf (x);
+ 	  gsl_test_rel(exact_integral, p[i], 1e-3,
+ 		       "%s integral on [%g,%g)", name, x, x + dx);
+ 	}
+     }
+ 
+   if (cdf != NULL)
+     {
+       double exact_integral = cdf (b) - cdf (a);
+       gsl_test_rel(exact_integral, integral, 1e-6,
+ 		   "%s integral on [%g,%g)", name, a, b);
      }
  
    for (i = 0; i < BINS; i++)
***************
*** 498,507 ****
  
  void
  testDiscretePDF (double (*f) (void), double (*pdf) (unsigned int),
! 		 const char *name)
  {
    double count[BINS], p[BINS];
    unsigned int i;
    int status = 0, status_i = 0;
  
    for (i = 0; i < BINS; i++)
--- 538,548 ----
  
  void
  testDiscretePDF (double (*f) (void), double (*pdf) (unsigned int),
! 		 double (*cdf) (unsigned int), const char *name)
  {
    double count[BINS], p[BINS];
    unsigned int i;
+   double sum = 0;
    int status = 0, status_i = 0;
  
    for (i = 0; i < BINS; i++)
***************
*** 514,521 ****
  	count[r]++;
      }
  
!   for (i = 0; i < BINS; i++)
      p[i] = pdf (i);
  
    for (i = 0; i < BINS; i++)
      {
--- 555,566 ----
  	count[r]++;
      }
  
!   for (i = 0; i < BINS; i++) {
      p[i] = pdf (i);
+     sum += p[i];
+     if (cdf != NULL)
+       gsl_test_rel(cdf(i), sum, 1e-6, "%s sum_0^%d", name, i);
+   }
  
    for (i = 0; i < BINS; i++)
      {
***************
*** 553,558 ****
--- 598,609 ----
  {
    return gsl_ran_beta_pdf (x, 2.0, 3.0);
  }
+ double
+ test_beta_cdf (double x)
+ {
+   return gsl_ran_beta_cdf (x, 2.0, 3.0);
+ }
+ 
  
  double
  test_bernoulli (void)
***************
*** 580,585 ****
--- 631,642 ----
  }
  
  double
+ test_binomial_cdf (unsigned int n)
+ {
+   return gsl_ran_binomial_cdf (n, 0.3, 5);
+ }
+ 
+ double
  test_binomial_large (void)
  {
    return gsl_ran_binomial (r_global, 0.3, 55);
***************
*** 590,595 ****
--- 647,658 ----
  {
    return gsl_ran_binomial_pdf (n, 0.3, 55);
  }
+ double
+ test_binomial_large_cdf (unsigned int n)
+ {
+   return gsl_ran_binomial_cdf (n, 0.3, 55);
+ }
+ 
  
  double
  test_cauchy (void)
***************
*** 604,609 ****
--- 667,679 ----
  }
  
  double
+ test_cauchy_cdf (double x)
+ {
+   return gsl_ran_cauchy_cdf (x, 2.0);
+ }
+ 
+ 
+ double
  test_chisq (void)
  {
    return gsl_ran_chisq (r_global, 13.0);
***************
*** 616,621 ****
--- 686,697 ----
  }
  
  double
+ test_chisq_cdf (double x)
+ {
+   return gsl_ran_chisq_cdf (x, 13.0);
+ }
+ 
+ double
  test_dir2d (void)
  {
    double x = 0, y = 0, theta;
***************
*** 911,916 ****
--- 987,998 ----
  }
  
  double
+ test_exponential_cdf (double x)
+ {
+   return gsl_ran_exponential_cdf (x, 2.0);
+ }
+ 
+ double
  test_exppow0 (void)
  {
    return gsl_ran_exppow (r_global, 3.7, 0.3);
***************
*** 1008,1013 ****
--- 1090,1101 ----
  }
  
  double
+ test_gamma_cdf (double x)
+ {
+   return gsl_ran_gamma_cdf (x, 2.5, 2.17);
+ }
+ 
+ double
  test_gamma1 (void)
  {
    return gsl_ran_gamma (r_global, 1.0, 2.17);
***************
*** 1019,1024 ****
--- 1107,1117 ----
    return gsl_ran_gamma_pdf (x, 1.0, 2.17);
  }
  
+ double
+ test_gamma1_cdf (double x)
+ {
+   return gsl_ran_gamma_cdf (x, 1.0, 2.17);
+ }
  
  double
  test_gamma_int (void)
***************
*** 1057,1062 ****
--- 1150,1161 ----
  {
    return gsl_ran_gaussian_pdf (x, 3.0);
  }
+ double
+ 
+ test_gaussian_cdf (double x)
+ {
+   return gsl_ran_gaussian_cdf (x, 3.0);
+ }
  
  double
  test_gaussian_ratio_method (void)
***************
*** 1232,1237 ****
--- 1331,1342 ----
  }
  
  double
+ test_geometric_cdf (unsigned int n)
+ {
+   return gsl_ran_geometric_cdf (n, 0.5);
+ }
+ 
+ double
  test_geometric1 (void)
  {
    return gsl_ran_geometric (r_global, 1.0);
***************
*** 1244,1249 ****
--- 1349,1360 ----
  }
  
  double
+ test_geometric1_cdf (unsigned int n)
+ {
+   return gsl_ran_geometric_cdf (n, 1.0);
+ }
+ 
+ double
  test_hypergeometric1 (void)
  {
    return gsl_ran_hypergeometric (r_global, 5, 7, 4);
***************
*** 1491,1496 ****
--- 1602,1614 ----
  }
  
  double
+ test_logistic_cdf (double x)
+ {
+   return gsl_ran_logistic_cdf (x, 3.1);
+ }
+ 
+ 
+ double
  test_logarithmic (void)
  {
    return gsl_ran_logarithmic (r_global, 0.4);
***************
*** 1532,1538 ****
  double
  test_multinomial_pdf (unsigned int n_0)
  {
!   /* The margional distribution of just 1 variate  is binomial. */
    size_t K = 2;
    /* Test use of weights instead of probabilities */
    double p[] = { 0.4, 1.6};
--- 1650,1656 ----
  double
  test_multinomial_pdf (unsigned int n_0)
  {
!   /* The marginal distribution of just 1 variate is binomial. */
    size_t K = 2;
    /* Test use of weights instead of probabilities */
    double p[] = { 0.4, 1.6};
***************
*** 1603,1608 ****
--- 1721,1733 ----
  }
  
  double
+ test_pareto_cdf (double x)
+ {
+   return gsl_ran_pareto_cdf (x, 1.9, 2.75);
+ }
+ 
+ 
+ double
  test_rayleigh (void)
  {
    return gsl_ran_rayleigh (r_global, 1.9);
***************
*** 1665,1670 ****
--- 1790,1801 ----
  }
  
  double
+ test_tdist1_cdf (double x)
+ {
+   return gsl_ran_tdist_cdf (x, 1.75);
+ }
+ 
+ double
  test_tdist2 (void)
  {
    return gsl_ran_tdist (r_global, 12.75);
***************
*** 1674,1679 ****
--- 1805,1816 ----
  test_tdist2_pdf (double x)
  {
    return gsl_ran_tdist_pdf (x, 12.75);
+ }
+ 
+ double
+ test_tdist2_cdf (double x)
+ {
+   return gsl_ran_tdist_cdf (x, 12.75);
  }
  
  

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: CDF diff
  2003-01-10 18:43 CDF diff Martin Jansche
@ 2003-01-14 22:02 ` Brian Gough
  2003-01-14 22:32 ` Brian Gough
  2003-01-15  3:31 ` Jason Hooper Stover
  2 siblings, 0 replies; 6+ messages in thread
From: Brian Gough @ 2003-01-14 22:02 UTC (permalink / raw)
  To: Martin Jansche; +Cc: Jason Hooper Stover, gsl-discuss

Martin Jansche writes:
 > I'm attaching a diff against the anoncvs version that adds some
 > cumulative distribution functions for the following distributions:
 > gamma family (gamma, chisq, exponential), geometric, binomial, beta,
 > tdist, cauchy, logistic, pareto. 

Thanks.  I'll forward this to Jason Stover who is developing the cdf
module.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: CDF diff
  2003-01-10 18:43 CDF diff Martin Jansche
  2003-01-14 22:02 ` Brian Gough
@ 2003-01-14 22:32 ` Brian Gough
  2003-01-15  3:31 ` Jason Hooper Stover
  2 siblings, 0 replies; 6+ messages in thread
From: Brian Gough @ 2003-01-14 22:32 UTC (permalink / raw)
  To: Martin Jansche; +Cc: gsl-discuss

Martin Jansche writes:
 > A few related questions and suggestions: could somebody explain the
 > rationale for the implementation of the Erlang distribution?  I
 > thought it was a discrete distribution, but maybe I'm confused.

Probably shouldn't be there, or should have had an int argument, it's
a duplicate of gamma, but since it's already in there and working I've
left it alone.

 > Finally, would it be worth the added overhead to have separate types
 > for the parameters of each distribution?

I like to stick to plain C types as far as possible.

Brian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: CDF diff
  2003-01-10 18:43 CDF diff Martin Jansche
  2003-01-14 22:02 ` Brian Gough
  2003-01-14 22:32 ` Brian Gough
@ 2003-01-15  3:31 ` Jason Hooper Stover
  2003-01-16  2:35   ` Martin Jansche
  2 siblings, 1 reply; 6+ messages in thread
From: Jason Hooper Stover @ 2003-01-15  3:31 UTC (permalink / raw)
  To: Martin Jansche; +Cc: gsl-discuss


Martin,

Thanks for the code, I'll work on adding it to the cdf directory.

> 
> A few related questions and suggestions: could somebody explain the
> rationale for the implementation of the Erlang distribution?  I
> thought it was a discrete distribution, but maybe I'm confused.

According to Kendall&Stuart(&Ord), the Erlang distribution is
another name for the Gamma distribution. They say "Erlang distribution"
is the name sometimes used in the queueing theory literature.

> 
> Second, it might be useful to have a functions that compute a definite
> integral over a pdf with lower and upper bounds supplied by arguments.
> This could be in addition to or in place of the _cdf functions.  The
> reason for this is (discrete) distributions with no closed-form
> formula for the cumulative density/mass.  So the design questions is,
> should one have functions of the form foo_cdf(x,...) where x is the
> upper bound of the sum/integral, or rather foo_cdf(a,b,...) that
> return \sum_{x=a}^b foo_pdf(x,...) (resp. \int)?

The cdf module computes functions of the form Pr(Z<x) and
Pr(Z>x) for a random variable Z, so the argument to the functions
is either the upper or lower bound of the integral. 
I hadn't planned to compute Pr(a<Z<b) directly since such
a function isn't a "cumulative distribution function" 
according to the common definition. Also, users can 
compute Pr(a<Z<b) as Pr(Z<b)-Pr(Z<a).

> 
> Finally, would it be worth the added overhead to have separate types
> for the parameters of each distribution?  For example, the current
> gamma_pdf takes a shape parameter a and a scale parameter b; one could
> put them in a struct and have several constructors (one in terms of
> shape and scale, one in terms of shape in inverse scale, one in terms
> of sample moments, etc.) that return such a struct, which could then
> be passed to the _pdf(), _cdf() and variate functions.  I'd shy away
> from using macros for this, since the macro could only be invoked
> inside the argument list of a function call expression.

My initial reaction is that the added overhead wouldn't be worth the
effort for a library like GSL. E.g., such an approach might make sense 
for a higher-level program that runs statistical
routines for the exponential family. But I could be wrong; maybe it
would make sense in GSL. If you have some coded examples of this 
kind of thing, send them along.

-Jason

> 
> Thanks,
> - martin

> Index: beta.c
> ===================================================================
> RCS file: /cvs/gsl/gsl/randist/beta.c,v
> retrieving revision 1.10
> diff -c -r1.10 beta.c
> *** beta.c	23 Apr 2001 09:38:28 -0000	1.10
> --- beta.c	10 Jan 2003 18:09:47 -0000
> ***************
> *** 58,60 ****
> --- 58,77 ----
>         return p;
>       }
>   }
> + 
> + double
> + gsl_ran_beta_cdf (const double x, const double a, const double b)
> + {
> +   if (x <= 0)
> +     {
> +       return 0;
> +     }
> +   else if (x >= 1)
> +     {
> +       return 1;
> +     }
> +   else
> +     {
> +       return gsl_sf_beta_inc (a, b, x);
> +     }
> + }
> Index: binomial.c
> ===================================================================
> RCS file: /cvs/gsl/gsl/randist/binomial.c,v
> retrieving revision 1.13
> diff -c -r1.13 binomial.c
> *** binomial.c	24 Apr 2001 17:26:42 -0000	1.13
> --- binomial.c	10 Jan 2003 18:09:47 -0000
> ***************
> *** 85,87 ****
> --- 85,101 ----
>         return P;
>       }
>   }
> + 
> + double
> + gsl_ran_binomial_cdf (const unsigned int k, const double p, 
> + 		      const unsigned int n)
> + {
> +   if (k >= n)
> +     {
> +       return 1;
> +     }
> +   else 
> +     {
> +       return 1 - gsl_sf_beta_inc (k+1, n-k, p);
> +     }
> + }
> Index: cauchy.c
> ===================================================================
> RCS file: /cvs/gsl/gsl/randist/cauchy.c,v
> retrieving revision 1.10
> diff -c -r1.10 cauchy.c
> *** cauchy.c	17 Apr 2001 19:35:56 -0000	1.10
> --- cauchy.c	10 Jan 2003 18:09:47 -0000
> ***************
> *** 49,51 ****
> --- 49,57 ----
>     double p = (1 / (M_PI * a)) / (1 + u * u);
>     return p;
>   }
> + 
> + double
> + gsl_ran_cauchy_cdf (const double x, const double a)
> + {
> +   return 0.5 + M_1_PI * atan (x/a);
> + }
> Index: chisq.c
> ===================================================================
> RCS file: /cvs/gsl/gsl/randist/chisq.c,v
> retrieving revision 1.11
> diff -c -r1.11 chisq.c
> *** chisq.c	23 Apr 2001 09:38:28 -0000	1.11
> --- chisq.c	10 Jan 2003 18:09:47 -0000
> ***************
> *** 19,24 ****
> --- 19,25 ----
>   
>   #include <config.h>
>   #include <math.h>
> + #include <gsl/gsl_sf_erf.h>
>   #include <gsl/gsl_sf_gamma.h>
>   #include <gsl/gsl_rng.h>
>   #include <gsl/gsl_randist.h>
> ***************
> *** 50,54 ****
> --- 51,74 ----
>         
>         p = exp ((nu / 2 - 1) * log (x/2) - x/2 - lngamma) / 2;
>         return p;
> +     }
> + }
> + 
> + double
> + gsl_ran_chisq_cdf (const double x, const double nu)
> + {
> +   if (x <= 0)
> +     {
> +       return 0;
> +     }
> +   else if (0 && nu == 1)
> +     {
> +       /* It might make sense to treat the case of a single Gaussian
> + 	 specially. */
> +       return gsl_sf_erf ( sqrt (x/2) );
> +     }
> +   else
> +     {
> +       return gsl_sf_gamma_inc_P (nu/2, x/2);
>       }
>   }
> Index: exponential.c
> ===================================================================
> RCS file: /cvs/gsl/gsl/randist/exponential.c,v
> retrieving revision 1.11
> diff -c -r1.11 exponential.c
> *** exponential.c	4 May 2000 11:25:06 -0000	1.11
> --- exponential.c	10 Jan 2003 18:09:47 -0000
> ***************
> *** 19,24 ****
> --- 19,25 ----
>   
>   #include <config.h>
>   #include <math.h>
> + #include <gsl/gsl_math.h>
>   #include <gsl/gsl_rng.h>
>   #include <gsl/gsl_randist.h>
>   
> ***************
> *** 45,52 ****
>       }
>     else
>       {
> !       double p = exp (-x/mu)/mu;
>         
>         return p;
>       }
>   }
> --- 46,67 ----
>       }
>     else
>       {
> !       double p = exp (-x/mu) / mu;
>         
>         return p;
> +     }
> + }
> + 
> + double
> + gsl_ran_exponential_cdf (const double x, const double mu)
> + {
> +   if (x <= 0)
> +     {
> +       return 0 ;
> +     }
> +   else
> +     {
> +       /* return 1 - exp (-x/mu); */
> +       return fabs ( gsl_expm1 (-x/mu) );
>       }
>   }
> Index: gamma.c
> ===================================================================
> RCS file: /cvs/gsl/gsl/randist/gamma.c,v
> retrieving revision 1.20
> diff -c -r1.20 gamma.c
> *** gamma.c	23 Apr 2001 09:38:28 -0000	1.20
> --- gamma.c	10 Jan 2003 18:09:47 -0000
> ***************
> *** 138,143 ****
> --- 138,146 ----
>     return x;
>   }
>   
> + #define GSL_RAN_GAMMA_SHAPE_SCALE(shape,scale) (shape), 1/(scale)
> + #define GSL_RAN_GAMMA_SHAPE_INVSCALE(shape,invscale) (shape), (invscale)
> + 
>   double
>   gsl_ran_gamma_pdf (const double x, const double a, const double b)
>   {
> ***************
> *** 156,166 ****
>       {
>         return exp(-x/b)/b ;
>       }
> !   else 
>       {
>         double p;
>         double lngamma = gsl_sf_lngamma (a);
>         p = exp ((a - 1) * log (x/b) - x/b - lngamma)/b;
>         return p;
>       }
>   }
> --- 159,187 ----
>       {
>         return exp(-x/b)/b ;
>       }
> !   else
>       {
>         double p;
>         double lngamma = gsl_sf_lngamma (a);
>         p = exp ((a - 1) * log (x/b) - x/b - lngamma)/b;
>         return p;
> +     }
> + }
> + 
> + double
> + gsl_ran_gamma_cdf (const double x, const double a, const double b)
> + {
> +   if (x <= 0)
> +     {
> +       return 0 ;
> +     }
> +   else if (a == 1)
> +     /* exponential_cdf, see exponential.c */
> +     {
> +       return fabs ( gsl_expm1 (-x/b) );
> +     }
> +   else
> +     {
> +       return gsl_sf_gamma_inc_P (a, x/b);
>       }
>   }
> Index: gauss.c
> ===================================================================
> RCS file: /cvs/gsl/gsl/randist/gauss.c,v
> retrieving revision 1.19
> diff -c -r1.19 gauss.c
> *** gauss.c	4 Jun 2002 22:04:43 -0000	1.19
> --- gauss.c	10 Jan 2003 18:09:47 -0000
> ***************
> *** 22,27 ****
> --- 22,28 ----
>   #include <gsl/gsl_math.h>
>   #include <gsl/gsl_rng.h>
>   #include <gsl/gsl_randist.h>
> + #include <gsl/gsl_sf_erf.h>
>   
>   /* Of the two methods provided below, I think the Polar method is more
>    * efficient, but only when you are actually producing two random
> ***************
> *** 89,94 ****
> --- 90,102 ----
>     double u = x / fabs (sigma);
>     double p = (1 / (sqrt (2 * M_PI) * fabs (sigma))) * exp (-u * u / 2);
>     return p;
> + }
> + 
> + double
> + gsl_ran_gaussian_cdf (const double x, const double sigma)
> + {
> +   double u = x / fabs (sigma);
> +   return (1 + gsl_sf_erf (M_SQRT1_2 * u)) / 2;
>   }
>   
>   double
> Index: geometric.c
> ===================================================================
> RCS file: /cvs/gsl/gsl/randist/geometric.c,v
> retrieving revision 1.9
> diff -c -r1.9 geometric.c
> *** geometric.c	4 May 2000 11:25:06 -0000	1.9
> --- geometric.c	10 Jan 2003 18:09:47 -0000
> ***************
> *** 19,24 ****
> --- 19,25 ----
>   
>   #include <config.h>
>   #include <math.h>
> + #include <gsl/gsl_math.h>
>   #include <gsl/gsl_rng.h>
>   #include <gsl/gsl_randist.h>
>   
> ***************
> *** 52,67 ****
>   gsl_ran_geometric_pdf (const unsigned int k, const double p)
>   {
>     if (k == 0)
> !     {
> !       return 0 ;
> !     }
> !   else if (k == 1)
> !     {
> !       return p ;
> !     }
>     else
> !     {
> !       double P = p * pow (1 - p, k - 1.0);
> !       return P;
> !     }
>   }
> --- 53,68 ----
>   gsl_ran_geometric_pdf (const unsigned int k, const double p)
>   {
>     if (k == 0)
> !       return 0;
>     else
> !     return p * gsl_pow_int (1-p, k-1);
> ! }
> ! 
> ! double
> ! gsl_ran_geometric_cdf (const unsigned int k, const double p)
> ! {
> !   if (k == 0)
> !     return 0;
> !   else
> !     return 1 - gsl_pow_int (1-p, k);
>   }
> Index: gsl_randist.h
> ===================================================================
> RCS file: /cvs/gsl/gsl/randist/gsl_randist.h,v
> retrieving revision 1.39
> diff -c -r1.39 gsl_randist.h
> *** gsl_randist.h	10 Dec 2002 19:06:57 -0000	1.39
> --- gsl_randist.h	10 Jan 2003 18:09:48 -0000
> ***************
> *** 38,58 ****
> --- 38,63 ----
>   
>   double gsl_ran_beta (const gsl_rng * r, const double a, const double b);
>   double gsl_ran_beta_pdf (const double x, const double a, const double b);
> + double gsl_ran_beta_cdf (const double x, const double a, const double b);
>   
>   unsigned int gsl_ran_binomial (const gsl_rng * r, double p, unsigned int n);
>   double gsl_ran_binomial_pdf (const unsigned int k, const double p, const unsigned int n);
> + double gsl_ran_binomial_cdf (const unsigned int k, const double p, const unsigned int n);
>   
>   double gsl_ran_exponential (const gsl_rng * r, const double mu);
>   double gsl_ran_exponential_pdf (const double x, const double mu);
> + double gsl_ran_exponential_cdf (const double x, const double mu);
>   
>   double gsl_ran_exppow (const gsl_rng * r, const double a, const double b);
>   double gsl_ran_exppow_pdf (const double x, const double a, const double b);
>   
>   double gsl_ran_cauchy (const gsl_rng * r, const double a);
>   double gsl_ran_cauchy_pdf (const double x, const double a);
> + double gsl_ran_cauchy_cdf (const double x, const double a);
>   
>   double gsl_ran_chisq (const gsl_rng * r, const double nu);
>   double gsl_ran_chisq_pdf (const double x, const double nu);
> + double gsl_ran_chisq_cdf (const double x, const double nu);
>   
>   void gsl_ran_dirichlet (const gsl_rng * r, const size_t K, const double alpha[], double theta[]);
>   double gsl_ran_dirichlet_pdf (const size_t K, const double alpha[], const double theta[]);
> ***************
> *** 70,79 ****
> --- 75,86 ----
>   double gsl_ran_gamma (const gsl_rng * r, const double a, const double b);
>   double gsl_ran_gamma_int (const gsl_rng * r, const unsigned int a);
>   double gsl_ran_gamma_pdf (const double x, const double a, const double b);
> + double gsl_ran_gamma_cdf (const double x, const double a, const double b);
>   
>   double gsl_ran_gaussian (const gsl_rng * r, const double sigma);
>   double gsl_ran_gaussian_ratio_method (const gsl_rng * r, const double sigma);
>   double gsl_ran_gaussian_pdf (const double x, const double sigma);
> + double gsl_ran_gaussian_cdf (const double x, const double sigma);
>   
>   double gsl_ran_ugaussian (const gsl_rng * r);
>   double gsl_ran_ugaussian_ratio_method (const gsl_rng * r);
> ***************
> *** 93,98 ****
> --- 100,106 ----
>   
>   unsigned int gsl_ran_geometric (const gsl_rng * r, const double p);
>   double gsl_ran_geometric_pdf (const unsigned int k, const double p);
> + double gsl_ran_geometric_cdf (const unsigned int k, const double p);
>   
>   unsigned int gsl_ran_hypergeometric (const gsl_rng * r, unsigned int n1, unsigned int n2, unsigned int t);
>   double gsl_ran_hypergeometric_pdf (const unsigned int k, const unsigned int n1, const unsigned int n2, unsigned int t);
> ***************
> *** 105,110 ****
> --- 113,119 ----
>   
>   double gsl_ran_logistic (const gsl_rng * r, const double a);
>   double gsl_ran_logistic_pdf (const double x, const double a);
> + double gsl_ran_logistic_cdf (const double x, const double a);
>   
>   double gsl_ran_lognormal (const gsl_rng * r, const double zeta, const double sigma);
>   double gsl_ran_lognormal_pdf (const double x, const double zeta, const double sigma);
> ***************
> *** 129,134 ****
> --- 138,144 ----
>   
>   double gsl_ran_pareto (const gsl_rng * r, double a, const double b);
>   double gsl_ran_pareto_pdf (const double x, const double a, const double b);
> + double gsl_ran_pareto_cdf (const double x, const double a, const double b);
>   
>   unsigned int gsl_ran_poisson (const gsl_rng * r, double mu);
>   void gsl_ran_poisson_array (const gsl_rng * r, size_t n, unsigned int array[],
> ***************
> *** 143,148 ****
> --- 153,159 ----
>   
>   double gsl_ran_tdist (const gsl_rng * r, const double nu);
>   double gsl_ran_tdist_pdf (const double x, const double nu);
> + double gsl_ran_tdist_cdf (const double x, const double nu);
>   
>   double gsl_ran_laplace (const gsl_rng * r, const double a);
>   double gsl_ran_laplace_pdf (const double x, const double a);
> Index: logistic.c
> ===================================================================
> RCS file: /cvs/gsl/gsl/randist/logistic.c,v
> retrieving revision 1.11
> diff -c -r1.11 logistic.c
> *** logistic.c	17 Apr 2001 19:35:56 -0000	1.11
> --- logistic.c	10 Jan 2003 18:09:48 -0000
> ***************
> *** 51,53 ****
> --- 51,59 ----
>     double p = u / (fabs(a) * (1 + u) * (1 + u));
>     return p;
>   }
> + 
> + double
> + gsl_ran_logistic_cdf (const double x, const double a)
> + {
> +   return 1 / (1 + exp(-x/fabs(a)));
> + }
> Index: pareto.c
> ===================================================================
> RCS file: /cvs/gsl/gsl/randist/pareto.c,v
> retrieving revision 1.9
> diff -c -r1.9 pareto.c
> *** pareto.c	21 Sep 2000 19:19:21 -0000	1.9
> --- pareto.c	10 Jan 2003 18:09:48 -0000
> ***************
> *** 52,54 ****
> --- 52,59 ----
>       }
>   }
>   
> + double
> + gsl_ran_pareto_cdf (const double x, const double a, const double b)
> + {
> +   return (x >= b)? 1 - pow (b/x, a) : 0;
> + }
> Index: tdist.c
> ===================================================================
> RCS file: /cvs/gsl/gsl/randist/tdist.c,v
> retrieving revision 1.12
> diff -c -r1.12 tdist.c
> *** tdist.c	23 Apr 2001 09:38:28 -0000	1.12
> --- tdist.c	10 Jan 2003 18:09:48 -0000
> ***************
> *** 77,80 ****
>     return p;
>   }
>   
> ! 
> --- 77,93 ----
>     return p;
>   }
>   
> ! double
> ! gsl_ran_tdist_cdf (const double x, const double nu)
> ! {
> !   if (x == 0)
> !     {
> !       return 0.5;
> !     }
> !   else {
> !     double d = 0.5 * gsl_sf_beta_inc (0.5*nu, 0.5, nu/(nu + x*x));
> !     /* d as a function of x is symmetric about zero, but has a maximum
> !        there; adjust as follows(?): */
> !     return (x < 0)? d : 1 - d;
> !   }
> ! }
> Index: test.c
> ===================================================================
> RCS file: /cvs/gsl/gsl/randist/test.c,v
> retrieving revision 1.37
> diff -c -r1.37 test.c
> *** test.c	10 Dec 2002 19:06:58 -0000	1.37
> --- test.c	10 Jan 2003 18:09:48 -0000
> ***************
> *** 35,58 ****
>   
>   void testMoments (double (*f) (void), const char *name,
>   		  double a, double b, double p);
> ! void testPDF (double (*f) (void), double (*pdf) (double), const char *name);
>   void testDiscretePDF (double (*f) (void), double (*pdf) (unsigned int),
> ! 		      const char *name);
>   
>   void test_shuffle (void);
>   void test_choose (void);
>   double test_beta (void);
>   double test_beta_pdf (double x);
>   double test_bernoulli (void);
>   double test_bernoulli_pdf (unsigned int n);
>   double test_binomial (void);
>   double test_binomial_pdf (unsigned int n);
>   double test_binomial_large (void);
>   double test_binomial_large_pdf (unsigned int n);
>   double test_cauchy (void);
>   double test_cauchy_pdf (double x);
>   double test_chisq (void);
>   double test_chisq_pdf (double x);
>   double test_dirichlet (void);
>   double test_dirichlet_pdf (double x);
>   void test_dirichlet_moments (void);
> --- 35,64 ----
>   
>   void testMoments (double (*f) (void), const char *name,
>   		  double a, double b, double p);
> ! void testPDF (double (*f) (void), double (*pdf) (double),
> ! 	      double (*cdf) (double), const char *name);
>   void testDiscretePDF (double (*f) (void), double (*pdf) (unsigned int),
> ! 		      double (*cdf) (unsigned int), const char *name);
>   
>   void test_shuffle (void);
>   void test_choose (void);
>   double test_beta (void);
>   double test_beta_pdf (double x);
> + double test_beta_cdf (double x);
>   double test_bernoulli (void);
>   double test_bernoulli_pdf (unsigned int n);
>   double test_binomial (void);
>   double test_binomial_pdf (unsigned int n);
> + double test_binomial_cdf (unsigned int n);
>   double test_binomial_large (void);
>   double test_binomial_large_pdf (unsigned int n);
> + double test_binomial_large_cdf (unsigned int n);
>   double test_cauchy (void);
>   double test_cauchy_pdf (double x);
> + double test_cauchy_cdf (double x);
>   double test_chisq (void);
>   double test_chisq_pdf (double x);
> + double test_chisq_cdf (double x);
>   double test_dirichlet (void);
>   double test_dirichlet_pdf (double x);
>   void test_dirichlet_moments (void);
> ***************
> *** 64,69 ****
> --- 70,76 ----
>   double test_erlang_pdf (double x);
>   double test_exponential (void);
>   double test_exponential_pdf (double x);
> + double test_exponential_cdf (double x);
>   double test_exppow0 (void);
>   double test_exppow0_pdf (double x);
>   double test_exppow1 (void);
> ***************
> *** 80,93 ****
> --- 87,103 ----
>   double test_flat_pdf (double x);
>   double test_gamma (void);
>   double test_gamma_pdf (double x);
> + double test_gamma_cdf (double x);
>   double test_gamma1 (void);
>   double test_gamma1_pdf (double x);
> + double test_gamma1_cdf (double x);
>   double test_gamma_int (void);
>   double test_gamma_int_pdf (double x);
>   double test_gamma_large (void);
>   double test_gamma_large_pdf (double x);
>   double test_gaussian (void);
>   double test_gaussian_pdf (double x);
> + double test_gaussian_cdf (double x);
>   double test_gaussian_ratio_method (void);
>   double test_gaussian_ratio_method_pdf (double x);
>   double test_gaussian_tail (void);
> ***************
> *** 116,123 ****
> --- 126,135 ----
>   double test_gumbel2_pdf (double x);
>   double test_geometric (void);
>   double test_geometric_pdf (unsigned int x);
> + double test_geometric_cdf (unsigned int x);
>   double test_geometric1 (void);
>   double test_geometric1_pdf (unsigned int x);
> + double test_geometric1_cdf (unsigned int x);
>   double test_hypergeometric1 (void);
>   double test_hypergeometric1_pdf (unsigned int x);
>   double test_hypergeometric2 (void);
> ***************
> *** 154,159 ****
> --- 166,172 ----
>   double test_levy_skew2b_pdf (double x);
>   double test_logistic (void);
>   double test_logistic_pdf (double x);
> + double test_logistic_cdf (double x);
>   double test_lognormal (void);
>   double test_lognormal_pdf (double x);
>   double test_logarithmic (void);
> ***************
> *** 169,174 ****
> --- 182,188 ----
>   double test_pascal_pdf (unsigned int n);
>   double test_pareto (void);
>   double test_pareto_pdf (double x);
> + double test_pareto_cdf (double x);
>   double test_poisson (void);
>   double test_poisson_pdf (unsigned int x);
>   double test_poisson_large (void);
> ***************
> *** 189,196 ****
> --- 203,212 ----
>   double test_rayleigh_tail_pdf (double x);
>   double test_tdist1 (void);
>   double test_tdist1_pdf (double x);
> + double test_tdist1_cdf (double x);
>   double test_tdist2 (void);
>   double test_tdist2_pdf (double x);
> + double test_tdist2_cdf (double x);
>   double test_laplace (void);
>   double test_laplace_pdf (double x);
>   double test_weibull (void);
> ***************
> *** 209,215 ****
>     r_global = gsl_rng_alloc (gsl_rng_default);
>   
>   #define FUNC(x)  test_ ## x,                     "test gsl_ran_" #x
> ! #define FUNC2(x) test_ ## x, test_ ## x ## _pdf, "test gsl_ran_" #x
>   
>     test_shuffle ();
>     test_choose ();
> --- 225,233 ----
>     r_global = gsl_rng_alloc (gsl_rng_default);
>   
>   #define FUNC(x)  test_ ## x,                     "test gsl_ran_" #x
> ! #define FUNC2(x) test_ ## x, test_ ## x ## _pdf, NULL, "test gsl_ran_" #x
> ! #define FUNC3(x) test_ ## x, test_ ## x ## _pdf, test_ ## x ## _cdf, \
> ! 		 "test gsl_ran_" #x
>   
>     test_shuffle ();
>     test_choose ();
> ***************
> *** 228,239 ****
>     test_dirichlet_moments ();
>     test_multinomial_moments ();
>   
> !   testPDF (FUNC2 (beta));
> !   testPDF (FUNC2 (cauchy));
> !   testPDF (FUNC2 (chisq));
>     testPDF (FUNC2 (dirichlet));
>     testPDF (FUNC2 (erlang));
> !   testPDF (FUNC2 (exponential));
>   
>     testPDF (FUNC2 (exppow0));
>     testPDF (FUNC2 (exppow1));
> --- 246,257 ----
>     test_dirichlet_moments ();
>     test_multinomial_moments ();
>   
> !   testPDF (FUNC3 (beta));	/* CDF available */
> !   testPDF (FUNC3 (cauchy));	/* CDF available */
> !   testPDF (FUNC3 (chisq));	/* CDF available */
>     testPDF (FUNC2 (dirichlet));
>     testPDF (FUNC2 (erlang));
> !   testPDF (FUNC3 (exponential));/* CDF available */
>   
>     testPDF (FUNC2 (exppow0));
>     testPDF (FUNC2 (exppow1));
> ***************
> *** 243,253 ****
>   
>     testPDF (FUNC2 (fdist));
>     testPDF (FUNC2 (flat));
> !   testPDF (FUNC2 (gamma));
> !   testPDF (FUNC2 (gamma1));
>     testPDF (FUNC2 (gamma_int));
>     testPDF (FUNC2 (gamma_large));
> !   testPDF (FUNC2 (gaussian));
>     testPDF (FUNC2 (gaussian_ratio_method));
>     testPDF (FUNC2 (ugaussian));
>     testPDF (FUNC2 (ugaussian_ratio_method));
> --- 261,271 ----
>   
>     testPDF (FUNC2 (fdist));
>     testPDF (FUNC2 (flat));
> !   testPDF (FUNC3 (gamma));	/* CDF available */
> !   testPDF (FUNC3 (gamma1));	/* CDF available */
>     testPDF (FUNC2 (gamma_int));
>     testPDF (FUNC2 (gamma_large));
> !   testPDF (FUNC3 (gaussian));  /* CDF available */
>     testPDF (FUNC2 (gaussian_ratio_method));
>     testPDF (FUNC2 (ugaussian));
>     testPDF (FUNC2 (ugaussian_ratio_method));
> ***************
> *** 274,286 ****
>     testPDF (FUNC2 (levy_skew2a));
>     testPDF (FUNC2 (levy_skew1b));
>     testPDF (FUNC2 (levy_skew2b));
> !   testPDF (FUNC2 (logistic));
>     testPDF (FUNC2 (lognormal));
> !   testPDF (FUNC2 (pareto));
>     testPDF (FUNC2 (rayleigh));
>     testPDF (FUNC2 (rayleigh_tail));
> !   testPDF (FUNC2 (tdist1));
> !   testPDF (FUNC2 (tdist2));
>     testPDF (FUNC2 (laplace));
>     testPDF (FUNC2 (weibull));
>     testPDF (FUNC2 (weibull1));
> --- 292,304 ----
>     testPDF (FUNC2 (levy_skew2a));
>     testPDF (FUNC2 (levy_skew1b));
>     testPDF (FUNC2 (levy_skew2b));
> !   testPDF (FUNC3 (logistic));	/* CDF available */
>     testPDF (FUNC2 (lognormal));
> !   testPDF (FUNC3 (pareto));	/* CDF available */
>     testPDF (FUNC2 (rayleigh));
>     testPDF (FUNC2 (rayleigh_tail));
> !   testPDF (FUNC3 (tdist1));	/* CDF available */
> !   testPDF (FUNC3 (tdist2));	/* CDF available */
>     testPDF (FUNC2 (laplace));
>     testPDF (FUNC2 (weibull));
>     testPDF (FUNC2 (weibull1));
> ***************
> *** 296,305 ****
>     testDiscretePDF (FUNC2 (poisson));
>     testDiscretePDF (FUNC2 (poisson_large));
>     testDiscretePDF (FUNC2 (bernoulli));
> !   testDiscretePDF (FUNC2 (binomial));
> !   testDiscretePDF (FUNC2 (binomial_large));
> !   testDiscretePDF (FUNC2 (geometric));
> !   testDiscretePDF (FUNC2 (geometric1));
>     testDiscretePDF (FUNC2 (hypergeometric1));
>     testDiscretePDF (FUNC2 (hypergeometric2));
>     testDiscretePDF (FUNC2 (hypergeometric3));
> --- 314,323 ----
>     testDiscretePDF (FUNC2 (poisson));
>     testDiscretePDF (FUNC2 (poisson_large));
>     testDiscretePDF (FUNC2 (bernoulli));
> !   testDiscretePDF (FUNC3 (binomial));  /* CDF available */
> !   testDiscretePDF (FUNC3 (binomial_large));  /* CDF available */
> !   testDiscretePDF (FUNC3 (geometric));  /* CDF available */
> !   testDiscretePDF (FUNC3 (geometric1));  /* CDF available */
>     testDiscretePDF (FUNC2 (hypergeometric1));
>     testDiscretePDF (FUNC2 (hypergeometric2));
>     testDiscretePDF (FUNC2 (hypergeometric3));
> ***************
> *** 434,444 ****
>   #define BINS 100
>   
>   void
> ! testPDF (double (*f) (void), double (*pdf) (double), const char *name)
>   {
>     double count[BINS], p[BINS];
> !   double a = -5.0, b = +5.0;
> !   double dx = (b - a) / BINS;
>     int i, j, status = 0, status_i = 0;
>   
>     for (i = 0; i < BINS; i++)
> --- 452,466 ----
>   #define BINS 100
>   
>   void
> ! testPDF (double (*f) (void),
> ! 	 double (*pdf) (double),
> ! 	 double (*cdf) (double),
> ! 	 const char *name)
>   {
>     double count[BINS], p[BINS];
> !   const double a = -5.0, b = +5.0;
> !   const double dx = (b - a) / BINS;
> !   double integral = 0;
>     int i, j, status = 0, status_i = 0;
>   
>     for (i = 0; i < BINS; i++)
> ***************
> *** 470,475 ****
> --- 492,515 ----
>   	sum += pdf (x + j * dx / STEPS);
>   
>         p[i] = 0.5 * (pdf (x) + 2 * sum + pdf (x + dx - 1e-7)) * dx / STEPS;
> +       integral += p[i];
> + 
> +       if (cdf != NULL)
> + 	/* leave this test in place as long as some _cdf() functions
> + 	   are missing */
> + 	{
> + 	  /* check whether the pdf integrates up to the cdf */
> + 	  double exact_integral = cdf (x + dx) - cdf (x);
> + 	  gsl_test_rel(exact_integral, p[i], 1e-3,
> + 		       "%s integral on [%g,%g)", name, x, x + dx);
> + 	}
> +     }
> + 
> +   if (cdf != NULL)
> +     {
> +       double exact_integral = cdf (b) - cdf (a);
> +       gsl_test_rel(exact_integral, integral, 1e-6,
> + 		   "%s integral on [%g,%g)", name, a, b);
>       }
>   
>     for (i = 0; i < BINS; i++)
> ***************
> *** 498,507 ****
>   
>   void
>   testDiscretePDF (double (*f) (void), double (*pdf) (unsigned int),
> ! 		 const char *name)
>   {
>     double count[BINS], p[BINS];
>     unsigned int i;
>     int status = 0, status_i = 0;
>   
>     for (i = 0; i < BINS; i++)
> --- 538,548 ----
>   
>   void
>   testDiscretePDF (double (*f) (void), double (*pdf) (unsigned int),
> ! 		 double (*cdf) (unsigned int), const char *name)
>   {
>     double count[BINS], p[BINS];
>     unsigned int i;
> +   double sum = 0;
>     int status = 0, status_i = 0;
>   
>     for (i = 0; i < BINS; i++)
> ***************
> *** 514,521 ****
>   	count[r]++;
>       }
>   
> !   for (i = 0; i < BINS; i++)
>       p[i] = pdf (i);
>   
>     for (i = 0; i < BINS; i++)
>       {
> --- 555,566 ----
>   	count[r]++;
>       }
>   
> !   for (i = 0; i < BINS; i++) {
>       p[i] = pdf (i);
> +     sum += p[i];
> +     if (cdf != NULL)
> +       gsl_test_rel(cdf(i), sum, 1e-6, "%s sum_0^%d", name, i);
> +   }
>   
>     for (i = 0; i < BINS; i++)
>       {
> ***************
> *** 553,558 ****
> --- 598,609 ----
>   {
>     return gsl_ran_beta_pdf (x, 2.0, 3.0);
>   }
> + double
> + test_beta_cdf (double x)
> + {
> +   return gsl_ran_beta_cdf (x, 2.0, 3.0);
> + }
> + 
>   
>   double
>   test_bernoulli (void)
> ***************
> *** 580,585 ****
> --- 631,642 ----
>   }
>   
>   double
> + test_binomial_cdf (unsigned int n)
> + {
> +   return gsl_ran_binomial_cdf (n, 0.3, 5);
> + }
> + 
> + double
>   test_binomial_large (void)
>   {
>     return gsl_ran_binomial (r_global, 0.3, 55);
> ***************
> *** 590,595 ****
> --- 647,658 ----
>   {
>     return gsl_ran_binomial_pdf (n, 0.3, 55);
>   }
> + double
> + test_binomial_large_cdf (unsigned int n)
> + {
> +   return gsl_ran_binomial_cdf (n, 0.3, 55);
> + }
> + 
>   
>   double
>   test_cauchy (void)
> ***************
> *** 604,609 ****
> --- 667,679 ----
>   }
>   
>   double
> + test_cauchy_cdf (double x)
> + {
> +   return gsl_ran_cauchy_cdf (x, 2.0);
> + }
> + 
> + 
> + double
>   test_chisq (void)
>   {
>     return gsl_ran_chisq (r_global, 13.0);
> ***************
> *** 616,621 ****
> --- 686,697 ----
>   }
>   
>   double
> + test_chisq_cdf (double x)
> + {
> +   return gsl_ran_chisq_cdf (x, 13.0);
> + }
> + 
> + double
>   test_dir2d (void)
>   {
>     double x = 0, y = 0, theta;
> ***************
> *** 911,916 ****
> --- 987,998 ----
>   }
>   
>   double
> + test_exponential_cdf (double x)
> + {
> +   return gsl_ran_exponential_cdf (x, 2.0);
> + }
> + 
> + double
>   test_exppow0 (void)
>   {
>     return gsl_ran_exppow (r_global, 3.7, 0.3);
> ***************
> *** 1008,1013 ****
> --- 1090,1101 ----
>   }
>   
>   double
> + test_gamma_cdf (double x)
> + {
> +   return gsl_ran_gamma_cdf (x, 2.5, 2.17);
> + }
> + 
> + double
>   test_gamma1 (void)
>   {
>     return gsl_ran_gamma (r_global, 1.0, 2.17);
> ***************
> *** 1019,1024 ****
> --- 1107,1117 ----
>     return gsl_ran_gamma_pdf (x, 1.0, 2.17);
>   }
>   
> + double
> + test_gamma1_cdf (double x)
> + {
> +   return gsl_ran_gamma_cdf (x, 1.0, 2.17);
> + }
>   
>   double
>   test_gamma_int (void)
> ***************
> *** 1057,1062 ****
> --- 1150,1161 ----
>   {
>     return gsl_ran_gaussian_pdf (x, 3.0);
>   }
> + double
> + 
> + test_gaussian_cdf (double x)
> + {
> +   return gsl_ran_gaussian_cdf (x, 3.0);
> + }
>   
>   double
>   test_gaussian_ratio_method (void)
> ***************
> *** 1232,1237 ****
> --- 1331,1342 ----
>   }
>   
>   double
> + test_geometric_cdf (unsigned int n)
> + {
> +   return gsl_ran_geometric_cdf (n, 0.5);
> + }
> + 
> + double
>   test_geometric1 (void)
>   {
>     return gsl_ran_geometric (r_global, 1.0);
> ***************
> *** 1244,1249 ****
> --- 1349,1360 ----
>   }
>   
>   double
> + test_geometric1_cdf (unsigned int n)
> + {
> +   return gsl_ran_geometric_cdf (n, 1.0);
> + }
> + 
> + double
>   test_hypergeometric1 (void)
>   {
>     return gsl_ran_hypergeometric (r_global, 5, 7, 4);
> ***************
> *** 1491,1496 ****
> --- 1602,1614 ----
>   }
>   
>   double
> + test_logistic_cdf (double x)
> + {
> +   return gsl_ran_logistic_cdf (x, 3.1);
> + }
> + 
> + 
> + double
>   test_logarithmic (void)
>   {
>     return gsl_ran_logarithmic (r_global, 0.4);
> ***************
> *** 1532,1538 ****
>   double
>   test_multinomial_pdf (unsigned int n_0)
>   {
> !   /* The margional distribution of just 1 variate  is binomial. */
>     size_t K = 2;
>     /* Test use of weights instead of probabilities */
>     double p[] = { 0.4, 1.6};
> --- 1650,1656 ----
>   double
>   test_multinomial_pdf (unsigned int n_0)
>   {
> !   /* The marginal distribution of just 1 variate is binomial. */
>     size_t K = 2;
>     /* Test use of weights instead of probabilities */
>     double p[] = { 0.4, 1.6};
> ***************
> *** 1603,1608 ****
> --- 1721,1733 ----
>   }
>   
>   double
> + test_pareto_cdf (double x)
> + {
> +   return gsl_ran_pareto_cdf (x, 1.9, 2.75);
> + }
> + 
> + 
> + double
>   test_rayleigh (void)
>   {
>     return gsl_ran_rayleigh (r_global, 1.9);
> ***************
> *** 1665,1670 ****
> --- 1790,1801 ----
>   }
>   
>   double
> + test_tdist1_cdf (double x)
> + {
> +   return gsl_ran_tdist_cdf (x, 1.75);
> + }
> + 
> + double
>   test_tdist2 (void)
>   {
>     return gsl_ran_tdist (r_global, 12.75);
> ***************
> *** 1674,1679 ****
> --- 1805,1816 ----
>   test_tdist2_pdf (double x)
>   {
>     return gsl_ran_tdist_pdf (x, 12.75);
> + }
> + 
> + double
> + test_tdist2_cdf (double x)
> + {
> +   return gsl_ran_tdist_cdf (x, 12.75);
>   }
>   
>   

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: CDF diff
  2003-01-15  3:31 ` Jason Hooper Stover
@ 2003-01-16  2:35   ` Martin Jansche
  2003-01-16 14:05     ` Jason Hooper Stover
  0 siblings, 1 reply; 6+ messages in thread
From: Martin Jansche @ 2003-01-16  2:35 UTC (permalink / raw)
  To: Jason Hooper Stover; +Cc: gsl-discuss

On Tue, 14 Jan 2003, Jason Hooper Stover wrote:

> I hadn't planned to compute Pr(a<Z<b) directly since such
> a function isn't a "cumulative distribution function"
> according to the common definition. Also, users can
> compute Pr(a<Z<b) as Pr(Z<b)-Pr(Z<a).

True, but what if I need Pr(a<Z<b) for a discrete distribution where
Pr(Z<b) has to be computed via an explicit summation (or does that
never happen?) and a>>0?  OK, I guess this could be pointed out in
documentation and one could then do the summation in user code.

> effort for a library like GSL. E.g., such an approach might make
> sense for a higher-level program that runs statistical routines
> for the exponential family.  But I could be wrong; maybe it would
> make sense in GSL.  If you have some coded examples of this kind
> of thing, send them along.

For example, you could define a type

typedef struct {
  double shape;
  double scale;
  double pdf_lognorm;
  double pdf_x0;
} gsl_ran_gamma_t;

with a corresponding constructor

gsl_ran_gamma_t *
gsl_ran_gamma_alloc (const double shape,
                     const double inverse_scale)
{
  gsl_ran_gamma_t *p;
  double scale;

  if (shape <= 0 || inverse_scale <= 0)
    GSL_ERROR_VAL ("requires shape > 0, inverse_scale > 0",
                   GSL_EDOM, NULL);

  p = malloc (sizeof (gsl_ran_gamma_t));
  if (p == NULL)
    GSL_ERROR_VAL ("failed to allocate space for ran_gamma struct",
                   GSL_ENOMEM, NULL);

  scale = 1 / inverse_scale;
  p->shape = shape;
  p->scale = scale;

  if (shape == 1)
    {
      p->pdf_lognorm = log (scale);
      p->pdf_x0 = scale;
    }
  else
    {
      p->pdf_lognorm = shape * log (scale) - gsl_sf_lngamma (shape);
      p->pdf_x0 = 0;
    }
  return p;
}

Then you can define gamma_pdf and gamma_cdf like this:

double
gsl_ran_gamma_pdf (const double x, gsl_ran_gamma_t *params)
{
  if (x < 0)
    {
      return 0;
    }
  else if (x == 0)
    {
      return params->pdf_x0;
    }
  else
    {
      double a = params->shape - 1;
      double logp = params->pdf_lognorm;
      logp -= params->scale * x;
      if (a != 0)
        logp += a * log (x);
      return exp (logp);
    }
}

double
gsl_ran_gamma_cdf (const double x, gsl_ran_gamma_t *params)
{
  if (x <= 0)
    {
      return 0;
    }
  else
    {
      double y = x * params->scale;
      if (params->shape == 1)
        return fabs (gsl_expm1 (-y));
      else
        return gsl_sf_gamma_inc_P (params->shape, y);
    }
}

This has the disadvantage of introducing yet another type and
allocator for each family, and so of course it breaks compatibility.
But it has several advantages *if you compute many values for the same
distribution with fixed parameters* (or if you repeatedly sample from
the same distribution), which is not too uncommon IMO:

  First, domain checks on the parameters have to be done exactly once
  in the _alloc() function.  This is an easy way to avoid code
  duplication, since the same checks would have to be carried out for
  the pdf, the cdf, and any other functionality that may be added in
  the future (inverse cdf, mode, etc.).

  Second, computing the normalization term is typically an expensive
  operation (e.g. binomial likelihood is easy to compute, but the
  corresponding Beta density is more costly), but does not depend on
  the value of x; in general, all values that are independent of x
  can be precomputed in the _alloc() function and stored in the
  parameter struct.  In the code above, lngamma(shape) is evaluated
  exactly once.

  Third, the functions now have a form in which they could be used
  directly in conjunction with gsl_function(_fdf), e.g. if a user
  wanted to use a one-dimensional root finding algorithm to compute
  inverse cdfs.

  Fourth, this format would make it easy to incorporate distributions
  that are simple reparametrizations of existing distributions.  For
  example, one could now have a function

    gsl_ran_gamma_t *gsl_ran_chisq_alloc(const double df);

  and simply not implement gsl_ran_chisq_pdf separately from
  gsl_ran_gamma_pdf.  (Or if this should be hidden from the user, it'd
  be easy to generate the illusion of separate functionality with
  #defines).

It would even be possible to do this in a somewhat object-oriented way
and include function pointers into the struct so that one would get a
a gamma density via:

  gsl_ran_gamma_t *g = gsl_ran_gamma_alloc (1, 1);
  double x = g->pdf (x, g);  /* presumably hidden behind a macro */

Notice that gsl_ran_gamma_alloc could have quietly initialized the pdf
field with a pointer to a function that computes an exponential
density (because shape == 1).

One potentail counterargument against this design might be that it
makes it more difficult for the user to implement likelihood based
computations, since the parameters of the distribution would
presumably change frequently.  I wouldn't consider this a valid
objection though, since for efficiently computing likelihood for many
samples the code has to be structured very differently anyway.

- martin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: CDF diff
  2003-01-16  2:35   ` Martin Jansche
@ 2003-01-16 14:05     ` Jason Hooper Stover
  0 siblings, 0 replies; 6+ messages in thread
From: Jason Hooper Stover @ 2003-01-16 14:05 UTC (permalink / raw)
  To: Martin Jansche; +Cc: gsl-discuss

On Wed, Jan 15, 2003 at 09:34:55PM -0500, Martin Jansche wrote:
> On Tue, 14 Jan 2003, Jason Hooper Stover wrote:
> 
> > I hadn't planned to compute Pr(a<Z<b) directly since such
> > a function isn't a "cumulative distribution function"
> > according to the common definition. Also, users can
> > compute Pr(a<Z<b) as Pr(Z<b)-Pr(Z<a).
> 
> True, but what if I need Pr(a<Z<b) for a discrete distribution where
> Pr(Z<b) has to be computed via an explicit summation (or does that
> never happen?) and a>>0?  OK, I guess this could be pointed out in
> documentation and one could then do the summation in user code.

Okay, now I see what you mean. I'd add code to handle this situation
but only after all the usual cdf's (computed as Pr(Z<t) or Pr(Z>t)
had been added to the cdf module.

I read about your design below. I agree with Brian Gough and
would like stick with basic types for the module. Your design 
looks good to me for some purposes, but that I can think of routines 
someone might write that uses cdf that your design might hinder. 
Since I don't know how exactly people will use cdf, I can't 
favor one user's design if it might hinder another's use 
of the module. The cost is the use of basic types everywhere, 
which might not be too elegant, but it does keep the library 
general and easier to learn to use. 

The cdf routines aren't in the release version 
yet anyway, and after they are, if users make it clear that 
such a design beats one which uses only basic types, I (or whoever 
maintains it) can change the code. Also, your design does
require users to learn the different types for each distribution, and
I have a hunch they won't want to learn any more than they have to
use the module.

-Jason

> > effort for a library like GSL. E.g., such an approach might make
> > sense for a higher-level program that runs statistical routines
> > for the exponential family.  But I could be wrong; maybe it would
> > make sense in GSL.  If you have some coded examples of this kind
> > of thing, send them along.
> 
> For example, you could define a type
> 
> typedef struct {
>   double shape;
>   double scale;
>   double pdf_lognorm;
>   double pdf_x0;
> } gsl_ran_gamma_t;
> 
> with a corresponding constructor
> 
> gsl_ran_gamma_t *
> gsl_ran_gamma_alloc (const double shape,
>                      const double inverse_scale)
> {
>   gsl_ran_gamma_t *p;
>   double scale;
> 
>   if (shape <= 0 || inverse_scale <= 0)
>     GSL_ERROR_VAL ("requires shape > 0, inverse_scale > 0",
>                    GSL_EDOM, NULL);
> 
>   p = malloc (sizeof (gsl_ran_gamma_t));
>   if (p == NULL)
>     GSL_ERROR_VAL ("failed to allocate space for ran_gamma struct",
>                    GSL_ENOMEM, NULL);
> 
>   scale = 1 / inverse_scale;
>   p->shape = shape;
>   p->scale = scale;
> 
>   if (shape == 1)
>     {
>       p->pdf_lognorm = log (scale);
>       p->pdf_x0 = scale;
>     }
>   else
>     {
>       p->pdf_lognorm = shape * log (scale) - gsl_sf_lngamma (shape);
>       p->pdf_x0 = 0;
>     }
>   return p;
> }
> 
> Then you can define gamma_pdf and gamma_cdf like this:
> 
> double
> gsl_ran_gamma_pdf (const double x, gsl_ran_gamma_t *params)
> {
>   if (x < 0)
>     {
>       return 0;
>     }
>   else if (x == 0)
>     {
>       return params->pdf_x0;
>     }
>   else
>     {
>       double a = params->shape - 1;
>       double logp = params->pdf_lognorm;
>       logp -= params->scale * x;
>       if (a != 0)
>         logp += a * log (x);
>       return exp (logp);
>     }
> }
> 
> double
> gsl_ran_gamma_cdf (const double x, gsl_ran_gamma_t *params)
> {
>   if (x <= 0)
>     {
>       return 0;
>     }
>   else
>     {
>       double y = x * params->scale;
>       if (params->shape == 1)
>         return fabs (gsl_expm1 (-y));
>       else
>         return gsl_sf_gamma_inc_P (params->shape, y);
>     }
> }
> 
> This has the disadvantage of introducing yet another type and
> allocator for each family, and so of course it breaks compatibility.
> But it has several advantages *if you compute many values for the same
> distribution with fixed parameters* (or if you repeatedly sample from
> the same distribution), which is not too uncommon IMO:
> 
>   First, domain checks on the parameters have to be done exactly once
>   in the _alloc() function.  This is an easy way to avoid code
>   duplication, since the same checks would have to be carried out for
>   the pdf, the cdf, and any other functionality that may be added in
>   the future (inverse cdf, mode, etc.).
> 
>   Second, computing the normalization term is typically an expensive
>   operation (e.g. binomial likelihood is easy to compute, but the
>   corresponding Beta density is more costly), but does not depend on
>   the value of x; in general, all values that are independent of x
>   can be precomputed in the _alloc() function and stored in the
>   parameter struct.  In the code above, lngamma(shape) is evaluated
>   exactly once.
> 
>   Third, the functions now have a form in which they could be used
>   directly in conjunction with gsl_function(_fdf), e.g. if a user
>   wanted to use a one-dimensional root finding algorithm to compute
>   inverse cdfs.
> 
>   Fourth, this format would make it easy to incorporate distributions
>   that are simple reparametrizations of existing distributions.  For
>   example, one could now have a function
> 
>     gsl_ran_gamma_t *gsl_ran_chisq_alloc(const double df);
> 
>   and simply not implement gsl_ran_chisq_pdf separately from
>   gsl_ran_gamma_pdf.  (Or if this should be hidden from the user, it'd
>   be easy to generate the illusion of separate functionality with
>   #defines).
> 
> It would even be possible to do this in a somewhat object-oriented way
> and include function pointers into the struct so that one would get a
> a gamma density via:
> 
>   gsl_ran_gamma_t *g = gsl_ran_gamma_alloc (1, 1);
>   double x = g->pdf (x, g);  /* presumably hidden behind a macro */
> 
> Notice that gsl_ran_gamma_alloc could have quietly initialized the pdf
> field with a pointer to a function that computes an exponential
> density (because shape == 1).
> 
> One potentail counterargument against this design might be that it
> makes it more difficult for the user to implement likelihood based
> computations, since the parameters of the distribution would
> presumably change frequently.  I wouldn't consider this a valid
> objection though, since for efficiently computing likelihood for many
> samples the code has to be structured very differently anyway.
> 
> - martin

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2003-01-16 14:05 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-01-10 18:43 CDF diff Martin Jansche
2003-01-14 22:02 ` Brian Gough
2003-01-14 22:32 ` Brian Gough
2003-01-15  3:31 ` Jason Hooper Stover
2003-01-16  2:35   ` Martin Jansche
2003-01-16 14:05     ` Jason Hooper Stover

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).