public inbox for gsl-discuss@sourceware.org
* accuracy of gsl_cdf_binomial_P
@ 2011-06-02  4:50 Z F
  2011-06-05 13:04 ` Well Howell
  0 siblings, 1 reply; 4+ messages in thread
From: Z F @ 2011-06-02  4:50 UTC (permalink / raw)
  To: gsl-discuss

Hello everybody,

I was wondering if someone could comment on the accuracy of the GSL implementation of gsl_cdf_binomial_P() for large n (n is about a few thousand),
for different values of p, when the result of the CDF is in the tails (small, less than 0.05, or large, above 0.95).

Thank you very much

ZF


* Re: accuracy of gsl_cdf_binomial_P
  2011-06-02  4:50 accuracy of gsl_cdf_binomial_P Z F
@ 2011-06-05 13:04 ` Well Howell
  2011-06-06  2:20   ` Z F
  0 siblings, 1 reply; 4+ messages in thread
From: Well Howell @ 2011-06-05 13:04 UTC (permalink / raw)
  To: gsl-discuss

An interesting (but "homework-like" ~;) question - and fun to answer too.

Anyway, I'd probably compare GSL results with those from other sources.

I had easy access to gsl_cdf_binomial_P (v 1.14), R pbinom(k,n,p), binomCDF
(Excel 2007) and dcdflib (Fortran - Brown, Lovato & Russell; U. Texas;
November, 1997).

For a sample size of n=1000, a trial probability of p=0.01 and a number of
successes s=1 thru 40, the CDF values from dcdflib and the R 2.13.0 stats
package pbinom() function (http://cran.r-project.org/) show no difference.

The mean absolute deviations (MAD) over these 40 tests, comparing pbinom with
gsl_cdf_binomial_P and with binomCDF, are 2.319E-15 and 3.296E-15
respectively.
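
A comparison of this kind can be sketched in a few lines. The following is
only an illustration in Python, not the procedure behind the figures quoted
in this message: it assumes an exact big-integer summation as the reference
(standing in for pbinom) and a plain double-precision pmf recurrence as the
implementation under test. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, exp, log1p


def binom_cdf_exact(k, n, p):
    """P(X <= k) summed in exact integer arithmetic, rounded once at the end."""
    q = Fraction(p)                      # exact value of the double p
    a, b = q.numerator, q.denominator    # p == a / b exactly
    num = sum(comb(n, i) * a**i * (b - a)**(n - i) for i in range(k + 1))
    return num / b**n                    # correctly rounded int / int division


def binom_cdf_float(k, n, p):
    """P(X <= k) via the pmf recurrence in ordinary double precision."""
    term = exp(n * log1p(-p))            # pmf at 0; underflows for very large n
    total = term
    for i in range(k):
        term *= (n - i) / (i + 1) * p / (1.0 - p)   # pmf(i+1) from pmf(i)
        total += term
    return total


def mean_abs_dev(n, p, ks):
    """Mean absolute deviation between the two implementations over ks."""
    diffs = [abs(binom_cdf_exact(k, n, p) - binom_cdf_float(k, n, p)) for k in ks]
    return sum(diffs) / len(diffs)


mad = mean_abs_dev(1000, 0.01, range(1, 41))
print(f"MAD over s=1..40: {mad:.3e}")
```

The exact reference is slow but leaves no doubt about which side a
discrepancy comes from; any trusted implementation could take its place.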

My "comment"?  Looks as if we all have to decide when to STOP
accumulating small terms, and some stop earlier than others.  While I always
test functions in Excel against other sources before release in a report,
anything showing a MAD below 4E-15 sure beats using my slide rule
(which didn't have an incomplete beta function anyway ~;).

Well Howell





* Re: accuracy of gsl_cdf_binomial_P
  2011-06-05 13:04 ` Well Howell
@ 2011-06-06  2:20   ` Z F
  2011-06-06 10:48     ` Well Howell
  0 siblings, 1 reply; 4+ messages in thread
From: Z F @ 2011-06-06  2:20 UTC (permalink / raw)
  To: gsl-discuss, well

Dear Well Howell,

Thank you very much for your reply. 
It seems I was not clear with my question. I am not looking for a
comparison with other libraries, but rather for information regarding
the approximations used to obtain the values of CDF. What I am afraid of
is that a Gaussian approximation is used for a large sample, rendering
values in the tails of the distribution error-prone.

If someone could provide any info on the subject, or maybe point me in the "right direction", I would highly appreciate it.
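
To make the concern concrete, here is a minimal Python sketch of how far a
Gaussian approximation can drift from a direct summation in the lower tail.
It says nothing about what GSL does internally. The parameters n=1000 and
p=0.01 are taken from the earlier reply, the continuity correction of 0.5 is
an assumption, and both function names are hypothetical.

```python
from math import erfc, exp, log1p, sqrt


def binom_cdf_sum(k, n, p):
    """P(X <= k) by summing the pmf with a term-to-term recurrence."""
    term = exp(n * log1p(-p))            # pmf at 0; underflows for very large n
    total = term
    for i in range(k):
        term *= (n - i) / (i + 1) * p / (1.0 - p)   # pmf(i+1) from pmf(i)
        total += term
    return total


def binom_cdf_normal(k, n, p):
    """Gaussian approximation to P(X <= k) with a continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1.0 - p))
    z = (k + 0.5 - mu) / sigma
    return 0.5 * erfc(-z / sqrt(2.0))    # standard normal CDF at z


n, p = 1000, 0.01
for k in (0, 2, 5, 10, 20):
    s = binom_cdf_sum(k, n, p)
    g = binom_cdf_normal(k, n, p)
    print(f"k={k:2d}  sum={s:.6e}  normal={g:.6e}  ratio={g / s:.3f}")
```

Near the mean the two agree reasonably well. In the lower tail the ratio
moves well away from 1, which is the failure mode described above.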


Thanks again

ZF



* Re: accuracy of gsl_cdf_binomial_P
  2011-06-06  2:20   ` Z F
@ 2011-06-06 10:48     ` Well Howell
  0 siblings, 0 replies; 4+ messages in thread
From: Well Howell @ 2011-06-06 10:48 UTC (permalink / raw)
  To: gsl-discuss

I do see a testing function (beta_series) that only tries sample sizes
smaller than n=512, but I can't easily find any use of a Gaussian
approximation in the source code for the 1.14 version of GSL.

I don't expect the other sources I tested against to use the Gaussian
approximation either, so my finding that all 4 methods agree within about ten
times the IEEE eps value of 2.2204E-16 would be proof enough for me to NOT
fully read the beta_inc.c source code.
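
For scale, the two MAD figures reported earlier in the thread can be
expressed in units of machine epsilon. A short Python check, assuming IEEE 754
double precision as reported by sys.float_info.epsilon (the dictionary labels
are just illustrative):

```python
import sys

eps = sys.float_info.epsilon             # spacing of doubles just above 1.0

# MAD values quoted earlier in the thread, each relative to R's pbinom
reported_mad = {
    "gsl_cdf_binomial_P": 2.319e-15,
    "binomCDF": 3.296e-15,
}

ratios = {name: mad / eps for name, mad in reported_mad.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} x eps")
```

Both figures come out at roughly ten to fifteen times eps.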

Funny history - I was first asked if I was using the Gaussian approximation
to the binomial in the mid 60's, and was able to answer that I was using the
exact binomial ~;)
