* accuracy of gsl_cdf_binomial_P
From: Z F @ 2011-06-02 4:50 UTC
To: gsl-discuss

Hello everybody,

I was wondering if someone could comment on the accuracy of the GSL implementation of gsl_cdf_binomial_P() for large n (n is about a few thousand), for different values of p, when the result of the CDF is in the tails (small, less than 0.05, or large, above 0.95).

Thank you very much

ZF
* Re: accuracy of gsl_cdf_binomial_P
From: Well Howell @ 2011-06-05 13:04 UTC
To: gsl-discuss

An interesting (but "homework-like" ~;) question - and fun to answer too.

Anyway, I'd probably compare GSL results with those from other sources.

I had easy access to gsl_cdf_binomial_P (v 1.14), R pbinom(k,n,p), binomCDF (Excel 2007) and dcdflib (Fortran - Brown, Lovato & Russel; U. Texas; November, 1997).

For a sample size of n=1000, a trial probability of p=0.01 and number of successes s=1 thru 40, the CDF values from dcdflib and the R 2.13.0 stats package pbinom() function (http://cran.r-project.org/) show no difference. Mean absolute deviations for these 40 tests, comparing pbinom with gsl_cdf_binomial_P and with binomCDF, show MAD of 2.319E-15 and 3.296E-15 respectively.

My "comment"? Looks as if we all have to decide when to STOP accumulating small terms, and some stop earlier than others. While I always test functions in Excel against other sources before release in a report, anything showing a MAD below 4E-15 sure beats using my slide rule (which didn't have an incomplete beta function anyway ~;).

Well Howell

On 6/2/2011 12:49 AM, Z F wrote:
> Hello everybody,
>
> I was wondering if someone could comment on the accuracy of gsl_cdf_binomial_P() function gsl implementation for large n (n is about a few thousand).
> for different values of p and when the result of cdf is in the tails ( small less then 0.05 and large -- above 0.95)
>
> Thank you very much
>
> ZF
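The cross-check described in this message can be sketched in a few lines. The example below is only an illustration under stated assumptions: it does not call GSL, R, Excel or dcdflib. Instead it evaluates the binomial CDF in two independent ways (both helper functions are hypothetical names written for this sketch) and reports the mean absolute deviation between them for n=1000, p=0.01 and s=1 thru 40, the same grid used above.

```python
import math


def binom_cdf_recurrence(k, n, p):
    """P(X <= k): build pmf terms from pmf(0) using the term-ratio recurrence."""
    term = math.exp(n * math.log1p(-p))  # pmf(0) = (1 - p)^n
    terms = [term]
    for j in range(k):
        # pmf(j + 1) = pmf(j) * (n - j) / (j + 1) * p / (1 - p)
        term *= (n - j) / (j + 1) * p / (1.0 - p)
        terms.append(term)
    return math.fsum(terms)


def binom_cdf_lgamma(k, n, p):
    """P(X <= k): evaluate every pmf term independently in log space."""
    log_p, log_q = math.log(p), math.log1p(-p)
    terms = []
    for j in range(k + 1):
        log_coeff = (math.lgamma(n + 1) - math.lgamma(j + 1)
                     - math.lgamma(n - j + 1))
        terms.append(math.exp(log_coeff + j * log_p + (n - j) * log_q))
    return math.fsum(terms)


def mean_abs_deviation(n, p, ks):
    """Mean absolute difference between the two CDF evaluations over ks."""
    diffs = [abs(binom_cdf_recurrence(k, n, p) - binom_cdf_lgamma(k, n, p))
             for k in ks]
    return math.fsum(diffs) / len(diffs)


if __name__ == "__main__":
    mad = mean_abs_deviation(1000, 0.01, range(1, 41))
    print(f"MAD over s = 1..40: {mad:.3e}")
```

To reproduce the comparison reported in the message, one of the two helpers would be swapped for the library under test; the MAD computation itself stays the same.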
* Re: accuracy of gsl_cdf_binomial_P
From: Z F @ 2011-06-06 2:20 UTC
To: gsl-discuss, well

Dear Well Howell,

--- On Sun, 6/5/11, Well Howell <whowell@superlink.net> wrote:

> An interesting (but "homework-like" ~;) question - and fun to answer too.
>
> Anyway, I'd probably compare GSL results with those from other sources.
>
> I had easy access to gsl_cdf_binomial_P (v 1.14), R pbinom(k,n,p), binomCDF
> (Excel 2007) and dcdflib (Fortran - Brown, Lovato & Russel; U. Texas;
> November, 1997).
>
> For a sample size of n=1000, a trial probability of p=0.01 and number of
> successes of s=1 thru 40, the CDF values from dcdclib and the R 2.13.0
> stats package pbinom() function (http://cran.r-project.org/) show no
> difference.

Thank you very much for your reply. It seems I was not clear with my question. I am not looking for a comparison with other libraries, but rather for information regarding the approximations used to obtain the values of the CDF. What I am afraid of is that a Gaussian approximation is used for a large sample, rendering values in the tails of the distribution error-prone.

If someone could provide any info on the subject or maybe point in the "right direction", I would highly appreciate it.

Thanks again

ZF
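The worry about a Gaussian approximation in the tails can be made concrete with a small sketch. Nothing in this thread says that GSL uses such an approximation; the code below only shows why one would be a problem. It compares an exact summation of the binomial CDF with the normal approximation (continuity-corrected) for n=1000 and p=0.01. Both function names are hypothetical and exist only for this illustration.

```python
import math


def binom_cdf_exact(k, n, p):
    """P(X <= k) by direct summation of pmf terms (term-ratio recurrence)."""
    term = math.exp(n * math.log1p(-p))  # pmf(0) = (1 - p)^n
    terms = [term]
    for j in range(k):
        term *= (n - j) / (j + 1) * p / (1.0 - p)
        terms.append(term)
    return math.fsum(terms)


def binom_cdf_normal(k, n, p):
    """Gaussian approximation to P(X <= k) with a continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    z = (k + 0.5 - mu) / sigma
    return 0.5 * math.erfc(-z / math.sqrt(2.0))  # standard normal CDF at z


if __name__ == "__main__":
    n, p = 1000, 0.01
    for k in (2, 5, 10, 20):
        exact = binom_cdf_exact(k, n, p)
        approx = binom_cdf_normal(k, n, p)
        rel_err = abs(approx - exact) / exact
        print(f"k={k:3d} exact={exact:.6e} normal={approx:.6e} rel.err={rel_err:.3f}")
```

With these inputs the approximation is reasonable near the mean (k=10) but overstates the lower tail at k=2 by well over a factor of two, which is the kind of tail error the question is about.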
* Re: accuracy of gsl_cdf_binomial_P
From: Well Howell @ 2011-06-06 10:48 UTC
To: gsl-discuss

I do see a testing function (beta_series) that only tries sample sizes smaller than n=512, but I can't easily find any use of a gaussian approximation in the source code for the 1.14 version of GSL.

I don't expect some of the other sources I tested against to use the gaussian either, so my finding that all 4 methods agree within about ten times the IEEE eps value of 2.2204E-16 would be proof enough for me to NOT fully read the beta_inc.c source code.

Funny history - I was first asked if I was using the gaussian approximation to the binomial in the mid 60's, and was able to answer that I was using the exact binomial ~;)

On 6/5/2011 10:19 PM, Z F wrote:
> Thank you very much for your reply.
> It seems I was not clear with my question. I am not looking for a
> comparison with other libraries, but rather for information regarding
> the approximations used to obtain the values of CDF. What I am afraid of
> is that a Gaussian approximation is used for a large sample, rendering
> values in the tails of the distribution error-prone.
>
> I someone could provide any info on the subject or maybe point in the "right direction" , I would highly appreciate it.
>
> Thanks again
>
> ZF
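The references to beta_inc.c and to the incomplete beta function point at a standard identity: for X ~ Binomial(n, p), P(X <= k) = I_{1-p}(n-k, k+1), where I_x(a, b) is the regularized incomplete beta function. The sketch below checks that identity numerically for n=1000, p=0.01, k=5. It is not GSL's algorithm: the incomplete beta function is evaluated here with a plain composite Simpson's rule, chosen only because it is short, and all function names are hypothetical.

```python
import math


def binom_cdf_sum(k, n, p):
    """P(X <= k) by direct summation of pmf terms (term-ratio recurrence)."""
    term = math.exp(n * math.log1p(-p))  # pmf(0) = (1 - p)^n
    terms = [term]
    for j in range(k):
        term *= (n - j) / (j + 1) * p / (1.0 - p)
        terms.append(term)
    return math.fsum(terms)


def beta_pdf(t, a, b):
    """Beta(a, b) density; this sketch assumes a > 1 and b > 1."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(t)
                    + (b - 1) * math.log1p(-t))


def reg_inc_beta_simpson(x, a, b, steps=20000):
    """I_x(a, b) by composite Simpson's rule on [0, x]; steps must be even."""
    h = x / steps
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # Simpson weights: 4 on odd nodes, 2 on even
        total += weight * beta_pdf(i * h, a, b)
    return total * h / 3.0


if __name__ == "__main__":
    n, p, k = 1000, 0.01, 5
    via_sum = binom_cdf_sum(k, n, p)
    via_beta = reg_inc_beta_simpson(1.0 - p, n - k, k + 1)
    print(f"direct sum      : {via_sum:.10f}")
    print(f"incomplete beta : {via_beta:.10f}")
```

Simpson's rule limits the agreement here to far fewer digits than the roughly ten-eps agreement reported above; a production implementation would use a dedicated incomplete beta routine instead.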