From: Well Howell
Reply-to: well@wheatstone-analytics.com
To: gsl-discuss@sourceware.org
Date: Mon, 06 Jun 2011 10:48:00 -0000
Subject: Re: accuracy of gsl_cdf_binomial_P
Message-id: <4DECB052.6000303@superlink.net>
In-reply-to: <561449.86378.qm@web110509.mail.gq1.yahoo.com>

I do see a testing function (beta_series) that only tries sample sizes
smaller than n=512, but I can't easily find any use of a Gaussian
approximation in the source code for the 1.14 version of GSL.
I don't expect the other sources I tested against to use the Gaussian
approximation either, so my finding that all 4 methods agree to within
about ten times the IEEE double-precision eps value of 2.2204E-16 is
proof enough for me to NOT fully read the beta_inc.c source code.

Funny history - I was first asked if I was using the Gaussian
approximation to the binomial in the mid-60s, and was able to answer
that I was using the exact binomial ~;)

On 6/5/2011 10:19 PM, Z F wrote:
> Dear Well Howell,
>
> --- On Sun, 6/5/11, Well Howell wrote:
>
>> An interesting (but "homework-like" ~;) question - and fun to
>> answer too.
>>
>> Anyway, I'd probably compare GSL results with those from other
>> sources.
>>
>> I had easy access to gsl_cdf_binomial_P (v 1.14), R pbinom(k,n,p),
>> binomCDF (Excel 2007) and dcdflib (Fortran - Brown, Lovato &
>> Russel; U. Texas; November, 1997).
>>
>> For a sample size of n=1000, a trial probability of p=0.01 and
>> number of successes of s=1 thru 40, the CDF values from dcdflib and
>> the R 2.13.0 stats package pbinom() function
>> (http://cran.r-project.org/) show no difference.
>>
> Thank you very much for your reply.
> It seems I was not clear with my question. I am not looking for a
> comparison with other libraries, but rather for information regarding
> the approximations used to obtain the values of the CDF. What I am
> afraid of is that a Gaussian approximation is used for a large sample,
> rendering values in the tails of the distribution error-prone.
>
> If someone could provide any info on the subject or maybe point me in
> the "right direction", I would highly appreciate it.
>
> Thanks again
>
> ZF
>
>> On 6/2/2011 12:49 AM, Z F wrote:
>>> Hello everybody,
>>>
>>> I was wondering if someone could comment on the accuracy of the
>>> GSL implementation of the gsl_cdf_binomial_P() function for large
>>> n (n is about a few thousand), for different values of p, and when
>>> the result of the CDF is in the tails (small, less than 0.05, and
>>> large, above 0.95).
>>>
>>> Thank you very much
>>>
>>> ZF
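
For anyone who wants to check the tail concern raised in this thread without relying on any particular library, here is a minimal sketch in Python. It is not GSL code and says nothing about how gsl_cdf_binomial_P is implemented. It sums the binomial CDF in exact rational arithmetic and sets it beside a Gaussian approximation with continuity correction, using the n=1000, p=0.01 case from the thread. The function names binom_cdf_exact and binom_cdf_normal are made up for this illustration, and the continuity correction is an assumption about which Gaussian variant one would compare against.

```python
from fractions import Fraction
from math import comb, erf, sqrt


def binom_cdf_exact(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed in exact rational arithmetic.

    p should be a Fraction (e.g. Fraction(1, 100)) so that no rounding
    happens until the final conversion to float.
    """
    p = Fraction(p)
    q = 1 - p
    total = sum(comb(n, i) * p**i * q**(n - i) for i in range(k + 1))
    return float(total)


def binom_cdf_normal(k, n, p):
    """Gaussian approximation to P(X <= k), with continuity correction."""
    mean = n * p
    sd = sqrt(n * p * (1.0 - p))
    z = (k + 0.5 - mean) / sd
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


if __name__ == "__main__":
    n = 1000
    p_exact = Fraction(1, 100)   # p = 0.01, held exactly
    p_float = 0.01
    print(f"{'k':>3} {'exact':>14} {'gaussian':>14} {'ratio':>8}")
    for k in (1, 2, 5, 10, 20, 30, 40):
        exact = binom_cdf_exact(k, n, p_exact)
        approx = binom_cdf_normal(k, n, p_float)
        print(f"{k:>3} {exact:14.6e} {approx:14.6e} {approx / exact:8.3f}")
```

The ratio column shows where the two agree (near the mean, n*p = 10) and where they part ways (the lower tail). The exact column can also serve as a reference when checking values returned by gsl_cdf_binomial_P, pbinom or dcdflib for the same k, n and p.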