public inbox for gsl-discuss@sourceware.org
 help / color / mirror / Atom feed
* gsl/randist vs ranlib
@ 2003-04-09 18:03 James Theiler
  2003-04-11  9:31 ` Brian Gough
  0 siblings, 1 reply; 2+ messages in thread
From: James Theiler @ 2003-04-09 18:03 UTC (permalink / raw)
  To: gsl-discuss; +Cc: pearson

Hello,

  It has been pointed out to me by my colleague John Pearson that
several of gsl's randist routines are quite slow.  This became
particularly evident in comparing gsl's binomial generator
(gsl_ran_binomial) to the equivalent routine in "ranlib" -- this is a
very old (though very nice) package that is avaliable through CMU's
StatLib.  It is free software, but not GPL; I won't speculate on the
legalities (or moralities!) that might be entailed, but it says it is
"in the public domain" so we should at least be able to use it for
inspiration. See http://lib.stat.cmu.edu/general/Utexas/ranlib.readme
and don't be put off that it says "Fortran routines"; if you read a
little further you see that they have been translated into C
... still, they are very fortran-esque with goto's and such.

(see also http://www.netlib.org/random/, look for ranlib.c.tar.gz)

  We are talking about a factor of 25x slower for n=1000, p=0.33.
It's not clear to me how much of this is due to the choice of the
algorithm for binomial generator -- it scales like O(log n) but O(1)
algorithms exist[*] -- or some other factors, such as its dependence
on the gamma distribution (via the beta distribution) which also seems
much slower for gsl than it does for ranlib, or the different uniform
random generator that is at the base of both algorithms.  (Aside: the
factor drops to a "mere" 15x if p is changed from run to run; this is
because the ranlib version has extra efficiency which exploits the
situation when multiple calls are made with the same argument.)

  This is something we are going to look into a little closer, but in
the meantime, we are soliciting opinions on this topic: maybe others
have already noticed similar behavior, and have some wisdom to impart.
Also, it is to suggest a new item in gsl's TODO list, which is a more
comprehensive examination of the performance of gsl's randist
algorithms, with an eye to the ranlib implementations either as a
benchmark, or as some code to possibly assimilate.

regards,
jt

[*] One of the references in the ranlib documentation, by the way,
makes a good read: Kachitvichyanukul, V. and Schmeiser, B.  W.
Binomial Random Variate Generation.  Communications of the ACM, 31, 2
(February, 1988) 216.

---------------------------------------------
James Theiler                     jt@lanl.gov
MS-B244, NIS-2, LANL        tel: 505/665-5682
Los Alamos, NM 87545        fax: 505/665-4414
----- Space and Remote Sensing Sciences -----


^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: gsl/randist vs ranlib
  2003-04-09 18:03 gsl/randist vs ranlib James Theiler
@ 2003-04-11  9:31 ` Brian Gough
  0 siblings, 0 replies; 2+ messages in thread
From: Brian Gough @ 2003-04-11  9:31 UTC (permalink / raw)
  To: James Theiler; +Cc: gsl-discuss, pearson

James Theiler writes:
 >   This is something we are going to look into a little closer, but in
 > the meantime, we are soliciting opinions on this topic: maybe others
 > have already noticed similar behavior, and have some wisdom to impart.
 > Also, it is to suggest a new item in gsl's TODO list, which is a more
 > comprehensive examination of the performance of gsl's randist
 > algorithms, with an eye to the ranlib implementations either as a
 > benchmark, or as some code to possibly assimilate.

Thanks for looking at this.  At the time I didn't do any
optimisation for the routines that I wrote, I just used the
algorithms from Knuth.

One point: I think it is best to keep the routines stateless,
as they are now, for simplicity.  Your discrete method with
the Walker preprocessing is the easiest way to handle repeated
sampling for the finite distributions.

Brian

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2003-04-11  9:31 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-04-09 18:03 gsl/randist vs ranlib James Theiler
2003-04-11  9:31 ` Brian Gough

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).