public inbox for gsl-discuss@sourceware.org
* Re: GSL 1.6 random numbers (gauss.c and rng.c)
       [not found] <16887.49847.140489.31885@shearwater.sarnoff.com>
@ 2005-01-26 17:52 ` James Theiler
  2005-01-27 10:00   ` Charles Karney
  0 siblings, 1 reply; 6+ messages in thread
From: James Theiler @ 2005-01-26 17:52 UTC (permalink / raw)
  To: Charles Karney; +Cc: gsl-discuss

Hi Charles,
   And thanks for the comments, which I am cc'ing to the gsl-discuss
list with this email.  To the list: I am recommending that we adopt
his suggestions, though perhaps after thinking a bit about whether
there is an even better solution to the issue he raised in the
gsl_rng_uniform_int routine.

On Wed, 26 Jan 2005, Charles Karney wrote:

] Hi!
] 
] We exchanged some E-mail back in Feb 2003 concerning implementations of
] the Walker algorithm.

And a valuable exchange it was, as I recall; thanks again for those
comments as well.

] 
] Now, I've got some comments on
] 
]     gsl_ran_gaussian_ratio_method
]     gsl_rng_uniform_int
] 
] in GSL 1.6
] 
] (a) [MINOR NIT] In gsl_ran_gaussian_ratio_method, the last digit of the
]     constant for sqrt(8/e) is wrong.  Worse still, the number you have
]     is too small.  Rounding the result UP still gives the correct
]     results, however (at the expense of a few extra rejects).  So you
]     might replace
] 
]       x = 1.71552776992141359295 * (v - 0.5) / u;
]     by
]       x = 1.715527769921413593 * (v - 0.5) / u;
]     or
]       x = 1.71552777 * (v - 0.5) / u;
] 
]     In my implementation I round up after additional digits:
] 
]       1.71552776992141359296037928256

Hmmm, if x were used solely as the rejection criterion, then I would
be inclined to agree that there is no harm to accuracy (and utterly
negligible harm to efficiency) to use a rounded-up value for the
constant.

But it's a little more complicated since x is also used as part of the
returned variate.  You may still be right that it doesn't matter,
maybe everything just scales up, but it's not obvious to me.

(Ok, on further thought, I've decided it's not obvious to me that
rounding up is valid even at just the rejection stage ... but this 
is undoubtedly just a failing on my part; I'm not saying you are 
wrong, just that it's not obvious to me that rounding up is 
harmless.)

However, I just used 'bc' to verify your constant, and I agree with you
on that; the very least we should do is change that trailing '295' to
'296'.  I'd be happy to recommend changing that '295' to just '3' --
it's still bound to be adequately precise and it does round up.  

But what we should do is follow your practice, and use the still more
precise number.  No harm to us in being more precise, and if the
round-up is a conceptually better thing, then all the better.

] 
] (b) [EFFICIENCY] Leva gives quadratic tests to perform in
]     gsl_ran_gaussian_ratio_method which result in log being evaluated
]     0.012 per normal deviate instead of 1.369 times.  (Knuth's tests
]     result in 0.232 logs/deviate).  Depending on the relative speed of
]     log vs other arithmetic operations, this might speed up
]     gsl_ran_gaussian_ratio_method significantly.  The reference is
] 
]       J. L. Leva, A Fast Normal Random Number Generator,
]       ACM TOMS 18, 449-453 and 454-455 (1992).

Worth looking into. In my experience, logs are pretty fast on machines
I use (eg, Pentiums), so that the actual gains in reducing their
number are often not as substantial as one might hope.  But I haven't
done any benchmarks lately, and besides, if this is a better
algorithm, we should at least look into it.
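
For anyone who wants to check this on their own hardware, a rough
timing harness along the following lines would do.  This is just a
sketch of mine (not anything in GSL); the iteration count and the
choice of generator are arbitrary, and the checksum only keeps the
calls from being optimized away.

    #include <stdio.h>
    #include <time.h>
    #include <gsl/gsl_rng.h>
    #include <gsl/gsl_randist.h>

    int
    main (void)
    {
      const int N = 10000000;
      gsl_rng *r;
      double sum = 0.0;
      clock_t t0, t1, t2;
      int i;

      gsl_rng_env_setup ();
      r = gsl_rng_alloc (gsl_rng_default);

      t0 = clock ();
      for (i = 0; i < N; i++)
        sum += gsl_ran_gaussian (r, 1.0);              /* polar method in 1.6 */
      t1 = clock ();
      for (i = 0; i < N; i++)
        sum += gsl_ran_gaussian_ratio_method (r, 1.0); /* ratio method */
      t2 = clock ();

      printf ("polar %.0f ns/call, ratio %.0f ns/call (checksum %g)\n",
              1e9 * (double) (t1 - t0) / CLOCKS_PER_SEC / N,
              1e9 * (double) (t2 - t1) / CLOCKS_PER_SEC / N, sum);

      gsl_rng_free (r);
      return 0;
    }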

] 
] (c) [GENERALIZE?] In gsl_rng_uniform_int you quit if n > range.  However
]     within that test you could allow the case n-1 == range.  Thus
] 
]       if (n > range) 
] 	{
] 	  if (n - 1 == range)
] 	    return ((r->type->get) (r->state)) - offset;
] 	  GSL_ERROR_VAL ("n exceeds maximum value of generator",
] 			    GSL_EINVAL, 0) ;
] 	}
] 
]     (Of course if range = maxint - 1, n would not be representable...)
] 

I have to admit, I haven't looked at this particular function before.  
It is confusing to me; it looks like range is defined in a funny way;
eg if min=0, max=99, shouldn't the range be 100?  Or,
if we are going to define range=max-min, then we
should define the scale differently:

      unsigned long int scale = (range + 1)/n;

where the "+ 1" is added to what we have now.  Then the rest proceeds
correctly and even more efficiently.

...except when r->type->max is the maximum unsigned long (ie,
0xffffffffUL), which is the case for a lot of the generators.  Then
the "+ 1" causes an overflow.  

So the current algorithm does give correct values; it just isn't as
efficient as it could be (if there were a nice way to deal with this
overflow).  eg, if n=(range+1)/2 -- say, n=50 in the case above where
min=0 and max=99 -- then the do/while loop is executed twice as often
as is actually necessary.

However, as Charles points out, the current algorithm isn't quite
correct when n-1==range.  His fix solves that problem.

Another alternative would be something like
    scale = ((range+1) > range ? range+1 : range)/n;
except I'm not really sure it would work, and yes, it is ugly.
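
For the record, here is a sketch (mine, not proposed GSL code) of how
the "+ 1" could be folded into the quotient without ever forming
range+1, so that range == ULONG_MAX can't overflow.  The identifiers
only loosely follow rng.c; get() stands in for (r->type->get)(r->state).

    static unsigned long int
    uniform_int_sketch (unsigned long int (*get) (void),
                        unsigned long int min, unsigned long int max,
                        unsigned long int n)
    {
      unsigned long int offset = min;
      unsigned long int range = max - min;  /* max - min, as in rng.c */
      unsigned long int scale, k;

      if (n == 0 || (n > range && n - 1 != range))
        return 0;                    /* error; rng.c calls GSL_ERROR_VAL here */
      if (n == 1)
        return 0;                    /* trivial case; also keeps scale below
                                        from becoming range + 1 */

      /* scale = (range + 1) / n without overflow: the "+ 1" bumps the
         quotient exactly when range % n == n - 1. */
      scale = range / n + (range % n == n - 1 ? 1UL : 0UL);

      do
        {
          k = (get () - offset) / scale;  /* each k < n now gets exactly
                                             `scale` raw values */
        }
      while (k >= n);

      return k;
    }

Note that when n-1 == range this gives scale = 1 and the loop accepts
on the first draw, so Charles's special case falls out automatically.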

jt

-- 
James Theiler            Space and Remote Sensing Sciences
MS-B244, ISR-2, LANL     Los Alamos National Laboratory
Los Alamos, NM 87545     http://nis-www.lanl.gov/~jt



* Re: GSL 1.6 random numbers (gauss.c and rng.c)
  2005-01-27 10:00   ` Charles Karney
@ 2005-01-26 20:54     ` James Theiler
  2005-01-27 10:00       ` GSL 1.6: patch for gauss.c Charles Karney
  0 siblings, 1 reply; 6+ messages in thread
From: James Theiler @ 2005-01-26 20:54 UTC (permalink / raw)
  To: Charles Karney; +Cc: gsl-discuss

On Wed, 26 Jan 2005, Charles Karney wrote:

]  > From: James Theiler <jt@lanl.gov>
]  > Subject: Re: GSL 1.6 random numbers (gauss.c and rng.c)
]  > Date: Wed, 26 Jan 2005 10:52:03 -0700 (MST)
]  > 
]  > ] Now, I've got some comments on
]  > ] 
]  > ]     gsl_ran_gaussian_ratio_method
]  > ]     gsl_rng_uniform_int
]  > ] 
]  > ] in GSL 1.6
]  > ] 
]  > ] (a) [MINOR NIT] In gsl_ran_gaussian_ratio_method, the last digit of the
]  > ]     constant for sqrt(8/e) is wrong.  Worse still, the number you have
]  > ]     is too small.  Rounding the result UP still gives the correct
]  > ]     results, however (at the expense of a few extra rejects).  So you
]  > ]     might replace
]  > ] 
]  > ]       x = 1.71552776992141359295 * (v - 0.5) / u;
]  > ]     by
]  > ]       x = 1.715527769921413593 * (v - 0.5) / u;
]  > ]     or
]  > ]       x = 1.71552777 * (v - 0.5) / u;
]  > ] 
]  > ]     In my implementation I round up after additional digits:
]  > ] 
]  > ]       1.71552776992141359296037928256
]  > 
]  > Hmmm, if x were used solely as the rejection criterion, then I would
]  > be inclined to agree that there is no harm to accuracy (and utterly
]  > negligible harm to efficiency) to use a rounded-up value for the
]  > constant.
]  > 
]  > But it's a little more complicated since x is also used as part of the
]  > returned variate.  You may still be right that it doesn't matter,
]  > maybe everything just scales up, but it's not obvious to me.
] 
] If you check the description of the algorithm in Knuth or the original
] paper, you will see that you are just trying to sample a point uniformly
] in a funny shaped region.  This is done by choosing a point in a
] rectangle enclosing the region and rejecting the point unless it's
] inside the region.  The rectangle [0,1] x [-sqrt(2/e), sqrt(2/e)] snugly
] encloses the region.  However [0,1] x [-1,1] would also "work" at some
] cost in efficiency and precision.  Leva uses 1.7156 for his multiplier.

Thanks.  With that explanation, it _is_ obvious.

And on those grounds, I like using 1.7156; someday you might care
about statistical precision to one part in 10^16, but you'll never
care about efficiency differences of one part in ten thousand.

] 
]  > ] (b) [EFFICIENCY] Leva gives quadratic tests to perform in
]  > ]     gsl_ran_gaussian_ratio_method which result in log being evaluated
]  > ]     0.012 per normal deviate instead of 1.369 times.  (Knuth's tests
]  > ]     result in 0.232 logs/deviate).  Depending on the relative speed of
]  > ]     log vs other arithmetic operations, this might speed up
]  > ]     gsl_ran_gaussian_ratio_method significantly.  The reference is
]  > ] 
]  > ]       J. L. Leva, A Fast Normal Random Number Generator,
]  > ]       ACM TOMS 18, 449-453 and 454-455 (1992).
]  > 
]  > Worth looking into. In my experience, logs are pretty fast on machines
]  > I use (eg, Pentiums), so that the actual gains in reducing their
]  > number are often not as substantial as one might hope.  But I haven't
]  > done any benchmarks lately, and besides, if this is a better
]  > algorithm, we should at least look into it.
] 
] I had mixed results on a Pentium with gcc.  My previous implementation
] used Knuth's bounds.  Compiling with -ffast-math, Leva gives a 2.5%
] speedup.  Without -ffast-math, I got an 18% speedup.  (As I recall, the
] Knuthian bounds gave a 25% speedup over the raw ratio method.)
] 
] Note that the coefficients in Leva's quadratic bounds are only given to
] a few sig. figs.  This is OK -- see the discussion above about the
] sqrt(8/e) const.  The only important thing is that the FINAL exit test x
] * x <= -4.0 * log (u) is accurate.
] 
] One other comment on both gsl_ran_gaussian_ratio_method and
] gsl_ran_gaussian.  I think you would be better off using
] gsl_rng_uniform_pos consistently in both routines.  This has the
] property that (gsl_rng_uniform_pos - 0.5) is symmetrically distributed
] about zero.  With gsl_rng_uniform, the gaussian routines return negative
] numbers slightly more often than positive ones.  (A tiny effect -- but
] better to fix it now than to worry about simulation results 5 years down
] the road.)

Good point; I agree.

] 
]  > ] (c) [GENERALIZE?] In gsl_rng_uniform_int you quit if n > range.  However
]  > ]     within that test you could allow the case n-1 == range.  Thus
]  > ] 
]  > ]       if (n > range) 
]  > ] 	{
]  > ] 	  if (n - 1 == range)
]  > ] 	    return ((r->type->get) (r->state)) - offset;
]  > ] 	  GSL_ERROR_VAL ("n exceeds maximum value of generator",
]  > ] 			    GSL_EINVAL, 0) ;
]  > ] 	}
]  > ] 
]  > ]     (Of course if range = maxint - 1, n would not be representable...)
]  > ] 
]  > 
]  > I have to admit, I haven't looked at this particular function before.  
]  > It is confusing to me; it looks like range is defined in a funny way;
]  > eg if min=0, max=99, shouldn't the range be 100 ? Or, 
]  > if we are going to define range=max-min, then we 
]  > should define the scale differently
]  > 
]  >       unsigned long int scale = (range + 1)/n;
]  > 
]  > where the "+ 1" is added to what we have now.  Then the rest proceeds
]  > correctly and even more efficiently.
]  > 
]  > ...except when r->type->max is the maximum unsigned long (ie,
]  > 0xffffffffUL), which is the case for a lot of the generators.  Then
]  > the "+ 1" causes an overflow.  
] 
] I suspect that this was merely to avoid overflow.  Here's my independent
] C++ implementation of this functionality, where I encountered the same
] problem.  The comments tell the story...
] 
] Here MM = 2^30 and UInt30() is Knuth's recommended RNG (returning a
] result in [0, MM))
] 
] // A random number in [0, 2^32).
] unsigned long Random::Integer() {
]     // xor 2 30-bit numbers to give 32-bit result
]     return (UInt30() << 2) ^ UInt30();
] }
] 
] // Return random number in [0,n)
] unsigned long Random::Integer(const unsigned long n) throw() {
]     if (n == 0)
]         // Special case, equivalent to Integer(2^32).  (Otherwise n == 0
]         // is meaningless.)
]         return Integer();
]     else if (n == 1)
]         // Saves using up a random number.
]         return 0;
]     else {
]         // Do we need 32-bit randoms as our raw material?
]         const bool big = n > MM;
]         // The maximum random that we need.  We'd like the first number
]         // to be 2^32, but we can't deal with such large numbers.  The
]         // following will give the correct results at a slight cost in
]         // efficiency.  (The case n = 2^31 will need two trips through
]         // the loop instead of one.)
]         const unsigned long q = big ? 0xffffffffUL : MM;
]         const unsigned long r = q / n; // r > 0
]         const unsigned long m = r * n; // 0 < m <= q
]         unsigned long u;        // Find a random number in [0,m).
]         do {
]             // For small n, this is executed once (since m is nearly q).
]             // Worst cases are n = MM/2 + 1 and 2*MM, in which case the
]             // loop is executed slightly less than twice on average.
]             u = big ? Integer() : UInt30();
]         } while (u >= m);
]         // Since m is r * n, u/r is unbiased random number in [0,n).  An
]         // unbiased alternative is to use u % n.  However, division is
]         // preferred, as it uses the "more random" leading bits in u.
]         // This also makes Bool() and Integer(2) equivalent.
]         return u / r;
]     }
] }

Nicely written.
I like the use of 0 to indicate 2^32.


] 
]  > 
]  > -- 
]  > James Theiler            Space and Remote Sensing Sciences
]  > MS-B244, ISR-2, LANL     Los Alamos National Laboratory
]  > Los Alamos, NM 87545     http://nis-www.lanl.gov/~jt
] 
] One final remark.  knuthran.c is out of date!  See
] 
]     http://www-cs-faculty.stanford.edu/~knuth/news02.html#rng
]     http://www-cs-faculty.stanford.edu/~knuth/programs/rng.c
] 
] There are two significant changes:
] 
] (1) The generator is "warmed up" more thoroughly.  (This avoids problems
]     with correlations between random streams with adjacent seeds.)
] 
] (2) Only 100 out of every 1009 random numbers are used.  (This is needed
]     so that the "birthday test" is satisfied.)
] 
] 


This is good to know.

jt

-- 
James Theiler            Space and Remote Sensing Sciences
MS-B244, ISR-2, LANL     Los Alamos National Laboratory
Los Alamos, NM 87545     http://nis-www.lanl.gov/~jt






* Re: GSL 1.6 random numbers (gauss.c and rng.c)
  2005-01-26 17:52 ` GSL 1.6 random numbers (gauss.c and rng.c) James Theiler
@ 2005-01-27 10:00   ` Charles Karney
  2005-01-26 20:54     ` James Theiler
  0 siblings, 1 reply; 6+ messages in thread
From: Charles Karney @ 2005-01-27 10:00 UTC (permalink / raw)
  To: James Theiler; +Cc: Charles Karney, gsl-discuss

 > From: James Theiler <jt@lanl.gov>
 > Subject: Re: GSL 1.6 random numbers (gauss.c and rng.c)
 > Date: Wed, 26 Jan 2005 10:52:03 -0700 (MST)
 > 
 > ] Now, I've got some comments on
 > ] 
 > ]     gsl_ran_gaussian_ratio_method
 > ]     gsl_rng_uniform_int
 > ] 
 > ] in GSL 1.6
 > ] 
 > ] (a) [MINOR NIT] In gsl_ran_gaussian_ratio_method, the last digit of the
 > ]     constant for sqrt(8/e) is wrong.  Worse still, the number you have
 > ]     is too small.  Rounding the result UP still gives the correct
 > ]     results, however (at the expense of a few extra rejects).  So you
 > ]     might replace
 > ] 
 > ]       x = 1.71552776992141359295 * (v - 0.5) / u;
 > ]     by
 > ]       x = 1.715527769921413593 * (v - 0.5) / u;
 > ]     or
 > ]       x = 1.71552777 * (v - 0.5) / u;
 > ] 
 > ]     In my implementation I round up after additional digits:
 > ] 
 > ]       1.71552776992141359296037928256
 > 
 > Hmmm, if x were used solely as the rejection criterion, then I would
 > be inclined to agree that there is no harm to accuracy (and utterly
 > negligible harm to efficiency) to use a rounded-up value for the
 > constant.
 > 
 > But it's a little more complicated since x is also used as part of the
 > returned variate.  You may still be right that it doesn't matter,
 > maybe everything just scales up, but it's not obvious to me.

If you check the description of the algorithm in Knuth or the original
paper, you will see that you are just trying to sample a point uniformly
in a funny shaped region.  This is done by choosing a point in a
rectangle enclosing the region and rejecting the point unless it's
inside the region.  The rectangle [0,1] x [-sqrt(2/e), sqrt(2/e)] snugly
encloses the region.  However [0,1] x [-1,1] would also "work" at some
cost in efficiency and precision.  Leva uses 1.7156 for his multiplier.
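
To make that picture concrete, here is a bare-bones sketch (mine, not
GSL's code) of the method as just described: draw (u, w) uniformly in
the snug rectangle, keep the point only if it lies in the region
w^2 <= -4 u^2 log(u), and return the slope w/u.  uniform01() is only a
crude stand-in for a proper U(0,1) generator that never returns 0.

    #include <math.h>
    #include <stdlib.h>

    /* crude stand-in: uniform on (0,1), never exactly 0 */
    static double
    uniform01 (void)
    {
      return (rand () + 1.0) / (RAND_MAX + 2.0);
    }

    double
    ratio_method_sketch (void)
    {
      double u, w;
      do
        {
          u = uniform01 ();
          /* 1.71552777 is sqrt(8/e) rounded up: the rectangle is a hair
             too wide, which only costs a few extra rejections. */
          w = 1.71552777 * (uniform01 () - 0.5);
        }
      while (w * w > -4.0 * u * u * log (u));  /* reject points outside the region */

      return w / u;   /* the slope of the accepted point is the normal deviate */
    }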

 > ] (b) [EFFICIENCY] Leva gives quadratic tests to perform in
 > ]     gsl_ran_gaussian_ratio_method which result in log being evaluated
 > ]     0.012 per normal deviate instead of 1.369 times.  (Knuth's tests
 > ]     result in 0.232 logs/deviate).  Depending on the relative speed of
 > ]     log vs other arithmetic operations, this might speed up
 > ]     gsl_ran_gaussian_ratio_method significantly.  The reference is
 > ] 
 > ]       J. L. Leva, A Fast Normal Random Number Generator,
 > ]       ACM TOMS 18, 449-453 and 454-455 (1992).
 > 
 > Worth looking into. In my experience, logs are pretty fast on machines
 > I use (eg, Pentiums), so that the actual gains in reducing their
 > number are often not as substantial as one might hope.  But I haven't
 > done any benchmarks lately, and besides, if this is a better
 > algorithm, we should at least look into it.

I had mixed results on a Pentium with gcc.  My previous implementation
used Knuth's bounds.  Compiling with -ffast-math, Leva gives a 2.5%
speedup.  Without -ffast-math, I got an 18% speedup.  (As I recall, the
Knuthian bounds gave a 25% speedup over the raw ratio method.)

Note that the coefficients in Leva's quadratic bounds are only given to
a few sig. figs.  This is OK -- see the discussion above about the
sqrt(8/e) const.  The only important thing is that the FINAL exit test x
* x <= -4.0 * log (u) is accurate.

One other comment on both gsl_ran_gaussian_ratio_method and
gsl_ran_gaussian.  I think you would be better off using
gsl_rng_uniform_pos consistently in both routines.  This has the
property that (gsl_rng_uniform_pos - 0.5) is symmetrically distributed
about zero.  With gsl_rng_uniform, the gaussian routines return negative
numbers slightly more often than positive ones.  (A tiny effect -- but
better to fix it now than to worry about simulation results 5 years down
the road.)
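
A minimal illustration of the point (my sketch, using the standard GSL
rng interface):

    #include <stdio.h>
    #include <gsl/gsl_rng.h>

    int
    main (void)
    {
      gsl_rng *r;
      gsl_rng_env_setup ();
      r = gsl_rng_alloc (gsl_rng_default);

      /* gsl_rng_uniform(r) lies in [0,1), so u - 0.5 lies in [-0.5, 0.5):
         -0.5 can occur but +0.5 cannot -- a (tiny) tilt toward negative. */
      printf ("%g\n", gsl_rng_uniform (r) - 0.5);

      /* gsl_rng_uniform_pos(r) lies in (0,1), so u - 0.5 lies in
         (-0.5, 0.5), symmetric about zero. */
      printf ("%g\n", gsl_rng_uniform_pos (r) - 0.5);

      gsl_rng_free (r);
      return 0;
    }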

 > ] (c) [GENERALIZE?] In gsl_rng_uniform_int you quit if n > range.  However
 > ]     within that test you could allow the case n-1 == range.  Thus
 > ] 
 > ]       if (n > range) 
 > ] 	{
 > ] 	  if (n - 1 == range)
 > ] 	    return ((r->type->get) (r->state)) - offset;
 > ] 	  GSL_ERROR_VAL ("n exceeds maximum value of generator",
 > ] 			    GSL_EINVAL, 0) ;
 > ] 	}
 > ] 
 > ]     (Of course if range = maxint - 1, n would not be representable...)
 > ] 
 > 
 > I have to admit, I haven't looked at this particular function before.  
 > It is confusing to me; it looks like range is defined in a funny way;
 > eg if min=0, max=99, shouldn't the range be 100 ? Or, 
 > if we are going to define range=max-min, then we 
 > should define the scale differently
 > 
 >       unsigned long int scale = (range + 1)/n;
 > 
 > where the "+ 1" is added to what we have now.  Then the rest proceeds
 > correctly and even more efficiently.
 > 
 > ...except when r->type->max is the maximum unsigned long (ie,
 > 0xffffffffUL), which is the case for a lot of the generators.  Then
 > the "+ 1" causes an overflow.  

I suspect that this was merely to avoid overflow.  Here's my independent
C++ implementation of this functionality, where I encountered the same
problem.  The comments tell the story...

Here MM = 2^30 and UInt30() is Knuth's recommended RNG (returning a
result in [0, MM))

// A random number in [0, 2^32).
unsigned long Random::Integer() {
    // xor 2 30-bit numbers to give 32-bit result
    return (UInt30() << 2) ^ UInt30();
}

// Return random number in [0,n)
unsigned long Random::Integer(const unsigned long n) throw() {
    if (n == 0)
        // Special case, equivalent to Integer(2^32).  (Otherwise n == 0
        // is meaningless.)
        return Integer();
    else if (n == 1)
        // Saves using up a random number.
        return 0;
    else {
        // Do we need 32-bit randoms as our raw material?
        const bool big = n > MM;
        // The maximum random that we need.  We'd like the first number
        // to be 2^32, but we can't deal with such large numbers.  The
        // following will give the correct results at a slight cost in
        // efficiency.  (The case n = 2^31 will need two trips through
        // the loop instead of one.)
        const unsigned long q = big ? 0xffffffffUL : MM;
        const unsigned long r = q / n; // r > 0
        const unsigned long m = r * n; // 0 < m <= q
        unsigned long u;        // Find a random number in [0,m).
        do {
            // For small n, this is executed once (since m is nearly q).
            // Worst cases are n = MM/2 + 1 and 2*MM, in which case the
            // loop is executed slightly less than twice on average.
            u = big ? Integer() : UInt30();
        } while (u >= m);
        // Since m is r * n, u/r is unbiased random number in [0,n).  An
        // unbiased alternative is to use u % n.  However, division is
        // preferred, as it uses the "more random" leading bits in u.
        // This also makes Bool() and Integer(2) equivalent.
        return u / r;
    }
}

 > 
 > -- 
 > James Theiler            Space and Remote Sensing Sciences
 > MS-B244, ISR-2, LANL     Los Alamos National Laboratory
 > Los Alamos, NM 87545     http://nis-www.lanl.gov/~jt

One final remark.  knuthran.c is out of date!  See

    http://www-cs-faculty.stanford.edu/~knuth/news02.html#rng
    http://www-cs-faculty.stanford.edu/~knuth/programs/rng.c

There are two significant changes:

(1) The generator is "warmed up" more thoroughly.  (This avoids problems
    with correlations between random streams with adjacent seeds.)

(2) Only 100 out of every 1009 random numbers are used.  (This is needed
    so that the "birthday test" is satisfied.)

-- 
Charles Karney <ckarney@sarnoff.com>
Sarnoff Corporation, Princeton, NJ 08543-5300

URL: http://charles.karney.info
Tel: +1 609 734 2312
Fax: +1 609 734 2323


* GSL 1.6: patch for gauss.c -- CORRECTION + TIMING
  2005-01-27 10:00       ` GSL 1.6: patch for gauss.c Charles Karney
@ 2005-01-27 10:00         ` Charles Karney
  2005-02-06  8:04           ` GSL 1.6: updated patch for gauss.c Charles Karney
  0 siblings, 1 reply; 6+ messages in thread
From: Charles Karney @ 2005-01-27 10:00 UTC (permalink / raw)
  To: James Theiler; +Cc: gsl-discuss

The patch to randist/gauss.c that I sent out at

    Date: Wed, 26 Jan 2005 17:29:26 -0500

has a fatal C++'ism creeping in.  Please replace

      y = abs(v) - t;

by

      y = fabs(v) - t;

Sorry about that!
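
For anyone wondering why this is fatal: in C, abs() is the integer
function, so the double argument is silently converted to int before
the absolute value is taken, and with |v| < 1 that means abs(v) is
always 0 and y no longer depends on v at all.  A tiny sketch (mine,
not part of the patch):

    #include <stdio.h>
    #include <stdlib.h>   /* abs()  : int -> int       */
    #include <math.h>     /* fabs() : double -> double */

    int
    main (void)
    {
      double v = -0.7, t = -0.386595;
      /* abs() converts v to int first, so abs(-0.7) becomes abs(0) == 0
         (most compilers warn about this). */
      printf ("with abs : y = %g\n", abs (v) - t);    /* 0.386595, always  */
      printf ("with fabs: y = %g\n", fabs (v) - t);   /* 1.086595, correct */
      return 0;
    }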

I did a timing test using the default basic RNG with gcc -O2
optimization on a Linux Pentium 866 MHz machine.  The results are:

    GSL polar method: 640 ns
    Old ratio method: 630 ns
    New ratio method: 380 ns

So Leva's bounds give a nice speedup of about 1.6x over both the raw
ratio method and the polar method.  Perhaps gsl_ran_gaussian_ratio_method
should be the default routine for normal deviates?

A postscript on knuthran.  With Knuth's latest recommendations, knuthran
passes all the "Diehard" tests and so qualifies as a major league RNG
(?).  (This is the raw RNG that I use.)

-- 
Charles Karney <ckarney@sarnoff.com>
Sarnoff Corporation, Princeton, NJ 08543-5300

URL: http://charles.karney.info
Tel: +1 609 734 2312
Fax: +1 609 734 2323


* GSL 1.6: patch for gauss.c
  2005-01-26 20:54     ` James Theiler
@ 2005-01-27 10:00       ` Charles Karney
  2005-01-27 10:00         ` GSL 1.6: patch for gauss.c -- CORRECTION + TIMING Charles Karney
  0 siblings, 1 reply; 6+ messages in thread
From: Charles Karney @ 2005-01-27 10:00 UTC (permalink / raw)
  To: James Theiler; +Cc: gsl-discuss

OK, here's a patch to gauss.c (v 1.6) to implement Leva's bounds on the
ratio method for normal deviates.  It also uses gsl_rng_uniform_pos for
both Gaussian methods to ensure that the resulting normal deviates are
symmetric.

I checked Leva's paper on the timing...

    Polar:                333
    Ratio:                411
    Ratio+Knuth's bounds: 197
    Ratio+Leva's bounds:  166
    Polar (GSL):          666

These are times (in microseconds) per normal deviate on a 1992-era
computer, and I've extrapolated to Polar (GSL), which only uses one of
the pair of normal deviates that the polar method produces (hence
twice the Polar figure).

Because modern computers implement log pretty efficiently, the benefits
of using the bounds are not as great as shown here.  But I think you
will find that the ratio method should comfortably beat the polar
method.

--- randist/gauss.c.orig	2003-07-25 11:18:13.000000000 -0400
+++ randist/gauss.c	2005-01-26 17:25:26.000000000 -0500
@@ -28,8 +28,8 @@
  * deviates.  We don't produce two, because then we'd have to save one
  * in a static variable for the next call, and that would screws up
  * re-entrant or threaded code, so we only produce one.  This makes
- * the Ratio method suddenly more appealing.  There are further tests
- * one can make if the log() is slow.  See Knuth for details */
+ * the Ratio method suddenly more appealing.  We include Leva's tests
+ * in the Ratio method to avoid computing the log() very often. */
 
 /* Both methods pass the statistical tests; but the polar method
  * seems to be a touch faster on my home Pentium, EVEN though we
@@ -47,8 +47,9 @@
     {
       /* choose x,y in uniform square (-1,-1) to (+1,+1) */
 
-      x = -1 + 2 * gsl_rng_uniform (r);
-      y = -1 + 2 * gsl_rng_uniform (r);
+      /* Use gsl_rng_uniform_pos so that square is centered on origin. */
+      x = -1 + 2 * gsl_rng_uniform_pos (r);
+      y = -1 + 2 * gsl_rng_uniform_pos (r);
 
       /* see if it is in the unit circle */
       r2 = x * x + y * y;
@@ -61,26 +62,38 @@
 
 /* Ratio method (Kinderman-Monahan); see Knuth v2, 3rd ed, p130 */
 /* K+M, ACM Trans Math Software 3 (1977) 257-260. */
+/* Leva, ACM Trans Math Software 18 (1992) 449-453 and 454-455. */
+/* This is a straightforward conversion of Leva's Fortran code. */
 
 double
 gsl_ran_gaussian_ratio_method (const gsl_rng * r, const double sigma)
 {
-  double u, v, x;
+  double u, v, x, y, Q;
+  const double s =  0.449871;
+  const double t = -0.386595;
+  const double a =  0.19600;
+  const double b =  0.25472;
+  const double r1 = 0.27597;
+  const double r2 = 0.27846;
 
-  do
+  do                 /* This loop is executed 1.369 times on average  */
     {
-      v = gsl_rng_uniform (r);
-      do
-        {
-          u = gsl_rng_uniform (r);
-        }
-      while (u == 0);
-      /* Const 1.715... = sqrt(8/e) */
-      x = 1.71552776992141359295 * (v - 0.5) / u;
+      /* Generate a point P = (u, v) uniform in a rectangle */
+      u = gsl_rng_uniform_pos (r); /* NB avoid u = 0 */
+      /* Const 1.7156 > sqrt(8/e) -- but not by too much */
+      v = 1.7156 * (gsl_rng_uniform_pos (r) - 0.5); /* v is centered */
+      x = u - s;
+      y = abs(v) - t;
+      /* Compute quadratic form Q */
+      Q = x*x + y * (a*y - b*x);
     }
-  while (x * x > -4.0 * log (u));
+  while ( Q >= r1 &&            /* accept P if Q < r1 */
+          ( Q > r2 ||           /* reject P if Q > r2 */
+            /* Accept if v^2 <= -4 u^2 log(u) */
+            /* This final test is executed 0.012 times on average. */
+            v*v > -4 * u*u * log(u) ) );
 
-  return sigma * x;
+  return sigma * v/u;           /* Return slope */
 }
 
 double

-- 
Charles Karney <ckarney@sarnoff.com>
Sarnoff Corporation, Princeton, NJ 08543-5300

URL: http://charles.karney.info
Tel: +1 609 734 2312
Fax: +1 609 734 2323


* GSL 1.6: updated patch for gauss.c
  2005-01-27 10:00         ` GSL 1.6: patch for gauss.c -- CORRECTION + TIMING Charles Karney
@ 2005-02-06  8:04           ` Charles Karney
  0 siblings, 0 replies; 6+ messages in thread
From: Charles Karney @ 2005-02-06  8:04 UTC (permalink / raw)
  To: gsl-discuss; +Cc: James Theiler

Here's an updated patch for gauss.c (GSL 1.6).  It includes the
information on the timing, corrects the abs/fabs bug, and reverts to
gsl_rng_uniform for the "v" coordinate.  I had earlier argued that
having v in [-0.5, 0.5) resulted in an asymmetric result.  This is
bogus.  The endpoint v=-0.5 gets rejected in the while clause, so
there's no need to add the penalty of calling gsl_rng_uniform_pos.
(Note, however, that this argument does NOT extend to the polar method.
There you really do need to avoid the edges of the square to guarantee
that the result is symmetric about zero.)  Finally, I use
1-gsl_rng_uniform for u, since I presume that this will be more
efficient than calling gsl_rng_uniform_pos.

--- randist/gauss.c.orig	2003-07-25 11:18:13.000000000 -0400
+++ randist/gauss.c	2005-02-03 10:33:42.000000000 -0500
@@ -28,12 +28,17 @@
  * deviates.  We don't produce two, because then we'd have to save one
  * in a static variable for the next call, and that would screws up
  * re-entrant or threaded code, so we only produce one.  This makes
- * the Ratio method suddenly more appealing.  There are further tests
- * one can make if the log() is slow.  See Knuth for details */
-
-/* Both methods pass the statistical tests; but the polar method
- * seems to be a touch faster on my home Pentium, EVEN though we
- * are only using half of the available random deviates!
+ * the Ratio method suddenly more appealing.
+ *
+ * [Added by Charles Karney] We use Leva's implementation of the Ratio
+ * method which avoids calling log() nearly all the time and makes the
+ * Ratio method faster than the Polar method (when it produces just one
+ * result per call).  Timing per call (gcc -O2 on 866MHz Pentium,
+ * average over 10^8 calls)
+ *
+ *   Polar method: 660 ns
+ *   Ratio method: 368 ns
+ *
  */
 
 /* Polar (Box-Mueller) method; See Knuth v2, 3rd ed, p122 */
@@ -47,8 +52,11 @@
     {
       /* choose x,y in uniform square (-1,-1) to (+1,+1) */
 
-      x = -1 + 2 * gsl_rng_uniform (r);
-      y = -1 + 2 * gsl_rng_uniform (r);
+      /* Use gsl_rng_uniform_pos so that square is centered on origin.
+       * If gsl_rng_uniform is used negative results would be slightly
+       * more likely than positive. */
+      x = -1 + 2 * gsl_rng_uniform_pos (r);
+      y = -1 + 2 * gsl_rng_uniform_pos (r);
 
       /* see if it is in the unit circle */
       r2 = x * x + y * y;
@@ -59,28 +67,54 @@
   return sigma * y * sqrt (-2.0 * log (r2) / r2);
 }
 
-/* Ratio method (Kinderman-Monahan); see Knuth v2, 3rd ed, p130 */
-/* K+M, ACM Trans Math Software 3 (1977) 257-260. */
+/* Ratio method (Kinderman-Monahan); see Knuth v2, 3rd ed, p130.
+ * K+M, ACM Trans Math Software 3 (1977) 257-260.
+ *
+ * [Added by Charles Karney] This is an implementation of Leva's
+ * modifications to the original K+M method; see:
+ * J. L. Leva, ACM Trans Math Software 18 (1992) 449-453 and 454-455. */
 
 double
 gsl_ran_gaussian_ratio_method (const gsl_rng * r, const double sigma)
 {
-  double u, v, x;
+  double u, v, x, y, Q;
+  const double s =  0.449871;   /* Constants from Leva */
+  const double t = -0.386595;
+  const double a =  0.19600;
+  const double b =  0.25472;
+  const double r1 = 0.27597;
+  const double r2 = 0.27846;
 
-  do
+  do                 /* This loop is executed 1.369 times on average  */
     {
-      v = gsl_rng_uniform (r);
-      do
-        {
-          u = gsl_rng_uniform (r);
-        }
-      while (u == 0);
-      /* Const 1.715... = sqrt(8/e) */
-      x = 1.71552776992141359295 * (v - 0.5) / u;
+      /* Generate a point P = (u, v) uniform in a rectangle enclosing
+       * the K+M region v^2 <= - 4 u^2 log(u). */
+
+      /* u in (0, 1] to avoid singularity at u = 0 */
+      u = 1 - gsl_rng_uniform (r);
+
+      /* v is in the asymmetric interval [-0.5, 0.5).  However v = -0.5
+       * is rejected in the last part of the while clause.  The
+       * resulting normal deviate is strictly symmetric about 0
+       * (provided that v is symmetric once v = -0.5 is excluded). */
+      v = gsl_rng_uniform (r) - 0.5;
+
+      /* Constant 1.7156 > sqrt(8/e) (for accuracy); but not by too
+       * much (for efficiency). */
+      v *= 1.7156;
+
+      /* Compute Leva's quadratic form Q */
+      x = u - s;
+      y = fabs(v) - t;
+      Q = x*x + y * (a*y - b*x);
     }
-  while (x * x > -4.0 * log (u));
+  while ( Q >= r1 &&            /* accept P if Q < r1 (Leva) */
+          ( Q > r2 ||           /* reject P if Q > r2 (Leva) */
+            /* Accept if v^2 <= -4 u^2 log(u) (K+M) */
+            /* This final test is executed 0.012 times on average. */
+            v*v > -4 * u*u * log(u) ) );
 
-  return sigma * x;
+  return sigma * v/u;           /* Return slope */
 }
 
 double

-- 
Charles Karney <ckarney@sarnoff.com>
Sarnoff Corporation, Princeton, NJ 08543-5300

URL: http://charles.karney.info
Tel: +1 609 734 2312
Fax: +1 609 734 2323


end of thread

Thread overview: 6+ messages
     [not found] <16887.49847.140489.31885@shearwater.sarnoff.com>
2005-01-26 17:52 ` GSL 1.6 random numbers (gauss.c and rng.c) James Theiler
2005-01-27 10:00   ` Charles Karney
2005-01-26 20:54     ` James Theiler
2005-01-27 10:00       ` GSL 1.6: patch for gauss.c Charles Karney
2005-01-27 10:00         ` GSL 1.6: patch for gauss.c -- CORRECTION + TIMING Charles Karney
2005-02-06  8:04           ` GSL 1.6: updated patch for gauss.c Charles Karney
