* Re: GSL 1.6 random numbers (gauss.c and rng.c) [not found] <16887.49847.140489.31885@shearwater.sarnoff.com> @ 2005-01-26 17:52 ` James Theiler 2005-01-27 10:00 ` Charles Karney 0 siblings, 1 reply; 6+ messages in thread From: James Theiler @ 2005-01-26 17:52 UTC (permalink / raw) To: Charles Karney; +Cc: gsl-discuss Hi Charles, And thanks for the comments, which with this email I am cc'ing to the gsl-discuss list. To the list, I am recommending that we adopt both of his suggestions, though perhaps thinking a bit about whether there is an even better solution to the issue he raised in the gsl_rng_uniform_int routine. On Wed, 26 Jan 2005, Charles Karney wrote: ] Hi! ] ] We exchanged some E-mail back in Feb 2003 concerning implementations of ] the Walker algorithm. And a valuable exchange it was, as I recall; thanks again for those comments as well. ] ] Now, I've got some comments on ] ] gsl_ran_gaussian_ratio_method ] gsl_rng_uniform_int ] ] in GSL 1.6 ] ] (a) [MINOR NIT] In gsl_ran_gaussian_ratio_method, the last digit of the ] constant for sqrt(8/e) is wrong. Worse still, the number you have ] is too small. Rounding the result UP still gives the correct ] results, however (at the expense of a few extra rejects). So you ] might replace ] ] x = 1.71552776992141359295 * (v - 0.5) / u; ] by ] x = 1.715527769921413593 * (v - 0.5) / u; ] or ] x = 1.71552777 * (v - 0.5) / u; ] ] In my implementation I round up after additional digits: ] ] 1.71552776992141359296037928256 Hmmm, if x were used solely as the rejection criterion, then I would be inclined to agree that there is no harm to accuracy (and utterly negligible harm to efficiency) to use a rounded-up value for the constant. But it's a little more complicated since x is also used as part of the returned variate. You may still be right that it doesn't matter, maybe everything just scales up, but it's not obvious to me. 
(Ok, on further thought, I've decided it's not obvious to me that rounding up is valid even at just the rejection stage ... but this is undoubtedly just a failing on my part; I'm not saying you are wrong, just that it's not obvious to me that rounding up is harmless.) However, I just used 'bc' to verify your constant, and I agree with you on that; the very least we should do is change that trailing '295' to '296'. I'd be happy to recommend changing that '295' to just '3' -- it's still bound to be adequately precise and it does round up. But what we should do is follow your practice, and use the still more precise number. No harm to us in being more precise, and if the round-up is a conceptually better thing, then all the better. ] ] (b) [EFFICIENCY] Leva gives quadratic tests to perform in ] gsl_ran_gaussian_ratio_method which result in log being evaluated ] 0.012 per normal deviate instead of 1.369 times. (Knuth's tests ] result in 0.232 logs/deviate). Depending on the relative speed of ] log vs other arithmetic operations, this might speed up ] gsl_ran_gaussian_ratio_method significantly. The reference is ] ] J. L. Leva, A Fast Normal Random Number Generator, ] ACM TOMS 18, 449-453 and 454-455 (1992). Worth looking into. In my experience, logs are pretty fast on machines I use (eg, Pentiums), so that the actual gains in reducing their number are often not as substantial as one might hope. But I haven't done any benchmarks lately, and besides, if this is a better algorithm, we should at least look into it. ] ] (c) [GENERALIZE?] In gsl_rng_uniform_int you quit if n > range. However ] within that test you could allow the case n-1 == range. Thus ] ] if (n > range) ] { ] if (n - 1 == range) ] return ((r->type->get) (r->state)) - offset; ] GSL_ERROR_VAL ("n exceeds maximum value of generator", ] GSL_EINVAL, 0) ; ] } ] ] (Of course if range = maxint - 1, n would not be representable...) ] I have to admit, I haven't looked at this particular function before. 
It is confusing to me; it looks like range is defined in a funny way; eg if min=0, max=99, shouldn't the range be 100 ? Or, if we are going to define range=max-min, then we should define the scale differently unsigned long int scale = (range + 1)/n; where the "+ 1" is added to what we have now. Then the rest proceeds correctly and even more efficiently. ...except when r->type->max is the maximum unsigned long (ie, 0xffffffffUL), which is the case for a lot of the generators. Then the "+ 1" causes an overflow. So the current algorithm does give correct values, it just isn't as efficient as it could be (if there were a nice way to deal with this overflow). eg, if n=(range+1)/2, -- eg, think n=50 in the case above where min=0 and max=99 -- then the do/while loop is executed twice as often as is actually necessary. However, as Charles points out, the current algorithm isn't quite correct when n-1==range. His fix solves that problem. Another alternative would be something like scale = ((range+1) > range ? range+1 : range)/n; except I'm not really sure it would work, and yes, it is ugly. jt -- James Theiler Space and Remote Sensing Sciences MS-B244, ISR-2, LANL Los Alamos National Laboratory Los Alamos, NM 87545 http://nis-www.lanl.gov/~jt ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: GSL 1.6 random numbers (gauss.c and rng.c) 2005-01-26 17:52 ` GSL 1.6 random numbers (gauss.c and rng.c) James Theiler @ 2005-01-27 10:00 ` Charles Karney 2005-01-26 20:54 ` James Theiler 0 siblings, 1 reply; 6+ messages in thread From: Charles Karney @ 2005-01-27 10:00 UTC (permalink / raw) To: James Theiler; +Cc: Charles Karney, gsl-discuss > From: James Theiler <jt@lanl.gov> > Subject: Re: GSL 1.6 random numbers (gauss.c and rng.c) > Date: Wed, 26 Jan 2005 10:52:03 -0700 (MST) > > ] Now, I've got some comments on > ] > ] gsl_ran_gaussian_ratio_method > ] gsl_rng_uniform_int > ] > ] in GSL 1.6 > ] > ] (a) [MINOR NIT] In gsl_ran_gaussian_ratio_method, the last digit of the > ] constant for sqrt(8/e) is wrong. Worse still, the number you have > ] is too small. Rounding the result UP still gives the correct > ] results, however (at the expense of a few extra rejects). So you > ] might replace > ] > ] x = 1.71552776992141359295 * (v - 0.5) / u; > ] by > ] x = 1.715527769921413593 * (v - 0.5) / u; > ] or > ] x = 1.71552777 * (v - 0.5) / u; > ] > ] In my implementation I round up after additional digits: > ] > ] 1.71552776992141359296037928256 > > Hmmm, if x were used solely as the rejection criterion, then I would > be inclined to agree that there is no harm to accuracy (and utterly > negligible harm to efficiency) to use a rounded-up value for the > constant. > > But it's a little more complicated since x is also used as part of the > returned variate. You may still be right that it doesn't matter, > maybe everything just scales up, but it's not obvious to me. If you check the description of the algorithm in Knuth or the original paper, you will see that you are just trying to sample a point uniformly in a funny shaped region. This is done by choosing a point in a rectangle enclosing the region and rejecting the point unless it's inside the region. The rectangle [0,1] x [-sqrt(2/e), sqrt(2/e)] snugly encloses the region. 
However [0,1] x [-1,1] would also "work" at some cost in efficiency and precision. Leva uses 1.7156 for his multiplier. > ] (b) [EFFICIENCY] Leva gives quadratic tests to perform in > ] gsl_ran_gaussian_ratio_method which result in log being evaluated > ] 0.012 per normal deviate instead of 1.369 times. (Knuth's tests > ] result in 0.232 logs/deviate). Depending on the relative speed of > ] log vs other arithmetic operations, this might speed up > ] gsl_ran_gaussian_ratio_method significantly. The reference is > ] > ] J. L. Leva, A Fast Normal Random Number Generator, > ] ACM TOMS 18, 449-453 and 454-455 (1992). > > Worth looking into. In my experience, logs are pretty fast on machines > I use (eg, Pentiums), so that the actual gains in reducing their > number are often not as substantial as one might hope. But I haven't > done any benchmarks lately, and besides, if this is a better > algorithm, we should at least look into it. I had mixed results on a Pentium with gcc. My previous implementation used Knuth's bounds. Compiling with -ffast-math, Leva gives a 2.5% speedup. Without -ffast-math, I got an 18% speedup. (As I recall, the Knuthian bounds gave a 25% speedup over the raw ratio method.) Note that the coefficients in Leva's quadratic bounds are only given to a few sig. figs. This is OK -- see discussion above about the sqrt(8/e) const. The only important thing is that the FINAL exit test x * x <= -4.0 * log (u) is accurate. One other comment on both gsl_ran_gaussian_ratio_method and gsl_ran_gaussian. I think you would be better off using gsl_rng_uniform_pos consistently in both routines. This has the property that (gsl_rng_uniform_pos - 0.5) is symmetrically distributed about zero. With gsl_rng_uniform, the gaussian routines return negative numbers slightly more often than positive ones. (A tiny effect -- but better to fix it now than to worry about simulation results 5 years down the road.) > ] (c) [GENERALIZE?] In gsl_rng_uniform_int you quit if n > range. 
However > ] within that test you could allow the case n-1 == range. Thus > ] > ] if (n > range) > ] { > ] if (n - 1 == range) > ] return ((r->type->get) (r->state)) - offset; > ] GSL_ERROR_VAL ("n exceeds maximum value of generator", > ] GSL_EINVAL, 0) ; > ] } > ] > ] (Of course if range = maxint - 1, n would not be representable...) > ] > > I have to admit, I haven't looked at this particular function before. > It is confusing to me; it looks like range is defined in a funny way; > eg if min=0, max=99, shouldn't the range be 100 ? Or, > if we are going to define range=max-min, then we > should define the scale differently > > unsigned long int scale = (range + 1)/n; > > where the "+ 1" is added to what we have now. Then the rest proceeds > correctly and even more efficiently. > > ...except when r->type->max is the maximum unsigned long (ie, > 0xffffffffUL), which is the case for a lot of the generators. Then > the "+ 1" causes an overflow. I suspect that this was merely to avoid overflow. Here's my independent C++ implementation of this functionality where I encountered the same problem. The comments tell the story... Here MM = 2^30 and UInt30() is Knuth's recommended RNG (returning a result in [0, MM)) // A random number in [0, 2^32). 
(The case n = 2^31 will need two trips through // the loop instead of one.) const unsigned long q = big ? 0xffffffffUL : MM; const unsigned long r = q / n; // r > 0 const unsigned long m = r * n; // 0 < m <= q unsigned long u; // Find a random number in [0,m). do { // For small n, this is executed once (since m is nearly q). // Worst cases are n = MM/2 + 1 and 2*MM, in which case the // loop is executed slightly less than twice on average. u = big ? Integer() : UInt30(); } while (u >= m); // Since m is r * n, u/r is unbiased random number in [0,n). An // unbiased alternative is to use u % n. However, division is // preferred, as it uses the "more random" leading bits in u. // This also makes Bool() and Integer(2) equivalent. return u / r; } } > > -- > James Theiler Space and Remote Sensing Sciences > MS-B244, ISR-2, LANL Los Alamos National Laboratory > Los Alamos, NM 87545 http://nis-www.lanl.gov/~jt One final remark. knuthran.c is out of date! See http://www-cs-faculty.stanford.edu/~knuth/news02.html#rng http://www-cs-faculty.stanford.edu/~knuth/programs/rng.c There are two significant changes: (1) The generator is "warmed up" more thoroughly. (This avoids problems with correlations between random streams with adjacent seeds.) (2) Only 100 out of every 1009 random numbers are used. (This is needed so that the "birthday test" is satisfied.) -- Charles Karney <ckarney@sarnoff.com> Sarnoff Corporation, Princeton, NJ 08543-5300 URL: http://charles.karney.info Tel: +1 609 734 2312 Fax: +1 609 734 2323 ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: GSL 1.6 random numbers (gauss.c and rng.c) 2005-01-27 10:00 ` Charles Karney @ 2005-01-26 20:54 ` James Theiler 2005-01-27 10:00 ` GSL 1.6: patch for gauss.c Charles Karney 0 siblings, 1 reply; 6+ messages in thread From: James Theiler @ 2005-01-26 20:54 UTC (permalink / raw) To: Charles Karney; +Cc: gsl-discuss On Wed, 26 Jan 2005, Charles Karney wrote: ] > From: James Theiler <jt@lanl.gov> ] > Subject: Re: GSL 1.6 random numbers (gauss.c and rng.c) ] > Date: Wed, 26 Jan 2005 10:52:03 -0700 (MST) ] > ] > ] Now, I've got some comments on ] > ] ] > ] gsl_ran_gaussian_ratio_method ] > ] gsl_rng_uniform_int ] > ] ] > ] in GSL 1.6 ] > ] ] > ] (a) [MINOR NIT] In gsl_ran_gaussian_ratio_method, the last digit of the ] > ] constant for sqrt(8/e) is wrong. Worse still, the number you have ] > ] is too small. Rounding the result UP still gives the correct ] > ] results, however (at the expense of a few extra rejects). So you ] > ] might replace ] > ] ] > ] x = 1.71552776992141359295 * (v - 0.5) / u; ] > ] by ] > ] x = 1.715527769921413593 * (v - 0.5) / u; ] > ] or ] > ] x = 1.71552777 * (v - 0.5) / u; ] > ] ] > ] In my implementation I round up after additional digits: ] > ] ] > ] 1.71552776992141359296037928256 ] > ] > Hmmm, if x were used solely as the rejection criterion, then I would ] > be inclined to agree that there is no harm to accuracy (and utterly ] > negligible harm to efficiency) to use a rounded-up value for the ] > constant. ] > ] > But it's a little more complicated since x is also used as part of the ] > returned variate. You may still be right that it doesn't matter, ] > maybe everything just scales up, but it's not obvious to me. ] ] If you check the description of the algorithm in Knuth or the original ] paper, you will see that you are just trying to sample a point uniformly ] in a funny shaped region. This is done by choosing a point in a ] rectangle enclosing the region and rejecting the point unless it's ] inside the region. 
The rectangle [0,1] x [-sqrt(2/e), sqrt(2/e)] snugly ] encloses the region. However [0,1] x [-1,1] would also "work" at some ] cost in efficiency and precision. Leva uses 1.7156 for his multiplier. Thanks. With that explanation, it _is_ obvious. And on those grounds, I like using 1.7156; someday you might care about statistical precision to one part in 10^16, but you'll never care about efficiency differences of one part in ten thousand. ] ] > ] (b) [EFFICIENCY] Leva gives quadratic tests to perform in ] > ] gsl_ran_gaussian_ratio_method which result in log being evaluated ] > ] 0.012 per normal deviate instead of 1.369 times. (Knuth's tests ] > ] result in 0.232 logs/deviate). Depending on the relative speed of ] > ] log vs other arithmetic operations, this might speed up ] > ] gsl_ran_gaussian_ratio_method significantly. The reference is ] > ] ] > ] J. L. Leva, A Fast Normal Random Number Generator, ] > ] ACM TOMS 18, 449-453 and 454-455 (1992). ] > ] > Worth looking into. In my experience, logs are pretty fast on machines ] > I use (eg, Pentiums), so that the actual gains in reducing their ] > number are often not as substantial as one might hope. But I haven't ] > done any benchmarks lately, and besides, if this is a better ] > algorithm, we should at least look into it. ] ] I had mixed results on a Pentium with gcc. My previous implementation ] used Knuth's bounds. Compiling with -ffast-math, Leva gives a 2.5% ] speedup. Without -ffast-math, I got an 18% speedup. (As I recall, the ] Knuthian bounds gave a 25% speedup over the raw ratio method.) ] ] Note that the coefficients in Leva's quadratic bounds are only given to ] a few sig. figs. This is OK -- see discussion above about the ] sqrt(8/e) const. The only important thing is that the FINAL exit test x ] * x <= -4.0 * log (u) is accurate. ] ] One other comment on both gsl_ran_gaussian_ratio_method and ] gsl_ran_gaussian. 
I think you would be better off using ] gsl_rng_uniform_pos consistently in both routines. This has the ] property that (gsl_rng_uniform_pos - 0.5) is symmetrically distributed ] about zero. With gsl_rng_uniform, the gaussian routines return negative ] numbers slightly more often than positive ones. (A tiny effect -- but ] better to fix it now than to worry about simulation results 5 years down ] the road.) Good point; I agree. ] ] > ] (c) [GENERALIZE?] In gsl_rng_uniform_int you quit if n > range. However ] > ] within that test you could allow the case n-1 == range. Thus ] > ] ] > ] if (n > range) ] > ] { ] > ] if (n - 1 == range) ] > ] return ((r->type->get) (r->state)) - offset; ] > ] GSL_ERROR_VAL ("n exceeds maximum value of generator", ] > ] GSL_EINVAL, 0) ; ] > ] } ] > ] ] > ] (Of course if range = maxint - 1, n would not be representable...) ] > ] ] > ] > I have to admit, I haven't looked at this particular function before. ] > It is confusing to me; it looks like range is defined in a funny way; ] > eg if min=0, max=99, shouldn't the range be 100 ? Or, ] > if we are going to define range=max-min, then we ] > should define the scale differently ] > ] > unsigned long int scale = (range + 1)/n; ] > ] > where the "+ 1" is added to what we have now. Then the rest proceeds ] > correctly and even more efficiently. ] > ] > ...except when r->type->max is the maximum unsigned long (ie, ] > 0xffffffffUL), which is the case for a lot of the generators. Then ] > the "+ 1" causes an overflow. ] ] I suspect that this was merely to avoid overflow. Here's my independent ] C++ implemention of this functionality where I encountered the same ] problem. The comments tell the story... ] ] Here MM = 2^30 and UInt30() is Knuth's recommended RNG (returning a ] result in [0, MM)) ] ] // A random number in [0, 2^32). 
] unsigned long Random::Integer() { ] // xor 2 30-bit numbers to give 32-bit result ] return (UInt30() << 2) ^ UInt30(); ] } ] ] // Return random number in [0,n) ] unsigned long Random::Integer(const unsigned long n) throw() { ] if (n == 0) ] // Special case, equivalent to Integer(2^32). (Otherwise n == 0 ] // is meaningless.) ] return Integer(); ] else if (n == 1) ] // Saves using up a random number. ] return 0; ] else { ] // Do we need 32-bit randoms as our raw material? ] const bool big = n > MM; ] // The maximum random that we need. We'd like the first number ] // to be 2^32, but we can't deal with such large numbers. The ] // following will give the correct results at a slight cost in ] // efficiency. (The case n = 2^31 will need two trips through ] // the loop instead of one.) ] const unsigned long q = big ? 0xffffffffUL : MM; ] const unsigned long r = q / n; // r > 0 ] const unsigned long m = r * n; // 0 < m <= q ] unsigned long u; // Find a random number in [0,m). ] do { ] // For small n, this is executed once (since m is nearly q). ] // Worst cases are n = MM/2 + 1 and 2*MM, in which case the ] // loop is executed slightly less than twice on average. ] u = big ? Integer() : UInt30(); ] } while (u >= m); ] // Since m is r * n, u/r is unbiased random number in [0,n). An ] // unbiased alternative is to use u % n. However, division is ] // preferred, as it uses the "more random" leading bits in u. ] // This also makes Bool() and Integer(2) equivalent. ] return u / r; ] } ] } Nicely written. I like the use of 0 to indicate 2^32. ] ] > ] > -- ] > James Theiler Space and Remote Sensing Sciences ] > MS-B244, ISR-2, LANL Los Alamos National Laboratory ] > Los Alamos, NM 87545 http://nis-www.lanl.gov/~jt ] ] One final remark. knuthran.c is out of date! See ] ] http://www-cs-faculty.stanford.edu/~knuth/news02.html#rng ] http://www-cs-faculty.stanford.edu/~knuth/programs/rng.c ] ] There are two significant changes: ] ] (1) The generator is "warmed up" more thoroughly. 
(This avoids problems ] with correlations between random streams with adjacent seeds.) ] ] (2) Only 100 out of every 1009 random numbers are used. (This is needed ] so that the "birthday test" is satisfied.) ] ] This is good to know. jt -- James Theiler Space and Remote Sensing Sciences MS-B244, ISR-2, LANL Los Alamos National Laboratory Los Alamos, NM 87545 http://nis-www.lanl.gov/~jt ^ permalink raw reply [flat|nested] 6+ messages in thread
* GSL 1.6: patch for gauss.c 2005-01-26 20:54 ` James Theiler @ 2005-01-27 10:00 ` Charles Karney 2005-01-27 10:00 ` GSL 1.6: patch for gauss.c -- CORRECTION + TIMING Charles Karney 0 siblings, 1 reply; 6+ messages in thread From: Charles Karney @ 2005-01-27 10:00 UTC (permalink / raw) To: James Theiler; +Cc: gsl-discuss OK, here's a patch to gauss.c (v 1.6) to implement Leva's bounds on the ratio method for normal deviates. It also uses gsl_rng_uniform_pos for both Gaussian methods to ensure that the resulting normal deviates are symmetric. I checked Leva's paper on the timing...

    Polar:                 333
    Ratio:                 411
    Ratio+Knuth's bounds:  197
    Ratio+Leva's bounds:   166
    Polar (GSL):           666

These are times (us) per normal deviate on a 1992 era computer and I've extrapolated to Polar (GSL), which only uses one of the pair of normal deviates that the polar method produces. Because modern computers implement log pretty efficiently, the benefits of using the bounds are not as great as shown here. But I think that you will find that the ratio method should comfortably beat the polar method.

--- randist/gauss.c.orig 2003-07-25 11:18:13.000000000 -0400 +++ randist/gauss.c 2005-01-26 17:25:26.000000000 -0500 @@ -28,8 +28,8 @@ * deviates. We don't produce two, because then we'd have to save one * in a static variable for the next call, and that would screws up * re-entrant or threaded code, so we only produce one. This makes - * the Ratio method suddenly more appealing. There are further tests - * one can make if the log() is slow. See Knuth for details */ + * the Ratio method suddenly more appealing. We include Leva's tests + * in the Ratio method to avoid computing the log() very often. 
*/ /* Both methods pass the statistical tests; but the polar method * seems to be a touch faster on my home Pentium, EVEN though we @@ -47,8 +47,9 @@ { /* choose x,y in uniform square (-1,-1) to (+1,+1) */ - x = -1 + 2 * gsl_rng_uniform (r); - y = -1 + 2 * gsl_rng_uniform (r); + /* Use gsl_rng_uniform_pos so that square is centered on origin. */ + x = -1 + 2 * gsl_rng_uniform_pos (r); + y = -1 + 2 * gsl_rng_uniform_pos (r); /* see if it is in the unit circle */ r2 = x * x + y * y; @@ -61,26 +62,38 @@ /* Ratio method (Kinderman-Monahan); see Knuth v2, 3rd ed, p130 */ /* K+M, ACM Trans Math Software 3 (1977) 257-260. */ +/* Leva, ACM Trans Math Software 18 (1992) 449-453 and 454-455. */ +/* This is a straighforward conversion of Leva's Fortran code. */ double gsl_ran_gaussian_ratio_method (const gsl_rng * r, const double sigma) { - double u, v, x; + double u, v, x, y, Q; + const double s = 0.449871; + const double t = -0.386595; + const double a = 0.19600; + const double b = 0.25472; + const double r1 = 0.27597; + const double r2 = 0.27846; - do + do /* This loop is executed 1.369 times on average */ { - v = gsl_rng_uniform (r); - do - { - u = gsl_rng_uniform (r); - } - while (u == 0); - /* Const 1.715... = sqrt(8/e) */ - x = 1.71552776992141359295 * (v - 0.5) / u; + /* Generate a point P = (u, v) uniform in a rectangle */ + u = gsl_rng_uniform_pos (r); /* NB avoid u = 0 */ + /* Const 1.7156 > sqrt(8/e) -- but not by too much*/ + v = 1.7156 * (gsl_rng_uniform_pos (r) - 0.5); /* v is centered */ + x = u - s; + y = abs(v) - t; + /* Compute quadratic form Q */ + Q = x*x + y * (a*y - b*x); } - while (x * x > -4.0 * log (u)); + while ( Q >= r1 && /* accept P if Q < r1 */ + ( Q > r2 || /* reject P if Q > r2 */ + /* Accept if v^2 <= -4 u^2 log(u) */ + /* This final test is executed 0.012 times on average. 
*/ + v*v > -4 * u*u * log(u) ) ); - return sigma * x; + return sigma * v/u; /* Return slope */ } double -- Charles Karney <ckarney@sarnoff.com> Sarnoff Corporation, Princeton, NJ 08543-5300 URL: http://charles.karney.info Tel: +1 609 734 2312 Fax: +1 609 734 2323 ^ permalink raw reply [flat|nested] 6+ messages in thread
* GSL 1.6: patch for gauss.c -- CORRECTION + TIMING 2005-01-27 10:00 ` GSL 1.6: patch for gauss.c Charles Karney @ 2005-01-27 10:00 ` Charles Karney 2005-02-06 8:04 ` GSL 1.6: updated patch for gauss.c Charles Karney 0 siblings, 1 reply; 6+ messages in thread From: Charles Karney @ 2005-01-27 10:00 UTC (permalink / raw) To: James Theiler; +Cc: gsl-discuss The patch to randist/gauss.c that I sent out at Date: Wed, 26 Jan 2005 17:29:26 -0500 has a fatal C++'ism creeping in. Please replace y = abs(v) - t; by y = fabs(v) - t; Sorry about that! I did a timing test using the default basic RNG with gcc -O2 optimization on a Linux Pentium 866 MHz machine. The results are:

    GSL polar method: 640 ns
    Old ratio method: 630 ns
    New ratio method: 380 ns

So Leva's bounds give a nice speedup of about 1.6 over the raw ratio method and the polar method. Perhaps gsl_ran_gaussian_ratio_method should be the default routine for normal deviates? A PostScript on knuthran. With Knuth's latest recommendations, knuthran passes all the "Die Hard" tests and so qualifies as a major league RNG (?). (This is the raw RNG that I use.) -- Charles Karney <ckarney@sarnoff.com> Sarnoff Corporation, Princeton, NJ 08543-5300 URL: http://charles.karney.info Tel: +1 609 734 2312 Fax: +1 609 734 2323 ^ permalink raw reply [flat|nested] 6+ messages in thread
* GSL 1.6: updated patch for gauss.c 2005-01-27 10:00 ` GSL 1.6: patch for gauss.c -- CORRECTION + TIMING Charles Karney @ 2005-02-06 8:04 ` Charles Karney 0 siblings, 0 replies; 6+ messages in thread From: Charles Karney @ 2005-02-06 8:04 UTC (permalink / raw) To: gsl-discuss; +Cc: James Theiler Here's an updated patch for gauss.c (GSL 1.6). It includes the information on the timing, corrects the abs/fabs bug, and reverts to gsl_rng_uniform for the "v" coordinate. I had earlier argued that having v in [-0.5, 0.5) resulted in an asymmetric result. This is bogus. The endpoint v=-0.5 gets rejected in the while clause, so there's no need to add the penalty of calling gsl_rng_uniform_pos. (Note however, that this argument does NOT extend to the polar method. There you really do need to avoid the edges of the square to guarantee that the result is symmetric about zero.) Finally I use 1-gsl_rng_uniform for u, since I presume that this will be more efficient than calling gsl_rng_uniform_pos. --- randist/gauss.c.orig 2003-07-25 11:18:13.000000000 -0400 +++ randist/gauss.c 2005-02-03 10:33:42.000000000 -0500 @@ -28,12 +28,17 @@ * deviates. We don't produce two, because then we'd have to save one * in a static variable for the next call, and that would screws up * re-entrant or threaded code, so we only produce one. This makes - * the Ratio method suddenly more appealing. There are further tests - * one can make if the log() is slow. See Knuth for details */ - -/* Both methods pass the statistical tests; but the polar method - * seems to be a touch faster on my home Pentium, EVEN though we - * are only using half of the available random deviates! + * the Ratio method suddenly more appealing. + * + * [Added by Charles Karney] We use Leva's implementation of the Ratio + * method which avoids calling log() nearly all the time and makes the + * Ratio method faster than the Polar method (when it produces just one + * result per call). 
Timing per call (gcc -O2 on 866MHz Pentium, + * average over 10^8 calls) + * + * Polar method: 660 ns + * Ratio method: 368 ns + * */ /* Polar (Box-Mueller) method; See Knuth v2, 3rd ed, p122 */ @@ -47,8 +52,11 @@ { /* choose x,y in uniform square (-1,-1) to (+1,+1) */ - x = -1 + 2 * gsl_rng_uniform (r); - y = -1 + 2 * gsl_rng_uniform (r); + /* Use gsl_rng_uniform_pos so that square is centered on origin. + * If gsl_rng_uniform is used negative results would be slightly + * more likely than positive. */ + x = -1 + 2 * gsl_rng_uniform_pos (r); + y = -1 + 2 * gsl_rng_uniform_pos (r); /* see if it is in the unit circle */ r2 = x * x + y * y; @@ -59,28 +67,54 @@ return sigma * y * sqrt (-2.0 * log (r2) / r2); } -/* Ratio method (Kinderman-Monahan); see Knuth v2, 3rd ed, p130 */ -/* K+M, ACM Trans Math Software 3 (1977) 257-260. */ +/* Ratio method (Kinderman-Monahan); see Knuth v2, 3rd ed, p130. + * K+M, ACM Trans Math Software 3 (1977) 257-260. + * + * [Added by Charles Karney] This is an implementation of Leva'a + * modifications to the original K+M method; see: + * J. L. Leva, ACM Trans Math Software 18 (1992) 449-453 and 454-455. */ double gsl_ran_gaussian_ratio_method (const gsl_rng * r, const double sigma) { - double u, v, x; + double u, v, x, y, Q; + const double s = 0.449871; /* Constants from Leva */ + const double t = -0.386595; + const double a = 0.19600; + const double b = 0.25472; + const double r1 = 0.27597; + const double r2 = 0.27846; - do + do /* This loop is executed 1.369 times on average */ { - v = gsl_rng_uniform (r); - do - { - u = gsl_rng_uniform (r); - } - while (u == 0); - /* Const 1.715... = sqrt(8/e) */ - x = 1.71552776992141359295 * (v - 0.5) / u; + /* Generate a point P = (u, v) uniform in a rectangle enclosing + * the K+M region v^2 <= - 4 u^2 log(u). */ + + /* u in (0, 1] to avoid singularity at u = 0 */ + u = 1 - gsl_rng_uniform (r); + + /* v is in the asymmetric interval [-0.5, 0.5). 
However v = -0.5 + * is rejected in the last part of the while clause. The + * resulting normal deviate is strictly symmetric about 0 + * (provided that v is symmetric once v = -0.5 is excluded). */ + v = gsl_rng_uniform (r) - 0.5; + + /* Constant 1.7156 > sqrt(8/e) (for accuracy); but not by too + * much (for efficiency). */ + v *= 1.7156; + + /* Compute Leva's quadratic form Q */ + x = u - s; + y = fabs(v) - t; + Q = x*x + y * (a*y - b*x); } - while (x * x > -4.0 * log (u)); + while ( Q >= r1 && /* accept P if Q < r1 (Leva) */ + ( Q > r2 || /* reject P if Q > r2 (Leva) */ + /* Accept if v^2 <= -4 u^2 log(u) (K+M) */ + /* This final test is executed 0.012 times on average. */ + v*v > -4 * u*u * log(u) ) ); - return sigma * x; + return sigma * v/u; /* Return slope */ } double -- Charles Karney <ckarney@sarnoff.com> Sarnoff Corporation, Princeton, NJ 08543-5300 URL: http://charles.karney.info Tel: +1 609 734 2312 Fax: +1 609 734 2323 ^ permalink raw reply [flat|nested] 6+ messages in thread