public inbox for gsl-discuss@sourceware.org
 help / color / mirror / Atom feed
* RNG question
@ 2008-08-03 23:32 Robert G. Brown
  2008-08-05  9:05 ` Fabian Bastin
  2008-08-08 11:12 ` Brian Gough
  0 siblings, 2 replies; 11+ messages in thread
From: Robert G. Brown @ 2008-08-03 23:32 UTC (permalink / raw)
  To: GSL Discussion list

I'm preparing to address an rng testing issue in dieharder to fix the
following bug/feature.  dieharder tests all the gsl generators as
embedded tests (making it easy for end users to study all its generators
and assess suitability for various tasks or demonstrate the weaknesses
of the classical weak generators).  It then uses the gsl harness to add
more rngs to test -- it is actually quite easy to wrap any new candidate
rng in the gsl rng format and just add it to the list that can be
invoked to directly test as opposed to test via file input (which is
much more limited).

However, when I add my own gsl rng types with a loop at the end of the
gsl-provided list, they pick up sequential numbers.  This means that if
any new gsl routines are added (as has happened a few times) all the
non-gsl routines have their numbers bumped.  If I add or rearrange any
of my own (since some of them group by type, it makes more sense to keep
them "together" in number-space) this also can change the number of
existing rngs.

I didn't view this as a problem during design, and the default behavior
of dieharder without arguments is to spit out a list of all its known
rngs just so people could see what was what.  HOWEVER, users have
started to make scripts that tests certain rngs, by number, and when the
numbers bump as described above it breaks their scripts.

There are obviously several ways I can fix it so this never happens
again, but I thought before I implemented any of them I'd ask at least
if it is now expected that the rngs in the gsl are "frozen", or if there
is an ongoing possibility that more will be added?  I'm guessing the
latter -- if somebody invents a really great one you can hardly not
include it in the GSL, and I've got a few that I might be able to
contribute back eventually as well, if there is any interest.

The other question I have is that at one point in time the maximum
number of rngs one could have was restricted by a macro in the sources
to be 100 (if I recall correctly -- I have a remark to that effect in my
own code's comments).  One solution to the dilemma above is to create
"ranges" of numbers -- 0-99 for gsl rngs, 100-199 for dieharder added
rngs, 200+ for user added rngs.  This would, however, require that the
macro/variable's value be bumped in the gsl to maybe 500 or 1000.
Otherwise I think it is not impossible that dieharder will exhaust the
current gsl space in the next few years, as people keep contributing new
rngs at a slow but steady pace, and I've got a small stack of them
standing by to add when I next get a chance.

Comments?  Answers?

    rgb

-- 
Robert G. Brown                            Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Web: http://www.phy.duke.edu/~rgb
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RNG question
  2008-08-03 23:32 RNG question Robert G. Brown
@ 2008-08-05  9:05 ` Fabian Bastin
  2008-08-08 11:12 ` Brian Gough
  1 sibling, 0 replies; 11+ messages in thread
From: Fabian Bastin @ 2008-08-05  9:05 UTC (permalink / raw)
  To: GSL Discussion list

Just a comment. When testing random numbers generators, you should 
consider TestU01: http://www.iro.umontreal.ca/~simardr/testu01/tu01.html

This software is more recent than Diehard, has a more exhaustive list of 
generators and statistical tests, so it is reasonable to expect that 
some users would like to use it instead of Diehard. Just consider this 
citation: "It should be obvious that TESTU01 has supplanted DIEHARD, and 
TESTU01 is now the. standard for testing uniform RNGs" (B. D. 
McCullough, A review of TESTU01, Journal of Applied Econometrics 21(5), 
2006).

IMHO, in view of current research in this field, it is clear that new 
rngs will be proposed, so the list cannot be frozen (for instance, why 
not include the generator MRG32K3A of L'Ecuyer, as it is a well know 
reference, cited for instance by Law in his book Simulation Modeling and 
Analysis?), and not be limited to some test code since new software can 
be proposed at any time.

Therefore, I think that any arbitrary limitation should be avoided, 
except no reasonable alternative exists.

Fabian

Robert G. Brown wrote:
> I'm preparing to address an rng testing issue in dieharder to fix the
> following bug/feature.  dieharder tests all the gsl generators as
> embedded tests (making it easy for end users to study all its generators
> and assess suitability for various tasks or demonstrate the weaknesses
> of the classical weak generators).  It then uses the gsl harness to add
> more rngs to test -- it is actually quite easy to wrap any new candidate
> rng in the gsl rng format and just add it to the list that can be
> invoked to directly test as opposed to test via file input (which is
> much more limited).
> 
> However, when I add my own gsl rng types with a loop at the end of the
> gsl-provided list, they pick up sequential numbers.  This means that if
> any new gsl routines are added (as has happened a few times) all the
> non-gsl routines have their numbers bumped.  If I add or rearrange any
> of my own (since some of them group by type, it makes more sense to keep
> them "together" in number-space) this also can change the number of
> existing rngs.
> 
> I didn't view this as a problem during design, and the default behavior
> of dieharder without arguments is to spit out a list of all its known
> rngs just so people could see what was what.  HOWEVER, users have
> started to make scripts that tests certain rngs, by number, and when the
> numbers bump as described above it breaks their scripts.
> 
> There are obviously several ways I can fix it so this never happens
> again, but I thought before I implemented any of them I'd ask at least
> if it is now expected that the rngs in the gsl are "frozen", or if there
> is an ongoing possibility that more will be added?  I'm guessing the
> latter -- if somebody invents a really great one you can hardly not
> include it in the GSL, and I've got a few that I might be able to
> contribute back eventually as well, if there is any interest.
> 
> The other question I have is that at one point in time the maximum
> number of rngs one could have was restricted by a macro in the sources
> to be 100 (if I recall correctly -- I have a remark to that effect in my
> own code's comments).  One solution to the dilemma above is to create
> "ranges" of numbers -- 0-99 for gsl rngs, 100-199 for dieharder added
> rngs, 200+ for user added rngs.  This would, however, require that the
> macro/variable's value be bumped in the gsl to maybe 500 or 1000.
> Otherwise I think it is not impossible that dieharder will exhaust the
> current gsl space in the next few years, as people keep contributing new
> rngs at a slow but steady pace, and I've got a small stack of them
> standing by to add when I next get a chance.
> 
> Comments?  Answers?
> 
>    rgb
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RNG question
  2008-08-03 23:32 RNG question Robert G. Brown
  2008-08-05  9:05 ` Fabian Bastin
@ 2008-08-08 11:12 ` Brian Gough
  2008-08-08 12:33   ` Robert G. Brown
  1 sibling, 1 reply; 11+ messages in thread
From: Brian Gough @ 2008-08-08 11:12 UTC (permalink / raw)
  To: GSL Discussion list

At Sun, 3 Aug 2008 19:28:52 -0400 (EDT),
Robert G. Brown wrote:
> There are obviously several ways I can fix it so this never happens
> again, but I thought before I implemented any of them I'd ask at least
> if it is now expected that the rngs in the gsl are "frozen", or if there
> is an ongoing possibility that more will be added? 

Yes, I expect there will be some new ones (or corrections to existing
ones).

> The other question I have is that at one point in time the maximum
> number of rngs one could have was restricted by a macro in the sources
> to be 100 (if I recall correctly -- I have a remark to that effect in my
> own code's comments).

The macro is really an internal implementation detail and not intended
to be a limit (as far as I'm aware it's not publically accessible).
There are currently about 60 generators so if you wanted to give them
numbers 100 is too small but 1000 or 500 shouldn't be a problem.


-- 
Brian Gough

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RNG question
  2008-08-08 11:12 ` Brian Gough
@ 2008-08-08 12:33   ` Robert G. Brown
  2008-08-11 20:47     ` Brian Gough
  2008-08-18 21:12     ` Robert G. Brown
  0 siblings, 2 replies; 11+ messages in thread
From: Robert G. Brown @ 2008-08-08 12:33 UTC (permalink / raw)
  To: Brian Gough; +Cc: GSL Discussion list

On Fri, 8 Aug 2008, Brian Gough wrote:

> At Sun, 3 Aug 2008 19:28:52 -0400 (EDT),
> Robert G. Brown wrote:
>> There are obviously several ways I can fix it so this never happens
>> again, but I thought before I implemented any of them I'd ask at least
>> if it is now expected that the rngs in the gsl are "frozen", or if there
>> is an ongoing possibility that more will be added?
>
> Yes, I expect there will be some new ones (or corrections to existing
> ones).
>
>> The other question I have is that at one point in time the maximum
>> number of rngs one could have was restricted by a macro in the sources
>> to be 100 (if I recall correctly -- I have a remark to that effect in my
>> own code's comments).
>
> The macro is really an internal implementation detail and not intended
> to be a limit (as far as I'm aware it's not publically accessible).
> There are currently about 60 generators so if you wanted to give them
> numbers 100 is too small but 1000 or 500 shouldn't be a problem.

I looked over the code again and decided that was probably true on my
own.  I'm preparing to basically clone the types.c module but create a
larger vector of types.  I'll then run through a loop to populated it
with the gsl ones, then add my own, in ranges.

Are future/new rngs going to always be added at the end (new numbers) or
will they get inserted in typed groupings?  I ask because IIRC the new
MT's went in next to the original MT and displaced all the numbers.
This makes logical sense, but breaks code that has hard coded rng
numbers.  I can make dieharder work either way, but to lock in the
numbers in dieharder in the latter case I have to work harder...;-).

   rgb

-- 
Robert G. Brown                            Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Web: http://www.phy.duke.edu/~rgb
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RNG question
  2008-08-08 12:33   ` Robert G. Brown
@ 2008-08-11 20:47     ` Brian Gough
  2008-08-18 21:12     ` Robert G. Brown
  1 sibling, 0 replies; 11+ messages in thread
From: Brian Gough @ 2008-08-11 20:47 UTC (permalink / raw)
  To: Robert G. Brown; +Cc: GSL Discussion list

At Fri, 8 Aug 2008 08:28:07 -0400 (EDT),
Robert G. Brown wrote:
> Are future/new rngs going to always be added at the end (new numbers) or
> will they get inserted in typed groupings? 

They could go anywhere, the order is not guaranteed to be preserved.

-- 
Brian Gough

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: RNG question
  2008-08-08 12:33   ` Robert G. Brown
  2008-08-11 20:47     ` Brian Gough
@ 2008-08-18 21:12     ` Robert G. Brown
  1 sibling, 0 replies; 11+ messages in thread
From: Robert G. Brown @ 2008-08-18 21:12 UTC (permalink / raw)
  To: Brian Gough; +Cc: GSL Discussion list

On Fri, 8 Aug 2008, Robert G. Brown wrote:

>> The macro is really an internal implementation detail and not intended
>> to be a limit (as far as I'm aware it's not publically accessible).
>> There are currently about 60 generators so if you wanted to give them
>> numbers 100 is too small but 1000 or 500 shouldn't be a problem.
>
> I looked over the code again and decided that was probably true on my
> own.  I'm preparing to basically clone the types.c module but create a
> larger vector of types.  I'll then run through a loop to populate it
> with the gsl ones, then add my own, in ranges.

I've done exactly this, and it works fine in dieharder 2.27.11.

If you are interested in contributed rngs all wrapped up and ready to
literally drop into place, let me know.  I have gsl-ready code for the
hardware/entropy /dev/random "generator", the related /dev/urandom
"generator" and the (BSD) /dev/arandom "generator" if you'd like it.  I
put these inside conditionals so that they are only added and show up in
the list of available generators (which I generate with fairly obvious
code) if they exist.

A suggestion:  I cannot realloc the built-in types[] (I'm pretty sure)
because it is statically allocated in types.c.  Right now in order to
get a types[] long enough to hold the GSL generator and the various
source dieharder generators in ranges, I have to allocate space for e.g.
dieharder_types[1000], create a pointer to gsl_types, initialize
gsl_types with gsl_rng_types_setup(), and then copy over the types (or
duplicate the specific code in gsl_rng_types_setup(), which then might
change without warning as GSL bumps a version and leave dieharder out of
sync with GSL where they overlap).  I then fill in my own and
contributed and R-derived generators in various ranges in
dieharder_types and it all works fine, BUT it is a bit of a hassle and
the alternative is a segment violation.

So, a minor request for a future snapshot of the GSL:  I think that if
you JUST increased N to 1000 (and maybe set the entire types vector to
zero before filling it as that makes it easy to test for where
generators are installed in the types vector) it wouldn't affect current
code in any way -- as you say it is hidden inside the source -- but it
would keep me from having to allocate a types[] vector of my own as I
could then just fill in the upper ranges of the one returned by the GSL.
I can easily imagine filling up the GSL-provided vector with only 100
slots, but I think it will take a bit of time to fill up a GSL-provided
types vector with 1000 slots.

> Are future/new rngs going to always be added at the end (new numbers) or
> will they get inserted in typed groupings?  I ask because IIRC the new
> MT's went in next to the original MT and displaced all the numbers.
> This makes logical sense, but breaks code that has hard coded rng
> numbers.  I can make dieharder work either way, but to lock in the
> numbers in dieharder in the latter case I have to work harder...;-).

I've decided to stick with the GSL numbers rigorously (reserving the
bottom 200 for the current GSL and any future expansion) -- if they
shift they shift.  Just remember that if you add new e.g. MT's grouped
in with the old ones (and displace e.g. the taus generators up a couple
of number-notches) you WILL break a lot of code that uses the old
numbers (which will all change).  I personally don't care -- dieharder
documents the generator name-to-number match automagically -- but
somebody that selected a generator by number in simulation code might
not notice and could end up getting "wrong answers" in the worst case
scenario as they run a few hundred CPU-months of simulation using a
different generator from the one they thought they were using...;-)

     rgb

-- 
Robert G. Brown                            Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Web: http://www.phy.duke.edu/~rgb
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: rng question
  2005-12-17 16:50     ` Brian Gough
@ 2005-12-17 23:58       ` Jari Häkkinen
  0 siblings, 0 replies; 11+ messages in thread
From: Jari Häkkinen @ 2005-12-17 23:58 UTC (permalink / raw)
  To: Brian Gough; +Cc: gsl-discuss

Thank you, I tested the latest CVS and GSL now behaves as expected.


Jari


Brian Gough wrote:
> Jari Häkkinen writes:
>  > No, the attached program does not return from the call to 
>  > gsl_rng_uniform_int.
>  > 
>  > The problem shows up on my powermac running MacOSX 10.4.3 with gcc 
>  > version powerpc-apple-darwin8-gcc-4.0.0 (GCC) 4.0.0 20041026 (Apple 
>  > Computer, Inc. build 4061).
>  > I compile GSL from CVS (20051208 = today) with with flags -NDEBUG -O2. 
>  > The test program is compiled with a plain 'gcc -o rnd_test rnd_test.c 
>  > -lgsl'.
>  > 
>  > Now, I also run the test program on a SuSE based machine, and the call 
>  > fails with an floating point exception as expected.
> 
> Thanks. I've added a test to catch the case n=0 and return an error, and
> added some notes in the documentation to explain how the function works.
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: rng question
  2005-12-09 13:49   ` Jari Häkkinen
@ 2005-12-17 16:50     ` Brian Gough
  2005-12-17 23:58       ` Jari Häkkinen
  0 siblings, 1 reply; 11+ messages in thread
From: Brian Gough @ 2005-12-17 16:50 UTC (permalink / raw)
  To: Jari Häkkinen; +Cc: gsl-discuss

Jari Häkkinen writes:
 > No, the attached program does not return from the call to 
 > gsl_rng_uniform_int.
 > 
 > The problem shows up on my powermac running MacOSX 10.4.3 with gcc 
 > version powerpc-apple-darwin8-gcc-4.0.0 (GCC) 4.0.0 20041026 (Apple 
 > Computer, Inc. build 4061).
 > I compile GSL from CVS (20051208 = today) with with flags -NDEBUG -O2. 
 > The test program is compiled with a plain 'gcc -o rnd_test rnd_test.c 
 > -lgsl'.
 > 
 > Now, I also run the test program on a SuSE based machine, and the call 
 > fails with an floating point exception as expected.

Thanks. I've added a test to catch the case n=0 and return an error, and
added some notes in the documentation to explain how the function works.

-- 
Brian Gough

Network Theory Ltd,
Publishing Free Software Manuals --- http://www.network-theory.co.uk/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: rng question
  2005-12-07 20:07 ` Brian Gough
@ 2005-12-09 13:49   ` Jari Häkkinen
  2005-12-17 16:50     ` Brian Gough
  0 siblings, 1 reply; 11+ messages in thread
From: Jari Häkkinen @ 2005-12-09 13:49 UTC (permalink / raw)
  To: Brian Gough; +Cc: gsl-discuss

[-- Attachment #1: Type: text/plain, Size: 1134 bytes --]

No, the attached program does not return from the call to 
gsl_rng_uniform_int.

The problem shows up on my powermac running MacOSX 10.4.3 with gcc 
version powerpc-apple-darwin8-gcc-4.0.0 (GCC) 4.0.0 20041026 (Apple 
Computer, Inc. build 4061).
I compile GSL from CVS (20051208 = today) with with flags -NDEBUG -O2. 
The test program is compiled with a plain 'gcc -o rnd_test rnd_test.c 
-lgsl'.

Now, I also run the test program on a SuSE based machine, and the call 
fails with an floating point exception as expected.


Jari


Brian Gough wrote:
> Jari Häkkinen writes:
>  > According to the documentation gsl_rng_max(rng_ returns the maximum 
>  > random number the underlying rng can give, and 
>  > gsl_rng_uniform_int(rng,gsl_rng_max(rng)) will return 
>  > [0,gsl_rng_max(rng)-1]. This will not give the maximum number from the 
>  > underlying generator, so I tried 
>  > gsl_rng_uniform_int(rng,gsl_rng_max(rng)+1) but this call does not 
>  > return (since the second argument becomes 0).
> 
> Doesn't it give an exception? (due to division by zero in the expression scale = range / n;)
> What platform are you using?
> 

[-- Attachment #2: rnd_test.c --]
[-- Type: text/plain, Size: 280 bytes --]

#include <gsl/gsl_rng.h>

int main(const int argc,const char* argv[])
{
	gsl_rng_env_setup();
	gsl_rng* rng=gsl_rng_alloc(gsl_rng_default);
	unsigned long max=gsl_rng_max(rng);
	printf("%u\n",max+1);
	unsigned long number=gsl_rng_uniform_int(rng,max+1);
	printf("%u\n",number);
}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: rng question
  2005-12-07 10:25 rng question Jari Häkkinen
@ 2005-12-07 20:07 ` Brian Gough
  2005-12-09 13:49   ` Jari Häkkinen
  0 siblings, 1 reply; 11+ messages in thread
From: Brian Gough @ 2005-12-07 20:07 UTC (permalink / raw)
  To: Jari Häkkinen; +Cc: gsl-discuss

Jari Häkkinen writes:
 > According to the documentation gsl_rng_max(rng_ returns the maximum 
 > random number the underlying rng can give, and 
 > gsl_rng_uniform_int(rng,gsl_rng_max(rng)) will return 
 > [0,gsl_rng_max(rng)-1]. This will not give the maximum number from the 
 > underlying generator, so I tried 
 > gsl_rng_uniform_int(rng,gsl_rng_max(rng)+1) but this call does not 
 > return (since the second argument becomes 0).

Doesn't it give an exception? (due to division by zero in the expression scale = range / n;)
What platform are you using?

-- 
Brian Gough

Network Theory Ltd,
Publishing Free Software Manuals --- http://www.network-theory.co.uk/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* rng question
@ 2005-12-07 10:25 Jari Häkkinen
  2005-12-07 20:07 ` Brian Gough
  0 siblings, 1 reply; 11+ messages in thread
From: Jari Häkkinen @ 2005-12-07 10:25 UTC (permalink / raw)
  To: gsl-discuss

Hi all,

I have question about the random number generator features and 
documentation.

According to the documentation gsl_rng_max(rng_ returns the maximum 
random number the underlying rng can give, and 
gsl_rng_uniform_int(rng,gsl_rng_max(rng)) will return 
[0,gsl_rng_max(rng)-1]. This will not give the maximum number from the 
underlying generator, so I tried 
gsl_rng_uniform_int(rng,gsl_rng_max(rng)+1) but this call does not 
return (since the second argument becomes 0).

Shouldn't this case be covered by GSL? Either by returning a error code 
or maybe setting the upper bound to gsl_rng_max(rng) when a 0 second 
argumnet is given? At least, I think this case should be covered in the 
documentation.


Cheers,

Jari

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2008-08-18 21:12 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-08-03 23:32 RNG question Robert G. Brown
2008-08-05  9:05 ` Fabian Bastin
2008-08-08 11:12 ` Brian Gough
2008-08-08 12:33   ` Robert G. Brown
2008-08-11 20:47     ` Brian Gough
2008-08-18 21:12     ` Robert G. Brown
  -- strict thread matches above, loose matches on Subject: below --
2005-12-07 10:25 rng question Jari Häkkinen
2005-12-07 20:07 ` Brian Gough
2005-12-09 13:49   ` Jari Häkkinen
2005-12-17 16:50     ` Brian Gough
2005-12-17 23:58       ` Jari Häkkinen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).