public inbox for cygwin@cygwin.com
 help / color / mirror / Atom feed
* rand is not ISO C compliant in Cygwin
@ 2023-11-10 20:19 Bruno Haible
  2023-11-10 21:39 ` Norton Allen
  2023-11-13 14:17 ` Corinna Vinschen
  0 siblings, 2 replies; 18+ messages in thread
From: Bruno Haible @ 2023-11-10 20:19 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 1175 bytes --]

ISO C 23 § 7.24.2.1 and 7.24.2.2 document how rand() and srand() are
expected to behave. In particular:
  "The srand function uses the argument as a seed for a new sequence
   of pseudo-random numbers to be returned by subsequent calls to rand.
   If srand is then called with the same seed value, the sequence of
   pseudo-random numbers shall be repeated. ...
   The srand function is not required to avoid data races with other
   calls to pseudo-random sequence generation functions. ..."

The two attached programs call srand() in one thread and then rand()
in another thread. There is no data race, since the second thread
is only created after the call to srand() has returned. The behaviour
in Cygwin is that the values in the second thread ignore the srand()
call done in the first thread.

How to reproduce the bug:

$ x86_64-pc-cygwin-gcc -Wall rand-in-posix-thread.c
$ ./a

or

$ x86_64-pc-cygwin-gcc -Wall rand-in-isoc-thread.c
$ ./a

Expected output:

Value from main thread:     1583559764
Value from separate thread: 1583559764

Actual output:

Value from main thread:     1583559764
Value from separate thread: 1481765933


[-- Attachment #2: rand-in-posix-thread.c --]
[-- Type: text/x-csrc, Size: 482 bytes --]

#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static void *
rand_invocator_thread (void *arg)
{
  printf ("Value from separate thread: %d\n", rand ());
  return NULL;
}

int
main ()
{
  unsigned int seed = 19891109;

  srand (seed);
  printf ("Value from main thread:     %d\n", rand ());
  srand (seed);
  pthread_t t;
  assert (pthread_create (&t, NULL, rand_invocator_thread, NULL) == 0);
  assert (pthread_join (t, NULL) == 0);

  return 0;
}

[-- Attachment #3: rand-in-isoc-thread.c --]
[-- Type: text/x-csrc, Size: 483 bytes --]

#include <assert.h>
#include <threads.h>
#include <stdio.h>
#include <stdlib.h>

static int
rand_invocator_thread (void *arg)
{
  printf ("Value from separate thread: %d\n", rand ());
  return 0;
}

int
main ()
{
  unsigned int seed = 19891109;

  srand (seed);
  printf ("Value from main thread:     %d\n", rand ());
  srand (seed);
  thrd_t t;
  assert (thrd_create (&t, rand_invocator_thread, NULL) == thrd_success);
  assert (thrd_join (t, NULL) == thrd_success);

  return 0;
}

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-10 20:19 rand is not ISO C compliant in Cygwin Bruno Haible
@ 2023-11-10 21:39 ` Norton Allen
  2023-11-10 22:27   ` Bruno Haible
  2023-11-13 14:17 ` Corinna Vinschen
  1 sibling, 1 reply; 18+ messages in thread
From: Norton Allen @ 2023-11-10 21:39 UTC (permalink / raw)
  To: Bruno Haible, cygwin

On 11/10/2023 3:19 PM, Bruno Haible via Cygwin wrote:
> ISO C 23 § 7.24.2.1 and 7.24.2.2 document how rand() and srand() are
> expected to behave. In particular:
>    "The srand function uses the argument as a seed for a new sequence
>     of pseudo-random numbers to be returned by subsequent calls to rand.
>     If srand is then called with the same seed value, the sequence of
>     pseudo-random numbers shall be repeated. ...
>     The srand function is not required to avoid data races with other
>     calls to pseudo-random sequence generation functions. ..."
>
> The two attached programs call srand() in one thread and then rand()
> in another thread. There is no data race, since the second thread
> is only created after the call to srand() has returned. The behaviour
> in Cygwin is that the values in the second thread ignore the srand()
> call done in the first thread.

Since the standard is trying to be precise, this reads to me as though 
Cygwin/(newlib?) has chosen to avoid race conditions by making 
pseudo-random sequences in different threads independent. Although the 
standard does not require this, it does not prohibit it either.


>
> How to reproduce the bug:
>
> $ x86_64-pc-cygwin-gcc -Wall rand-in-posix-thread.c
> $ ./a
>
> or
>
> $ x86_64-pc-cygwin-gcc -Wall rand-in-isoc-thread.c
> $ ./a
>
> Expected output:
>
> Value from main thread:     1583559764
> Value from separate thread: 1583559764
>
> Actual output:
>
> Value from main thread:     1583559764
> Value from separate thread: 1481765933
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-10 21:39 ` Norton Allen
@ 2023-11-10 22:27   ` Bruno Haible
  2023-11-11 16:50     ` Allen, Norton T.
  0 siblings, 1 reply; 18+ messages in thread
From: Bruno Haible @ 2023-11-10 22:27 UTC (permalink / raw)
  To: cygwin, Norton Allen

Norton Allen wrote:
> Cygwin/(newlib?) has chosen to avoid race conditions by making 
> pseudo-random sequences in different threads independent. Although the 
> standard does not require this, it does not prohibit it either.

I disagree. I cited the relevant sentences from the standard.

Other platforms (glibc, Android) avoid race conditions in rand()
by locking. Which is compliant with the standard.

Bruno




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-10 22:27   ` Bruno Haible
@ 2023-11-11 16:50     ` Allen, Norton T.
  2023-11-11 18:25       ` René Berber
  0 siblings, 1 reply; 18+ messages in thread
From: Allen, Norton T. @ 2023-11-11 16:50 UTC (permalink / raw)
  To: Bruno Haible, cygwin

[-- Attachment #1: Type: text/plain, Size: 1854 bytes --]

On 11/10/2023 5:27 PM, Bruno Haible wrote:
> Norton Allen wrote:
>> Cygwin/(newlib?) has chosen to avoid race conditions by making
>> pseudo-random sequences in different threads independent. Although the
>> standard does not require this, it does not prohibit it either.
> I disagree. I cited the relevant sentences from the standard.

This is the sentence you quoted that I am referring to:

> The srand function is not required to avoid data races with other
>     calls to pseudo-random sequence generation functions. ..." 
That is not the same as "... required never to avoid data races ...". 
"not required" means the sentence is not specifying--not requiring--any 
behavior, so you should not depend on the described behaviors.

>
> Other platforms (glibc, Android) avoid race conditions in rand()
> by locking. Which is compliant with the standard.

I'm not sure I understand what you're saying. Do they require the caller 
to implement their own race-avoidance approach via locking, or is it 
implemented within rand()?

As you point out, there is no race condition here, but it looks to me as 
though the Cygwin strategy chooses to avoid race conditions by making 
the rand() state be entirely private to each thread. As a result, the 
call in rand_invocator_thread() acts as though no call to srand() has 
occurred. You are suggesting that the thread should inherit the rand 
state from the parent, which would certainly make the threads subject to 
race conditions, even if the calls themselves were thread-safe. Which 
value your thread gets would depend on how fast it ran relative to the 
other threads. Unless there is language in the standard that says the 
states should be shared, I don't see how the Cygwin approach violates 
the standard.

That said, I am often wrong, and I'd be happy to have you point out my 
misunderstanding.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-11 16:50     ` Allen, Norton T.
@ 2023-11-11 18:25       ` René Berber
  2023-11-11 20:18         ` Allen, Norton T.
  0 siblings, 1 reply; 18+ messages in thread
From: René Berber @ 2023-11-11 18:25 UTC (permalink / raw)
  To: cygwin

On 11/11/2023 10:50 AM, Allen, Norton T. via Cygwin wrote:

[snip]
>> The srand function is not required to avoid data races with other
>>     calls to pseudo-random sequence generation functions. ..." 
> That is not the same as "... required never to avoid data races ...". 
> "not required" means the sentence is not specifying--not requiring--any 
> behavior, so you should not depend on the described behaviors.
[snip]

The elided part on Bruno's message is:

"The implementation shall behave as if no library function calls the
rand function."

Which is the point Bruno is making with the sample code.
-- 
R.Berber


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-11 18:25       ` René Berber
@ 2023-11-11 20:18         ` Allen, Norton T.
  0 siblings, 0 replies; 18+ messages in thread
From: Allen, Norton T. @ 2023-11-11 20:18 UTC (permalink / raw)
  To: René Berber, cygwin

On 11/11/2023 1:25 PM, René Berber via Cygwin wrote:
> On 11/11/2023 10:50 AM, Allen, Norton T. via Cygwin wrote:
>
> [snip]
>>> The srand function is not required to avoid data races with other
>>>     calls to pseudo-random sequence generation functions. ..." 
>> That is not the same as "... required never to avoid data races ...". 
>> "not required" means the sentence is not specifying--not 
>> requiring--any behavior, so you should not depend on the described 
>> behaviors.
> [snip]
>
> The elided part on Bruno's message is:
>
> "The implementation shall behave as if no library function calls the
> rand function."
>
> Which is the point Bruno is making with the sample code.

I would still assert that if the implementation is doing what I 
suggested (maintaining independent state for each thread) it would still 
meet that criteria, although I admit it is debatable. That said, the man 
page says:

"rand and srand are unsafe for multi-threaded applications. rand_r is 
thread-safe and should be used instead."

I read that as "all bets are off" if you are using them in a 
multi-threaded application.



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-10 20:19 rand is not ISO C compliant in Cygwin Bruno Haible
  2023-11-10 21:39 ` Norton Allen
@ 2023-11-13 14:17 ` Corinna Vinschen
  2023-11-13 14:25   ` Bruno Haible
  1 sibling, 1 reply; 18+ messages in thread
From: Corinna Vinschen @ 2023-11-13 14:17 UTC (permalink / raw)
  To: Bruno Haible, newlib; +Cc: cygwin


[redirecting to the newlib mailing list]

This is long-standing code in newlib, not actually inside Cygwin.

On Nov 10 21:19, Bruno Haible via Cygwin wrote:
> ISO C 23 § 7.24.2.1 and 7.24.2.2 document how rand() and srand() are
> expected to behave. In particular:
>   "The srand function uses the argument as a seed for a new sequence
>    of pseudo-random numbers to be returned by subsequent calls to rand.
>    If srand is then called with the same seed value, the sequence of
>    pseudo-random numbers shall be repeated. ...
>    The srand function is not required to avoid data races with other
>    calls to pseudo-random sequence generation functions. ..."
> 
> The two attached programs call srand() in one thread and then rand()
> in another thread. There is no data race, since the second thread
> is only created after the call to srand() has returned. The behaviour
> in Cygwin is that the values in the second thread ignore the srand()
> call done in the first thread.

Struct _reent is a kind of per-execution unit datastructure.  This
could be independent code blocks in bare-metal code, or threads in
Cygwin et al.  So the _reent struct also holds the seed state of the
rand and rand48 generators for a good reason.

But that's only half the picture, because newlib actually has two ways
of storing _reent data:  Either in a pre-thread struct _reent, or (if
_REENT_THREAD_LOCAL is defined) as per-member thread_local storage.

Theoretically, this could be easily fixed:

- Either we maintain a global struct _reent which could be used
  from rand().

- Or, in case of _REENT_THREAD_LOCAL, by making the rand48 data globally
  available as static data, rather than as thread_local data.

The rand() function would still not use locking but AFAICS that's
not actually required by POSIX or ISO C.

However, this is something I don't want to decide single-handedly,
so I'm asking how to go forward on the newlib ML.

As far as Cygwin alone is concerned, a patch like this one would suffice:

diff --git a/newlib/libc/stdlib/rand.c b/newlib/libc/stdlib/rand.c
index ba9cc80f2b21..2b48e7a725b1 100644
--- a/newlib/libc/stdlib/rand.c
+++ b/newlib/libc/stdlib/rand.c
@@ -58,6 +58,12 @@ on two different systems.
 #include <stdlib.h>
 #include <reent.h>
 
+#ifdef __CYGWIN__
+#define _RAND_REENT	_GLOBAL_REENT
+#else
+#define _RAND_REENT	_REENT
+#endif
+
 #ifdef _REENT_THREAD_LOCAL
 _Thread_local unsigned long long _tls_rand_next = 1;
 #endif
@@ -65,7 +71,7 @@ _Thread_local unsigned long long _tls_rand_next = 1;
 void
 srand (unsigned int seed)
 {
-  struct _reent *reent = _REENT;
+  struct _reent *reent = _RAND_REENT;
 
   _REENT_CHECK_RAND48(reent);
   _REENT_RAND_NEXT(reent) = seed;
@@ -74,7 +80,7 @@ srand (unsigned int seed)
 int
 rand (void)
 {
-  struct _reent *reent = _REENT;
+  struct _reent *reent = _RAND_REENT;
 
   /* This multiplier was obtained from Knuth, D.E., "The Art of
      Computer Programming," Vol 2, Seminumerical Algorithms, Third




> 
> How to reproduce the bug:
> 
> $ x86_64-pc-cygwin-gcc -Wall rand-in-posix-thread.c
> $ ./a
> 
> or
> 
> $ x86_64-pc-cygwin-gcc -Wall rand-in-isoc-thread.c
> $ ./a
> 
> Expected output:
> 
> Value from main thread:     1583559764
> Value from separate thread: 1583559764
> 
> Actual output:
> 
> Value from main thread:     1583559764
> Value from separate thread: 1481765933
> 

> #include <assert.h>
> #include <pthread.h>
> #include <stdio.h>
> #include <stdlib.h>
> 
> static void *
> rand_invocator_thread (void *arg)
> {
>   printf ("Value from separate thread: %d\n", rand ());
>   return NULL;
> }
> 
> int
> main ()
> {
>   unsigned int seed = 19891109;
> 
>   srand (seed);
>   printf ("Value from main thread:     %d\n", rand ());
>   srand (seed);
>   pthread_t t;
>   assert (pthread_create (&t, NULL, rand_invocator_thread, NULL) == 0);
>   assert (pthread_join (t, NULL) == 0);
> 
>   return 0;
> }

> #include <assert.h>
> #include <threads.h>
> #include <stdio.h>
> #include <stdlib.h>
> 
> static int
> rand_invocator_thread (void *arg)
> {
>   printf ("Value from separate thread: %d\n", rand ());
>   return 0;
> }
> 
> int
> main ()
> {
>   unsigned int seed = 19891109;
> 
>   srand (seed);
>   printf ("Value from main thread:     %d\n", rand ());
>   srand (seed);
>   thrd_t t;
>   assert (thrd_create (&t, rand_invocator_thread, NULL) == thrd_success);
>   assert (thrd_join (t, NULL) == thrd_success);
> 
>   return 0;
> }

> 
> -- 
> Problem reports:      https://cygwin.com/problems.html
> FAQ:                  https://cygwin.com/faq/
> Documentation:        https://cygwin.com/docs.html
> Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-13 14:17 ` Corinna Vinschen
@ 2023-11-13 14:25   ` Bruno Haible
  2023-11-13 14:38     ` Corinna Vinschen
  0 siblings, 1 reply; 18+ messages in thread
From: Bruno Haible @ 2023-11-13 14:25 UTC (permalink / raw)
  To: Bruno Haible, newlib, cygwin

Corinna Vinschen wrote:
> The rand() function would still not use locking but AFAICS that's
> not actually required by POSIX or ISO C.

Correct. Those who want an MT-safe rand-like function need to use random(),
not rand().

Bruno




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-13 14:25   ` Bruno Haible
@ 2023-11-13 14:38     ` Corinna Vinschen
  2023-11-13 16:21       ` Corinna Vinschen
  0 siblings, 1 reply; 18+ messages in thread
From: Corinna Vinschen @ 2023-11-13 14:38 UTC (permalink / raw)
  To: Bruno Haible; +Cc: newlib, cygwin

On Nov 13 15:25, Bruno Haible wrote:
> Corinna Vinschen wrote:
> > The rand() function would still not use locking but AFAICS that's
> > not actually required by POSIX or ISO C.
> 
> Correct. Those who want an MT-safe rand-like function need to use random(),
> not rand().

FTR, we have to differ here between plain newlib targets and Cygwin.

While Cygwin comes with it's own, table-based random() implementation
taken from BSD back in 2007, newlib's random is basically equivalent to
rand() for the sake of bare-metal and other targets with high memory
pressure.  We shouldn't change the latter since multi-threading
compliance is usually not much of a problem for that target audience.


Corinna


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-13 14:38     ` Corinna Vinschen
@ 2023-11-13 16:21       ` Corinna Vinschen
  2023-11-13 16:44         ` [EXTERNAL] " Lavrentiev, Anton (NIH/NLM/NCBI) [C]
  2023-11-13 21:33         ` Bruno Haible
  0 siblings, 2 replies; 18+ messages in thread
From: Corinna Vinschen @ 2023-11-13 16:21 UTC (permalink / raw)
  To: Bruno Haible; +Cc: newlib, cygwin

On Nov 13 15:38, Corinna Vinschen wrote:
> On Nov 13 15:25, Bruno Haible wrote:
> > Corinna Vinschen wrote:
> > > The rand() function would still not use locking but AFAICS that's
> > > not actually required by POSIX or ISO C.
> > 
> > Correct. Those who want an MT-safe rand-like function need to use random(),
> > not rand().

I took a look into POSIX and I'm a bit puzzled now.  From
https://pubs.opengroup.org/onlinepubs/9699919799/functions/rand.html

RATIONAL

  The ISO C standard rand() and srand() functions allow per-process
                                                  ^^^^^ (not requires)

  pseudo-random streams shared by all threads. Those two functions need
  not change, but there has to be mutual-exclusion that prevents
  interference between two threads concurrently accessing the random
  number generator.

Ok, so, *iff* rand/srand share per-process state, then they have to
use locking to prevent MT interference.  POSIX continues:

  With regard to rand(), there are two different behaviors that may be
  wanted in a multi-threaded program:

  1. A single per-process sequence of pseudo-random numbers that is
     shared by all threads that call rand()

  2. A different sequence of pseudo-random numbers for each thread that
     calls rand()
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I read this as the newlib technique being one way of correctly
implementing rand/srand, no?


Corinna


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [EXTERNAL] Re: rand is not ISO C compliant in Cygwin
  2023-11-13 16:21       ` Corinna Vinschen
@ 2023-11-13 16:44         ` Lavrentiev, Anton (NIH/NLM/NCBI) [C]
  2023-11-13 21:33         ` Bruno Haible
  1 sibling, 0 replies; 18+ messages in thread
From: Lavrentiev, Anton (NIH/NLM/NCBI) [C] @ 2023-11-13 16:44 UTC (permalink / raw)
  To: newlib; +Cc: Corinna Vinschen, cygwin

IMHO:

>   2. A different sequence

The word "different" in this context is ambiguous: is it "unrelated" as a generator, or is it "not the same" sequence of the actual numbers?

> I read this as the newlib technique being one way of correctly implementing rand/srand, no?

If the first, then yes; but if the second, then no.

The problem with the first approach is, however, is the inability to adequately randomize your code (e.g. for testing).

You call srand() in the main() thread, and then spawn threads thinking they will inherit the randomization; but in fact, they all start off the same number sequence, regardless.

Anton Lavrentiev
Contractor NIH/NLM/NCBI


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-13 16:21       ` Corinna Vinschen
  2023-11-13 16:44         ` [EXTERNAL] " Lavrentiev, Anton (NIH/NLM/NCBI) [C]
@ 2023-11-13 21:33         ` Bruno Haible
  2023-11-13 22:14           ` Glenn Strauss
  2023-11-14 10:11           ` Corinna Vinschen
  1 sibling, 2 replies; 18+ messages in thread
From: Bruno Haible @ 2023-11-13 21:33 UTC (permalink / raw)
  To: newlib, cygwin

Corinna Vinschen wrote:
> I took a look into POSIX and I'm a bit puzzled now.  From
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/rand.html

Part of the confusion is that POSIX and ISO C have slightly different
wording. This POSIX page says:
   "The functionality described on this reference page is aligned
    with the ISO C standard. Any conflict between the requirements
    described here and the ISO C standard is unintentional. This
    volume of POSIX.1-2017 defers to the ISO C standard."

In ISO C 99 § 7.20.2, the only relevant sentence is:

  "The srand function uses the argument as a seed for a new sequence
   of pseudo-random numbers to be returned by subsequent calls to rand.
   If srand is then called with the same seed value, the sequence of
   pseudo-random numbers shall be repeated."

In ISO C 11 § 7.22.2 and ISO C 17 § 7.22.2, additionally two sentences
were inserted:

  "The rand function is not required to avoid data races with other
   calls to pseudo-random sequence generation functions."

  "The srand function is not required to avoid data races with other
   calls to pseudo-random sequence generation functions."

ISO C 23 (which is still is draft state, but compared to the draft
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3054.pdf I cannot
see any change regarding rand() in the changes summary
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3148.doc) has the
same wording.

POSIX does not have these two sentences, but instead has:

  "The rand() function need not be thread-safe."

> RATIONAL
> 
>   The ISO C standard rand() and srand() functions allow per-process
>                                                   ^^^^^ (not requires)
> 
>   pseudo-random streams shared by all threads.

Indeed, "requires" would fit better here, IMO, because the texts of
both ISO C and POSIX have multithreading in mind and still talk about
"subsequent calls to rand" — which makes a reference to time, but not
to threads.

> Ok, so, *iff* rand/srand share per-process state, then they have to
> use locking to prevent MT interference.

... if the implementor wants to prevent MT interference (which both
ISO C and POSIX allows).

> POSIX continues:
> 
>   With regard to rand(), there are two different behaviors that may be
>   wanted in a multi-threaded program:
> 
>   1. A single per-process sequence of pseudo-random numbers that is
>      shared by all threads that call rand()
> 
>   2. A different sequence of pseudo-random numbers for each thread that
>      calls rand()
>     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This paragraph continues after the two items:
   "This is provided by the modified thread-safe function based on whether
    the seed value is global to the entire process or local to each thread."

My understanding of this paragraph is:
  - If an application wants 1., they can use rand_r with SEED pointing
    to a global variable.
  - If an application wants 2., they can use rand_r with SEED pointing
    to a per-thread variable.

> I read this as the newlib technique being one way of correctly
> implementing rand/srand, no?

I don't think so. The critical sentence is the one with
"subsequent calls to rand".

Bruno




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-13 21:33         ` Bruno Haible
@ 2023-11-13 22:14           ` Glenn Strauss
  2023-11-14 10:20             ` Corinna Vinschen
  2023-11-14 10:11           ` Corinna Vinschen
  1 sibling, 1 reply; 18+ messages in thread
From: Glenn Strauss @ 2023-11-13 22:14 UTC (permalink / raw)
  To: Bruno Haible; +Cc: newlib, cygwin

On Mon, Nov 13, 2023 at 10:33:48PM +0100, Bruno Haible via Cygwin wrote:
> Corinna Vinschen wrote:
> > I took a look into POSIX and I'm a bit puzzled now.  From
> > https://pubs.opengroup.org/onlinepubs/9699919799/functions/rand.html
> 
> Part of the confusion is that POSIX and ISO C have slightly different
> wording. This POSIX page says:
>    "The functionality described on this reference page is aligned
>     with the ISO C standard. Any conflict between the requirements
>     described here and the ISO C standard is unintentional. This
>     volume of POSIX.1-2017 defers to the ISO C standard."
> 
> In ISO C 99 § 7.20.2, the only relevant sentence is:
> 
>   "The srand function uses the argument as a seed for a new sequence
>    of pseudo-random numbers to be returned by subsequent calls to rand.
>    If srand is then called with the same seed value, the sequence of
>    pseudo-random numbers shall be repeated."
> 
> In ISO C 11 § 7.22.2 and ISO C 17 § 7.22.2, additionally two sentences
> were inserted:
> 
>   "The rand function is not required to avoid data races with other
>    calls to pseudo-random sequence generation functions."
> 
>   "The srand function is not required to avoid data races with other
>    calls to pseudo-random sequence generation functions."
> 
> ISO C 23 (which is still is draft state, but compared to the draft
> https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3054.pdf I cannot
> see any change regarding rand() in the changes summary
> https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3148.doc) has the
> same wording.
> 
> POSIX does not have these two sentences, but instead has:
> 
>   "The rand() function need not be thread-safe."

I read the above as requiring *reentrancy*, but not *thread-safety*.

If multiple threads are accessing rand() and rand() accesses global
state, the global state should not get corrupted; the sequence
generated by rand() should remain intact.  Since thread-safety is not
guaranteed, is it theoretically possible that multiple threads enter
rand() at nearly the same time and both get the *same* random value in
the sequence.  (Whether or not that is undesirable behavior is
application-specific.)  A mutex can avoid this theoretical duplication,
as can using thread-local state (with difference seeds) instead of
global state.  If the seed value is the same in multiple threads using
thread-local state, the sequence of random values generated in each
thread will be repeated in each thread.  This may be surprising behavior
to some when srand() is called, then multiple threads spawned, and each
thread subsequently gets the same sequence of values from rand().

If the global state is a single item and atomics can be used to access
and update the global state, then an implemntation can use atomics
instead of mutexes to achieve thread-safety, which is allowed by the
standards, but not required for rand().

Cheers, Glenn

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-13 21:33         ` Bruno Haible
  2023-11-13 22:14           ` Glenn Strauss
@ 2023-11-14 10:11           ` Corinna Vinschen
  2023-11-14 11:52             ` Bruno Haible
  1 sibling, 1 reply; 18+ messages in thread
From: Corinna Vinschen @ 2023-11-14 10:11 UTC (permalink / raw)
  To: Bruno Haible; +Cc: newlib, cygwin

Hi Bruno,

On Nov 13 22:33, Bruno Haible via Cygwin wrote:
> Corinna Vinschen wrote:
> > https://pubs.opengroup.org/onlinepubs/9699919799/functions/rand.html
> > [...]
> >   With regard to rand(), there are two different behaviors that may be
> >   wanted in a multi-threaded program:
> > 
> >   1. A single per-process sequence of pseudo-random numbers that is
> >      shared by all threads that call rand()
> > 
> >   2. A different sequence of pseudo-random numbers for each thread that
> >      calls rand()
> >     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> This paragraph continues after the two items:
>    "This is provided by the modified thread-safe function based on whether
>     the seed value is global to the entire process or local to each thread."
> 
> My understanding of this paragraph is:
>   - If an application wants 1., they can use rand_r with SEED pointing
>     to a global variable.
>   - If an application wants 2., they can use rand_r with SEED pointing
>     to a per-thread variable.

The problem I have with bringing rand_r() into the picture at this point
is two-fold:

- The paragraph explicitely states "With regard to rand() ..."

- rand_r() is obsolescent and may be removed in a future version.

The rational section is entirely dedicated to the base functions
rand()/srand() and doesn't mention rand_r() even once.  I don't see
that the vague expression "the modified thread-safe function" is really
meant to be rand_r(), or rather rand() after an implementation decides
to make rand() thread-safe.

> > I read this as the newlib technique being one way of correctly
> > implementing rand/srand, no?
> 
> I don't think so. The critical sentence is the one with
> "subsequent calls to rand".

I see what you mean.  However, what sense is there in providing a global
state, while at the same time rand() doesn't need to be thread-safe.  In
the end, if you call srand() once and then run rand() in concurrent
threads, the implementation has no control over the sequences generated
per-thread, unless your application threads will sync the calls explicitely.

We have a potential patch to align rand/srand to your interpretation,
at least for Cygwin if nobody else in the newlib community chimes in.
It's just that, personally, I'm not yet convinced that this is the only
possible interpretation.

Sigh... yet another case of unnecessary vagueness in the standards...


Thanks,
Corinna

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-13 22:14           ` Glenn Strauss
@ 2023-11-14 10:20             ` Corinna Vinschen
  0 siblings, 0 replies; 18+ messages in thread
From: Corinna Vinschen @ 2023-11-14 10:20 UTC (permalink / raw)
  To: Glenn Strauss; +Cc: Bruno Haible, newlib, cygwin

On Nov 13 17:14, Glenn Strauss wrote:
> On Mon, Nov 13, 2023 at 10:33:48PM +0100, Bruno Haible via Cygwin wrote:
> > POSIX does not have these two sentences, but instead has:
> > 
> >   "The rand() function need not be thread-safe."
> 
> I read the above as requiring *reentrancy*, but not *thread-safety*.
> 
> If multiple threads are accessing rand() and rand() accesses global
> state, the global state should not get corrupted; the sequence
> generated by rand() should remain intact.  Since thread-safety is not
> guaranteed, is it theoretically possible that multiple threads enter
> rand() at nearly the same time and both get the *same* random value in
> the sequence.  (Whether or not that is undesirable behavior is
> application-specific.)  A mutex can avoid this theoretical duplication,
> as can using thread-local state (with difference seeds) instead of
> global state.  If the seed value is the same in multiple threads using
> thread-local state, the sequence of random values generated in each
> thread will be repeated in each thread.  This may be surprising behavior
> to some when srand() is called, then multiple threads spawned, and each
> thread subsequently gets the same sequence of values from rand().

That's a good common sense argument for changing rand() to use a global
state.


Corinna


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-14 10:11           ` Corinna Vinschen
@ 2023-11-14 11:52             ` Bruno Haible
  2023-11-14 16:59               ` Corinna Vinschen
  0 siblings, 1 reply; 18+ messages in thread
From: Bruno Haible @ 2023-11-14 11:52 UTC (permalink / raw)
  To: Bruno Haible, newlib, cygwin

Corinna Vinschen wrote:
> > My understanding of this paragraph is:
> >   - If an application wants 1., they can use rand_r with SEED pointing
> >     to a global variable.
> >   - If an application wants 2., they can use rand_r with SEED pointing
> >     to a per-thread variable.
> 
> The problem I have with bringing rand_r() into the picture at this point
> is two-fold:
> 
> - The paragraph explicitely states "With regard to rand() ..."
> 
> - rand_r() is obsolescent and may be removed in a future version.

This paragraph is not directed at the libc implementor; it is directed at
the application programmer. Because
  - Otherwise the phrase "may be wanted in a multi-threaded program"
    would not make sense.
  - The goals "1. A single per-process sequence of pseudo-random numbers"
    and "2. A different sequence of pseudo-random numbers for each thread"
    cannot be achieved at the same time, since rand() takes no argument.

> I don't see that the vague expression "the modified thread-safe function"
> is really meant to be rand_r(), or rather rand() after an implementation decides
> to make rand() thread-safe.

It must mean rand_r(), because rand() cannot satisfy both goals at the
same time.

> However, what sense is there in providing a global
> state, while at the same time rand() doesn't need to be thread-safe.  In
> the end, if you call srand() once and then run rand() in concurrent
> threads, the implementation has no control over the sequences generated
> per-thread, unless your application threads will sync the calls explicitely.

The sense is reproducibility across threads. The standards comittee
probably wanted as little thread-specific verbiage at this place.

The two sentences added in ISO C 11 mean that the application needs to
sync their threads by itself, yes. But that's not uncommon: Every use of,
say, a global linked-list or hash table object accessed by multiple threads
needs the same precaution.

Bruno




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-14 11:52             ` Bruno Haible
@ 2023-11-14 16:59               ` Corinna Vinschen
  2023-11-14 18:06                 ` Bruno Haible
  0 siblings, 1 reply; 18+ messages in thread
From: Corinna Vinschen @ 2023-11-14 16:59 UTC (permalink / raw)
  To: Bruno Haible; +Cc: newlib, cygwin

On Nov 14 12:52, Bruno Haible wrote:
> Corinna Vinschen wrote:
> > > My understanding of this paragraph is:
> > >   - If an application wants 1., they can use rand_r with SEED pointing
> > >     to a global variable.
> > >   - If an application wants 2., they can use rand_r with SEED pointing
> > >     to a per-thread variable.
> > 
> > The problem I have with bringing rand_r() into the picture at this point
> > is two-fold:
> > 
> > - The paragraph explicitely states "With regard to rand() ..."
> > 
> > - rand_r() is obsolescent and may be removed in a future version.
> 
> This paragraph is not directed at the libc implementor; it is directed at
> the application programmer. Because
>   - Otherwise the phrase "may be wanted in a multi-threaded program"
>     would not make sense.
>   - The goals "1. A single per-process sequence of pseudo-random numbers"
>     and "2. A different sequence of pseudo-random numbers for each thread"
>     cannot be achieved at the same time, since rand() takes no argument.
> 
> > I don't see that the vague expression "the modified thread-safe function"
> > is really meant to be rand_r(), or rather rand() after an implementation decides
> > to make rand() thread-safe.
> 
> It must mean rand_r(), because rand() cannot satisfy both goals at the
> same time.
> 
> > However, what sense is there in providing a global
> > state, while at the same time rand() doesn't need to be thread-safe.  In
> > the end, if you call srand() once and then run rand() in concurrent
> > threads, the implementation has no control over the sequences generated
> > per-thread, unless your application threads will sync the calls explicitely.
> 
> The sense is reproducibility across threads. The standards comittee
> probably wanted as little thread-specific verbiage at this place.
> 
> The two sentences added in ISO C 11 mean that the application needs to
> sync their threads by itself, yes. But that's not uncommon: Every use of,
> say, a global linked-list or hash table object accessed by multiple threads
> needs the same precaution.

Ok, I pushed a Cygwin-only patch for now, which calls Cygwin's random(3)
functions to implement rand(3), just as GLibC does it.

Since that's entirely inside the Cygwin codebase, other newlib targets
are not affected right now.  They still have a rand(3) PRNG per execution
unit using either struct _reent, or the thread_local implementation.


Corinna


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: rand is not ISO C compliant in Cygwin
  2023-11-14 16:59               ` Corinna Vinschen
@ 2023-11-14 18:06                 ` Bruno Haible
  0 siblings, 0 replies; 18+ messages in thread
From: Bruno Haible @ 2023-11-14 18:06 UTC (permalink / raw)
  To: cygwin

Corinna Vinschen wrote:
> Ok, I pushed a Cygwin-only patch for now, which calls Cygwin's random(3)
> functions to implement rand(3), just as GLibC does it.

Thanks!




^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-11-14 18:06 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-10 20:19 rand is not ISO C compliant in Cygwin Bruno Haible
2023-11-10 21:39 ` Norton Allen
2023-11-10 22:27   ` Bruno Haible
2023-11-11 16:50     ` Allen, Norton T.
2023-11-11 18:25       ` René Berber
2023-11-11 20:18         ` Allen, Norton T.
2023-11-13 14:17 ` Corinna Vinschen
2023-11-13 14:25   ` Bruno Haible
2023-11-13 14:38     ` Corinna Vinschen
2023-11-13 16:21       ` Corinna Vinschen
2023-11-13 16:44         ` [EXTERNAL] " Lavrentiev, Anton (NIH/NLM/NCBI) [C]
2023-11-13 21:33         ` Bruno Haible
2023-11-13 22:14           ` Glenn Strauss
2023-11-14 10:20             ` Corinna Vinschen
2023-11-14 10:11           ` Corinna Vinschen
2023-11-14 11:52             ` Bruno Haible
2023-11-14 16:59               ` Corinna Vinschen
2023-11-14 18:06                 ` Bruno Haible

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).