Re: getentropy() vs. getrandom() vs. arc4random()

From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Fernando Gont <fernando@gont.com.ar>
Cc: Libc-help <libc-help@sourceware.org>
Subject: Re: getentropy() vs. getrandom() vs. arc4random()
Date: Wed, 15 Jun 2022 11:00:58 -0700
Message-ID: <AB83CF9E-34F7-4805-986F-9404C71B1986@linaro.org>
In-Reply-To: <f0430062-8001-525a-70dd-199843ab627a@gont.com.ar>



> On 15 Jun 2022, at 07:24, Fernando Gont <fernando@gont.com.ar> wrote:
> 
> Hi,
> 
> I'm currently trying to grasp the functional differences in the different interfaces to generate pseudorandom numbers in different platforms. And I was wondering if you could shed some light on some questions I have.
> 
> 
> ** Brief Background: **
> 
> We're working on a document where we warn users about the security implications of using rand() and random() to generate pseudorandom numbers (in scenarios where cryptographically secure pseudorandom numbers are needed).
> 
> So we want to recommend better PRNG options for different operating systems. For example, in the case of OpenBSD, we recommend the use of arc4random(3), which provides a higher-level interface than the getentropy(2) system call.
> 
> However, we're unsure about what to recommend for the Linux case.
> 
> For the Linux case, I see that there's a lot of code using getrandom(2), a syscall, which is kind of complex/too-low-level. And I see that Linux also has the getentropy(3) library function, which is described in random(7) as a "more portable interface the underlying PRNG devices".
> 
> So, for the Linux case, I feel tempted to recommend the usage of getentropy(3) over getrandom(2), but since most code employs getrandom(2), I'm not sure whether I'm missing something.
> 
> Any thoughts?
> 
> Aside, it seems that for OpenBSD, getentropy(2) is a "low-level" syscall, while arc4random(3) is a high-level library function. But in the case of Linux, getentropy(3) is a high-level library function instead, while getrandom(2) is the low-level syscall, which means that usage of these interfaces would probably not be consistent across platforms.
> 
> Is this actually the case?

On glibc, getentropy and getrandom both end up calling the getrandom syscall,
although possibly with different flags. getentropy calls getrandom with no flags
set, which draws from the same source as /dev/urandom. The getrandom function
lets the caller select the /dev/random source instead through the GRND_RANDOM
flag.
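
To make the flag difference concrete, here is a minimal sketch, assuming
glibc 2.25 or later (where <sys/random.h> exists). fill_random and
make_key are hypothetical helper names, not libc functions.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>   /* getrandom(), GRND_RANDOM, GRND_NONBLOCK */
#include <sys/types.h>    /* ssize_t */
#include <unistd.h>       /* getentropy() */

/* Fill buf with len bytes via getrandom(), retrying on EINTR and on
   short reads.  flags is passed through unchanged: 0 selects the
   urandom source, GRND_RANDOM selects the random source.
   Returns 0 on success, -1 on error with errno set.  */
static int
fill_random (void *buf, size_t len, unsigned int flags)
{
  unsigned char *p = buf;
  while (len > 0)
    {
      ssize_t n = getrandom (p, len, flags);
      if (n < 0)
        {
          if (errno == EINTR)
            continue;
          return -1;
        }
      p += n;
      len -= (size_t) n;
    }
  return 0;
}

/* The same 32-byte request through getentropy(): there is no flags
   argument, and the call either fills the whole buffer or fails.  */
static int
make_key (unsigned char key[32])
{
  return getentropy (key, 32);
}
```

With these, fill_random (key, 32, 0) and make_key (key) should be
roughly equivalent, while fill_random (key, 32, GRND_RANDOM) asks for
the other source.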

Also, getentropy currently has a hard limit of 256 bytes per call, and it is not
defined as a cancellation point (so pthread_cancel does not act upon it).
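
If more than 256 bytes are needed through getentropy, the request has
to be split. A minimal sketch follows; getentropy_large is a
hypothetical helper name, and the 256-byte chunk size is the limit
mentioned above.

```c
#include <stddef.h>
#include <unistd.h>   /* getentropy() */

/* getentropy() fails for requests larger than 256 bytes, so fill
   larger buffers in chunks of at most 256 bytes.
   Returns 0 on success, -1 on error with errno set by getentropy().  */
static int
getentropy_large (void *buf, size_t len)
{
  unsigned char *p = buf;
  while (len > 0)
    {
      size_t chunk = len < 256 ? len : 256;
      if (getentropy (p, chunk) != 0)
        return -1;
      p += chunk;
      len -= chunk;
    }
  return 0;
}
```

Calling getrandom directly avoids the chunking, at the cost of having
to handle short reads and EINTR in the caller.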

So both functions draw entropy directly from the kernel, and with recent Linux
random number development unifying random and urandom, the difference
might end up being just that getrandom is a cancellation point while
getentropy is not.

The rand and random functions are both userspace-only: the caller is expected
to seed the PRNG state, and both return predictable output determined by the
initial seed. On glibc both are implemented with either an LCG (linear
congruential generator) or a polynomial-based generator, selected by the size
of the state. So the quality of the output depends on the seed entropy and the
limitations of the polynomial used.
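
The predictability is easy to demonstrate using only ISO C. In the
sketch below, same_seed_same_stream is a hypothetical helper name.

```c
#include <stdlib.h>   /* srand(), rand() */

/* Return 1 if two rand() streams seeded with the same value produce
   identical output for the first n draws (n is capped at 64), and 0
   otherwise.  */
static int
same_seed_same_stream (unsigned int seed, int n)
{
  int first[64];
  if (n > 64)
    n = 64;

  srand (seed);
  for (int i = 0; i < n; i++)
    first[i] = rand ();

  /* Re-seeding with the same value restarts the same sequence.  */
  srand (seed);
  for (int i = 0; i < n; i++)
    if (rand () != first[i])
      return 0;
  return 1;
}
```

Anyone who learns or guesses the seed can therefore reproduce every
value the program will generate.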

arc4random is similar to getentropy and getrandom, but it uses kernel
entropy to initialize a userspace PRNG. Also, the usual implementations that
use ChaCha20 (OpenBSD, FreeBSD) periodically reseed from kernel entropy to
improve randomness. arc4random also provides some additional guarantees, such
as fork detection.
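
For code that has to build on several platforms, a wrapper along these
lines may help. It is only a sketch under stated assumptions: that
arc4random is declared in <stdlib.h> on the BSDs and on glibc 2.36 or
later, and that older glibc has getrandom (2.25 or later). random_u32
and NO_ARC4RANDOM are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>   /* arc4random() where the libc provides it */

/* Assumption: glibc older than 2.36 does not provide arc4random().  */
#if defined(__GLIBC__)
# if !__GLIBC_PREREQ (2, 36)
#  define NO_ARC4RANDOM 1
# endif
#endif

#ifdef NO_ARC4RANDOM
# include <sys/random.h>   /* getrandom() */
# include <sys/types.h>    /* ssize_t */
#endif

/* Return 32 bits from the best interface available at build time.  */
static uint32_t
random_u32 (void)
{
#ifdef NO_ARC4RANDOM
  uint32_t v;
  /* Fallback: ask the kernel directly; give up on any failure.  */
  if (getrandom (&v, sizeof v, 0) != (ssize_t) sizeof v)
    abort ();
  return v;
#else
  return arc4random ();
#endif
}
```

Note that arc4random has no error return, which is part of what makes
it the simpler interface to recommend to application authors.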

We are aiming to provide arc4random in an upcoming glibc version [1].

[1] https://patchwork.sourceware.org/project/glibc/list/?series=9540

> 
> 
> FWIW, if you're curious, the document we're working on is this one: https://www.ietf.org/archive/id/draft-irtf-pearg-numeric-ids-generation-10.txt, and the section that led me to start this thread is Section 7.1:
> 
> ----- cut here ----
> 7.1.  Category #1: Uniqueness (soft failure)
> 
>   The requirement of uniqueness with a soft failure severity can be
>   complied with a Pseudo-Random Number Generator (PRNG).
> 
>   NOTE:
>      Please see [RFC4086] regarding randomness requirements for
>      security.
> 
>   While most systems provide access to a PRNG, many of such PRNG
>   implementations are not cryptographically secure, and therefore might
>   be statistically biased or subject to adversarial influence.  For
>   example, ISO C [C11] rand(3) implementations are not
>   cryptographically secure.
> 
>   NOTE:
>      Section 7.1 ("Uniform Deviates") of [Press1992] discusses the
>      underlying issues affecting ISO C [C11] rand(3) implementations.
> 
>   On the other hand, a number of systems provide an interface to a
>   Cryptographically Secure PRNG (CSPRNG) [RFC8937] [RFC4086], which
>   guarantees high entropy, unpredictability, and good statistical
>   distribution of the random values generated.  For example, GNU/
>   Linux's CSPRNG implementation is available via the getentropy(3)
>   interface [GETENTROPY], while OpenBSD's CSPRNG implementation is
>   available via the arc4random(3) and arc4random_uniform(3) interfaces
>   [ARC4RANDOM].  Where available, these CSPRNGs should be preferred
>   over e.g.  POSIX [POSIX] random(3) or ISO C [C11] rand(3)
>   implementations.
> 
>   In scenarios where a CSPRNG is not readily available to select
>   transient numeric identifiers of Category #1, a security and privacy
>   assessment of employing a regular PRNG should be performed,
>   supporting the implementation decision.
> 
>   NOTE:
>      [Aumasson2018], [Press1992], and [Knuth1983], discuss theoretical
>      and practical aspects of pseudorandom numbers generation, and
>      provide guidance on how to evaluate PRNGs.
> 
>   We note that since the premise is that collisions of transient
>   numeric identifiers of this category only leads to soft failures, in
>   many cases, the algorithm might not need to check the suitability of
>   a selected identifier (i.e., the suitable_id() function, described
>   below, could always return "true").
> ---- cut here ----
> 
> Thanks a lot in advance!
> 
> Regards,
> -- 
> Fernando Gont
> e-mail: fernando@gont.com.ar
> PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1
