From: Fernando Gont
To: libc-help@sourceware.org
Subject: getentropy() vs. getrandom() vs. arc4random()
Date: Wed, 15 Jun 2022 11:24:53 -0300

Hi,

I'm trying to understand the functional differences among the interfaces for generating pseudorandom numbers on different platforms, and I was wondering if you could shed some light on a few questions I have.
** Brief Background: **

We're working on a document that warns users about the security implications of using rand() and random() to generate pseudorandom numbers in scenarios where cryptographically secure pseudorandom numbers are needed. So we want to recommend better PRNG options for different operating systems.

For example, in the case of OpenBSD, we recommend arc4random(3), which provides a higher-level interface than the getentropy(2) system call. However, we're unsure what to recommend for Linux.

For the Linux case, I see a lot of code using getrandom(2), a syscall, which is kind of complex/too low-level. And I see that Linux also has the getentropy(3) library function, which random(7) describes as a "more portable interface" to the underlying PRNG devices.

So, for Linux, I feel tempted to recommend getentropy(3) over getrandom(2), but since most code employs getrandom(2), I'm not sure whether I'm missing something. Any thoughts?

As an aside, it seems that on OpenBSD getentropy(2) is a "low-level" syscall, while arc4random(3) is a high-level library function. On Linux, however, getentropy(3) is the high-level library function, while getrandom(2) is the low-level syscall, which means that usage of these interfaces would probably not be consistent across platforms. Is this actually the case?

FWIW, if you're curious, the document we're working on is this one: https://www.ietf.org/archive/id/draft-irtf-pearg-numeric-ids-generation-10.txt, and the section that led me to start this thread is Section 7.1:

----- cut here ----
7.1. Category #1: Uniqueness (soft failure)

The requirement of uniqueness with a soft failure severity can be complied with a Pseudo-Random Number Generator (PRNG).

   NOTE: Please see [RFC4086] regarding randomness requirements for security.
While most systems provide access to a PRNG, many of such PRNG implementations are not cryptographically secure, and therefore might be statistically biased or subject to adversarial influence. For example, ISO C [C11] rand(3) implementations are not cryptographically secure.

   NOTE: Section 7.1 ("Uniform Deviates") of [Press1992] discusses the underlying issues affecting ISO C [C11] rand(3) implementations.

On the other hand, a number of systems provide an interface to a Cryptographically Secure PRNG (CSPRNG) [RFC8937] [RFC4086], which guarantees high entropy, unpredictability, and good statistical distribution of the random values generated. For example, GNU/Linux's CSPRNG implementation is available via the getentropy(3) interface [GETENTROPY], while OpenBSD's CSPRNG implementation is available via the arc4random(3) and arc4random_uniform(3) interfaces [ARC4RANDOM]. Where available, these CSPRNGs should be preferred over e.g. POSIX [POSIX] random(3) or ISO C [C11] rand(3) implementations.

In scenarios where a CSPRNG is not readily available to select transient numeric identifiers of Category #1, a security and privacy assessment of employing a regular PRNG should be performed, supporting the implementation decision.

   NOTE: [Aumasson2018], [Press1992], and [Knuth1983], discuss theoretical and practical aspects of pseudorandom numbers generation, and provide guidance on how to evaluate PRNGs.

We note that since the premise is that collisions of transient numeric identifiers of this category only leads to soft failures, in many cases, the algorithm might not need to check the suitability of a selected identifier (i.e., the suitable_id() function, described below, could always return "true").
---- cut here ----

Thanks a lot in advance!

Regards,
-- 
Fernando Gont
e-mail: fernando@gont.com.ar
PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1
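
P.S. To make the comparison concrete, here is a rough sketch of how the two Linux interfaces mentioned above might be used from C, assuming glibc 2.25 or later (where, as far as I can tell, both getentropy(3) and the getrandom(2) wrapper are declared). The helper names fill_getrandom(), fill_getentropy(), and uniform_id() are hypothetical and only for illustration. uniform_id() uses rejection sampling to pick an unbiased value below a bound, similar in spirit to arc4random_uniform(3); it is not meant to reproduce any platform's actual implementation.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/random.h>  /* getrandom(2) wrapper (assumes glibc >= 2.25) */
#include <unistd.h>      /* getentropy(3) (assumes glibc >= 2.25) */

/* Hypothetical helper: fill buf using getrandom(2) with flags == 0.
 * The caller has to cope with EINTR and with short reads. */
static int fill_getrandom(void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len > 0) {
        ssize_t n = getrandom(p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: fill buf using getentropy(3).
 * Each call either fills the whole request or fails, but a single
 * request is limited to 256 bytes, so larger buffers are chunked. */
static int fill_getentropy(void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len > 0) {
        size_t chunk = len > 256 ? 256 : len;
        if (getentropy(p, chunk) != 0)
            return -1;
        p += chunk;
        len -= chunk;
    }
    return 0;
}

/* Hypothetical helper: store a uniformly distributed value in
 * [0, bound) in *out, using rejection sampling to avoid modulo bias. */
static int uniform_id(uint32_t bound, uint32_t *out)
{
    uint32_t min, r;

    if (bound == 0)
        return -1;

    /* 2^32 mod bound: candidates below this value are rejected so that
     * the accepted range is an exact multiple of bound. */
    min = (uint32_t)(-bound) % bound;

    do {
        if (getentropy(&r, sizeof r) != 0)
            return -1;
    } while (r < min);

    *out = r % bound;
    return 0;
}
```

With something like this, a transient identifier in the range 0..65535 could be selected with uniform_id(65536, &id). The sketch also shows the difference I was asking about: the getrandom(2) path needs a retry loop, while the getentropy(3) path only needs chunking for requests over 256 bytes.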