From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp4-g21.free.fr (smtp4-g21.free.fr [IPv6:2a01:e0c:1:1599::13]) by sourceware.org (Postfix) with ESMTPS id CB1033959CAE for ; Thu, 2 Jun 2022 09:44:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org CB1033959CAE Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=opteya.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=opteya.com Received: from [IPV6:2a01:e35:39f2:1220:9f4d:1e36:2cff:ddf9] (unknown [IPv6:2a01:e35:39f2:1220:9f4d:1e36:2cff:ddf9]) by smtp4-g21.free.fr (Postfix) with ESMTPS id BF16219F5A1; Thu, 2 Jun 2022 11:44:29 +0200 (CEST) Message-ID: Date: Thu, 2 Jun 2022 11:44:29 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1 Subject: Re: [PATCH v6 01/10] stdlib: Add arc4random, arc4random_buf, and arc4random_uniform (BZ #4417) Content-Language: fr-FR To: Adhemerval Zanella , libc-alpha@sourceware.org Cc: Florian Weimer References: <20220518191424.3630729-1-adhemerval.zanella@linaro.org> <20220518191424.3630729-2-adhemerval.zanella@linaro.org> From: Yann Droneaud Organization: OPTEYA In-Reply-To: <20220518191424.3630729-2-adhemerval.zanella@linaro.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Jun 2022 09:44:34 -0000 Hi, Found some time to review arc4random_uniform() implementation this time. Le 18/05/2022 à 21:14, Adhemerval Zanella via Libc-alpha a écrit : > The implementation is based on scalar Chacha20, with global cache and > locking. It uses getrandom or /dev/urandom as fallback to get the > initial entropy, and reseeds the internal state on every 16MB of > consumed buffer. > > It maintains an internal buffer which consumes at maximum one page on > most systems (assuming minimum of 4k pages). The internal buf optimizes > the cipher encrypt calls, by amortizing arc4random calls (where both > function call and lock cost are the dominating factor). > > The ChaCha20 implementation is based on RFC8439 [1], omitting the final > XOR of the keystream with the plaintext because the plaintext is a > stream of zeros. This strategy is similar to what OpenBSD arc4random > does. > > The arc4random_uniform is based on previous work by Florian Weimer, > where the algorithm is based on Jérémie Lumbroso paper Optimal Discrete > Uniform Generation from Coin Flips, and Applications (2013) [2], who > credits Donald E. Knuth and Andrew C. Yao, The complexity of nonuniform > random number generation (1976), for solving the general case. > > The main advantage of this method is the that the unit of randomness is not > the uniform random variable (uint32_t), but a random bit. It optimizes the > internal buffer sampling by initially consuming a 32-bit random variable > and then sampling byte per byte. Depending of the upper bound requested, > it might lead to better CPU utilization. > > Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu. > > Co-authored-by: Florian Weimer > > [1] https://datatracker.ietf.org/doc/html/rfc8439 > [2] https://arxiv.org/pdf/1304.1916.pdf > --- > > diff --git a/stdlib/arc4random_uniform.c b/stdlib/arc4random_uniform.c > new file mode 100644 > index 0000000000..722e92b3a7 > --- /dev/null > +++ b/stdlib/arc4random_uniform.c > @@ -0,0 +1,152 @@ > +/* Random pseudo generator numbers between 0 and 2**-31 (inclusive) upper bound is incorrect > + uniformly distributed but with an upper_bound. > + Copyright (C) 2022 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + . */ > + > +#include > +#include > +#include > +#include > + > +/* Return the number of bytes which cover values up to the limit. */ > +__attribute__ ((const)) > +static uint32_t > +byte_count (uint32_t n) > +{ > + if (n < (1U << 8)) > + return 1; > + else if (n < (1U << 16)) > + return 2; > + else if (n < (1U << 24)) > + return 3; > + else > + return 4; > +} > + > +/* Fill the lower bits of the result with randomness, according to the > + number of bytes requested. */ > +static void > +random_bytes (uint32_t *result, uint32_t byte_count) > +{ > + *result = 0; > + unsigned char *ptr = (unsigned char *) result; > + if (__BYTE_ORDER == __BIG_ENDIAN) > + ptr += 4 - byte_count; > + __arc4random_buf_internal (ptr, byte_count); > +} > + > +static uint32_t > +compute_uniform (uint32_t n) > +{ > + if (n <= 1) > + /* There is no valid return value for a zero limit, and 0 is the > + only possible result for limit 1. */ > + return 0; > + > + /* The bits variable serves as a source for bits. Prefetch the > + minimum number of bytes needed. */ > + uint32_t count = byte_count (n); > + uint32_t bits_length = count * CHAR_BIT; > + uint32_t bits; > + random_bytes (&bits, count); > + > + /* Powers of two are easy. */ > + if (powerof2 (n)) > + return bits & (n - 1); > + > + /* The general case. This algorithm follows Jérémie Lumbroso, > + Optimal Discrete Uniform Generation from Coin Flips, and > + Applications (2013), who credits Donald E. Knuth and Andrew > + C. Yao, The complexity of nonuniform random number generation > + (1976), for solving the general case. > + > + The implementation below unrolls the initialization stage of the > + loop, where v is less than n. */ > + > + /* Use 64-bit variables even though the intermediate results are > + never larger than 33 bits. This ensures the code is easier to > + compile on 64-bit architectures. */ > + uint64_t v; > + uint64_t c; > + > + /* Initialize v and c. v is the smallest power of 2 which is larger > + than n.*/ > + { > + uint32_t log2p1 = 32 - __builtin_clz (n); > + v = 1ULL << log2p1; > + c = bits & (v - 1); > + bits >>= log2p1; > + bits_length -= log2p1; > + } > + > + /* At the start of the loop, c is uniformly distributed within the > + half-open interval [0, v), and v < 2n < 2**33. */ > + while (true) > + { > + if (v >= n) > + { > + /* If the candidate is less than n, accept it. */ > + if (c < n) > + /* c is uniformly distributed on [0, n). */ > + return c; > + else > + { > + /* c is uniformly distributed on [n, v). */ > + v -= n; > + c -= n; > + /* The distribution was shifted, so c is uniformly > + distributed on [0, v) again. */ > + } > + } > + /* v < n here. */ > + > + /* Replenish the bit source if necessary. */ > + if (bits_length == 0) > + { > + /* Overwrite the least significant byte. */ > + random_bytes (&bits, 1); > + bits_length = CHAR_BIT; > + } > + > + /* Double the range. No overflow because v < n < 2**32. */ > + v *= 2; > + /* v < 2n here. */ > + > + /* Extract a bit and append it to c. c remains less than v and > + thus 2**33. */ > + c = (c << 1) | (bits & 1); > + bits >>= 1; > + --bits_length; > + > + /* At this point, c is uniformly distributed on [0, v) again, > + and v < 2n < 2**33. */ > + } > +} > + > +__libc_lock_define (extern , __arc4random_lock attribute_hidden) > + > +uint32_t > +__arc4random_uniform (uint32_t upper_bound) > +{ > + uint32_t r; > + __libc_lock_lock (__arc4random_lock); > + r = compute_uniform (upper_bound); > + __libc_lock_unlock (__arc4random_lock); > + return r; > +} > +libc_hidden_def (__arc4random_uniform) > +weak_alias (__arc4random_uniform, arc4random_uniform) Seems fine from a computation point of view. So for what it worth: LGTM Reviewed-by: Yann Droneaud Regards. -- Yann Droneaud OPTEYA