From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oo1-xc30.google.com (mail-oo1-xc30.google.com [IPv6:2607:f8b0:4864:20::c30]) by sourceware.org (Postfix) with ESMTPS id E8E33385C325 for ; Fri, 10 Jun 2022 17:26:07 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org E8E33385C325 Received: by mail-oo1-xc30.google.com with SMTP id n24-20020a4ae758000000b0041b82638b42so2793696oov.9 for ; Fri, 10 Jun 2022 10:26:07 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:cc:references:from:in-reply-to :content-transfer-encoding; bh=kcmz+sN1kwsIUNspuR5C5TkfUR70WXB/uHoNxzgDryc=; b=pa2dmVMwBZeUjYq4Ciy7fakCQROsvY5O6TiOAbJagn47d4cTlM9obRO5tIhFfyXcmi xlldULCv9LxaL4u+uaHvnGTyFGIPmkdqIdNCH7olm52PPOxD6bH7WUqspUY2Ll4O4LBn 7goLBAKiXSL8ijA7WxmqfqWDEfIbuRqLHyP49o1rF2/THRHMvPPka5CpDQwS/LBazx6y vClKqeMxTXvApZkAsoa57OOQ2c2k8lp39kpQM+2lHOJWu5YaX1ynnPq5rIfJWVdb0/lE 89gpjztW8iTBfE3fJK1RyAGdV12uyciK81zseXNPuLH6mmNB/L/+WQqFfd7yBcUouwnz cxVA== X-Gm-Message-State: AOAM532J03jR1NXLbEsvSEDtiE64eqCW86U/Ddv+OHQokPP7gloAwukL FN5D/ZO2lKAHNjzjAFzOQwBSW4jzT3hLRg== X-Google-Smtp-Source: ABdhPJw53IKV/OWO1iAap4vN5EXPSuUttqHE/mdSqoZCr+5hgZCNW+HnIMQBFg1EdkV9xKcNzPRZgQ== X-Received: by 2002:a4a:5dc1:0:b0:40e:444d:bd64 with SMTP id w184-20020a4a5dc1000000b0040e444dbd64mr19199393ooa.92.1654881967147; Fri, 10 Jun 2022 10:26:07 -0700 (PDT) Received: from ?IPV6:2804:431:c7cb:a613:818b:b86c:a3f8:d455? ([2804:431:c7cb:a613:818b:b86c:a3f8:d455]) by smtp.gmail.com with ESMTPSA id e28-20020a544f1c000000b0032c18f04800sm7050oiy.1.2022.06.10.10.26.05 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 10 Jun 2022 10:26:06 -0700 (PDT) Message-ID: <1d3fd33d-a786-b469-b9be-73265c368c1c@linaro.org> Date: Fri, 10 Jun 2022 14:26:04 -0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0 Subject: Re: [PATCH v6 01/10] stdlib: Add arc4random, arc4random_buf, and arc4random_uniform (BZ #4417) Content-Language: en-US To: Yann Droneaud , libc-alpha@sourceware.org Cc: Florian Weimer References: <20220518191424.3630729-1-adhemerval.zanella@linaro.org> <20220518191424.3630729-2-adhemerval.zanella@linaro.org> From: Adhemerval Zanella In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Jun 2022 17:26:11 -0000 On 02/06/2022 06:44, Yann Droneaud wrote: > Hi, > > Found some time to review arc4random_uniform() implementation this time. > > > Le 18/05/2022 à 21:14, Adhemerval Zanella via Libc-alpha a écrit : >> The implementation is based on scalar Chacha20, with global cache and >> locking.  It uses getrandom or /dev/urandom as fallback to get the >> initial entropy, and reseeds the internal state on every 16MB of >> consumed buffer. >> >> It maintains an internal buffer which consumes at maximum one page on >> most systems (assuming minimum of 4k pages).  The internal buf optimizes >> the cipher encrypt calls, by amortizing arc4random calls (where both >> function call and lock cost are the dominating factor). >> >> The ChaCha20 implementation is based on RFC8439 [1], omitting the final >> XOR of the keystream with the plaintext because the plaintext is a >> stream of zeros.  This strategy is similar to what OpenBSD arc4random >> does. >> >> The arc4random_uniform is based on previous work by Florian Weimer, >> where the algorithm is based on Jérémie Lumbroso paper Optimal Discrete >> Uniform Generation from Coin Flips, and Applications (2013) [2], who >> credits Donald E. Knuth and Andrew C. Yao, The complexity of nonuniform >> random number generation (1976), for solving the general case. >> >> The main advantage of this method is the that the unit of randomness is not >> the uniform random variable (uint32_t), but a random bit.  It optimizes the >> internal buffer sampling by initially consuming a 32-bit random variable >> and then sampling byte per byte.  Depending of the upper bound requested, >> it might lead to better CPU utilization. >> >> Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu. >> >> Co-authored-by: Florian Weimer >> >> [1] https://datatracker.ietf.org/doc/html/rfc8439 >> [2] https://arxiv.org/pdf/1304.1916.pdf >> --- >> >> diff --git a/stdlib/arc4random_uniform.c b/stdlib/arc4random_uniform.c >> new file mode 100644 >> index 0000000000..722e92b3a7 >> --- /dev/null >> +++ b/stdlib/arc4random_uniform.c >> @@ -0,0 +1,152 @@ >> +/* Random pseudo generator numbers between 0 and 2**-31 (inclusive) > upper bound is incorrect Right, I changed to: /* Random pseudo generator number which returns a single 32 bit value uniformly distributed but with an upper_bound. >> +   uniformly distributed but with an upper_bound. >> +   Copyright (C) 2022 Free Software Foundation, Inc. >> +   This file is part of the GNU C Library. >> + >> +   The GNU C Library is free software; you can redistribute it and/or >> +   modify it under the terms of the GNU Lesser General Public >> +   License as published by the Free Software Foundation; either >> +   version 2.1 of the License, or (at your option) any later version. >> + >> +   The GNU C Library is distributed in the hope that it will be useful, >> +   but WITHOUT ANY WARRANTY; without even the implied warranty of >> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU >> +   Lesser General Public License for more details. >> + >> +   You should have received a copy of the GNU Lesser General Public >> +   License along with the GNU C Library; if not, see >> +   .  */ >> + >> +#include >> +#include >> +#include >> +#include >> + >> +/* Return the number of bytes which cover values up to the limit.  */ >> +__attribute__ ((const)) >> +static uint32_t >> +byte_count (uint32_t n) >> +{ >> +  if (n < (1U << 8)) >> +    return 1; >> +  else if (n < (1U << 16)) >> +    return 2; >> +  else if (n < (1U << 24)) >> +    return 3; >> +  else >> +    return 4; >> +} >> + >> +/* Fill the lower bits of the result with randomness, according to the >> +   number of bytes requested.  */ >> +static void >> +random_bytes (uint32_t *result, uint32_t byte_count) >> +{ >> +  *result = 0; >> +  unsigned char *ptr = (unsigned char *) result; >> +  if (__BYTE_ORDER == __BIG_ENDIAN) >> +    ptr += 4 - byte_count; >> +  __arc4random_buf_internal (ptr, byte_count); >> +} >> + >> +static uint32_t >> +compute_uniform (uint32_t n) >> +{ >> +  if (n <= 1) >> +    /* There is no valid return value for a zero limit, and 0 is the >> +       only possible result for limit 1.  */ >> +    return 0; >> + >> +  /* The bits variable serves as a source for bits.  Prefetch the >> +     minimum number of bytes needed.  */ >> +  uint32_t count = byte_count (n); >> +  uint32_t bits_length = count * CHAR_BIT; >> +  uint32_t bits; >> +  random_bytes (&bits, count); >> + >> +  /* Powers of two are easy.  */ >> +  if (powerof2 (n)) >> +    return bits & (n - 1); >> + >> +  /* The general case.  This algorithm follows Jérémie Lumbroso, >> +     Optimal Discrete Uniform Generation from Coin Flips, and >> +     Applications (2013), who credits Donald E. Knuth and Andrew >> +     C. Yao, The complexity of nonuniform random number generation >> +     (1976), for solving the general case. >> + >> +     The implementation below unrolls the initialization stage of the >> +     loop, where v is less than n.  */ >> + >> +  /* Use 64-bit variables even though the intermediate results are >> +     never larger than 33 bits.  This ensures the code is easier to >> +     compile on 64-bit architectures.  */ >> +  uint64_t v; >> +  uint64_t c; >> + >> +  /* Initialize v and c.  v is the smallest power of 2 which is larger >> +     than n.*/ >> +  { >> +    uint32_t log2p1 = 32 - __builtin_clz (n); >> +    v = 1ULL << log2p1; >> +    c = bits & (v - 1); >> +    bits >>= log2p1; >> +    bits_length -= log2p1; >> +  } >> + >> +  /* At the start of the loop, c is uniformly distributed within the >> +     half-open interval [0, v), and v < 2n < 2**33.  */ >> +  while (true) >> +    { >> +      if (v >= n) >> +        { >> +          /* If the candidate is less than n, accept it.  */ >> +          if (c < n) >> +            /* c is uniformly distributed on [0, n).  */ >> +            return c; >> +          else >> +            { >> +              /* c is uniformly distributed on [n, v).  */ >> +              v -= n; >> +              c -= n; >> +              /* The distribution was shifted, so c is uniformly >> +                 distributed on [0, v) again.  */ >> +            } >> +        } >> +      /* v < n here.  */ >> + >> +      /* Replenish the bit source if necessary.  */ >> +      if (bits_length == 0) >> +        { >> +          /* Overwrite the least significant byte.  */ >> +      random_bytes (&bits, 1); >> +      bits_length = CHAR_BIT; >> +        } >> + >> +      /* Double the range.  No overflow because v < n < 2**32.  */ >> +      v *= 2; >> +      /* v < 2n here.  */ >> + >> +      /* Extract a bit and append it to c.  c remains less than v and >> +         thus 2**33.  */ >> +      c = (c << 1) | (bits & 1); >> +      bits >>= 1; >> +      --bits_length; >> + >> +      /* At this point, c is uniformly distributed on [0, v) again, >> +         and v < 2n < 2**33.  */ >> +    } >> +} >> + >> +__libc_lock_define (extern , __arc4random_lock attribute_hidden) >> + >> +uint32_t >> +__arc4random_uniform (uint32_t upper_bound) >> +{ >> +  uint32_t r; >> +  __libc_lock_lock (__arc4random_lock); >> +  r = compute_uniform (upper_bound); >> +  __libc_lock_unlock (__arc4random_lock); >> +  return r; >> +} >> +libc_hidden_def (__arc4random_uniform) >> +weak_alias (__arc4random_uniform, arc4random_uniform) > > Seems fine from a computation point of view. > > So for what it worth: > > LGTM > > Reviewed-by: Yann Droneaud Thanks. > > > Regards. >