From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <adhemerval.zanella@linaro.org>
Received: from mail-oi1-x22e.google.com (mail-oi1-x22e.google.com
 [IPv6:2607:f8b0:4864:20::22e])
 by sourceware.org (Postfix) with ESMTPS id 4B81138582A1
 for <libc-alpha@sourceware.org>; Thu, 28 Jul 2022 14:58:42 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 4B81138582A1
Received: by mail-oi1-x22e.google.com with SMTP id c185so2680988oia.7
 for <libc-alpha@sourceware.org>; Thu, 28 Jul 2022 07:58:42 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:message-id:date:mime-version:user-agent:subject
 :content-language:to:references:from:organization:in-reply-to
 :content-transfer-encoding;
 bh=MEOzOPZE65Lh97Owc8MAnYYawo2Tonp7sbC90oyAqRE=;
 b=SnYNwqgUDeOssDrs/4ezLMaikkqvgX/Y7RKGlcqmTZebHvxNcaN9ewtTcRLiluEDXQ
 9Up/gGUCemxjDInN3cLgMxa6gCeuX2tv8y8iFW58lslwiEkFDP+SHgKOVkVhEekkBLqO
 yijid2eSB0YP++dMLxJKYL1bxL+Ati4+WlmQrOqQTNSGRFMJYx2HdAOIIUskx8RAfUjE
 B0dj5H9V+L4xOPaGSKhECpuu21zpUONDVOGubw2W+L7tWkLx2seMMicpoqYiFIRmnblX
 9AhKkTBTYxx0wgtrfu8bk9a7j9lZRjvPBd02lTvHGreq4ZDPH3LrJSrUF7wFQtjfmf/y
 V0vw==
X-Gm-Message-State: AJIora/u5BchTywpRYDlBxL9bMlr1OqXUsGvOc0tiG+mqi7tuxl0OBxu
 32FFKdBdTMMGqSy8cixzorQicw==
X-Google-Smtp-Source: AGRyM1tK426L3PwEuxTO3EpamC4IZb1NWnTVqsrbEdoCQk5frtd9UTYc9MAEQX9JrvZGupH1G5RXTQ==
X-Received: by 2002:a05:6808:120f:b0:325:cef0:3f7 with SMTP id
 a15-20020a056808120f00b00325cef003f7mr4186031oil.38.1659020321552; 
 Thu, 28 Jul 2022 07:58:41 -0700 (PDT)
Received: from ?IPV6:2804:431:c7cb:1e34:8c1:7b8a:41d4:61c7?
 ([2804:431:c7cb:1e34:8c1:7b8a:41d4:61c7])
 by smtp.gmail.com with ESMTPSA id
 o23-20020a4ae597000000b00435a68593ebsm425121oov.27.2022.07.28.07.58.38
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Thu, 28 Jul 2022 07:58:41 -0700 (PDT)
Message-ID: <0ec012c4-6240-e8a7-e865-147516c0e64c@linaro.org>
Date: Thu, 28 Jul 2022 11:58:36 -0300
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0)
 Gecko/20100101 Thunderbird/102.0.3
Subject: Re: [PATCH] stdlib: Simplify arc4random_uniform
Content-Language: en-US
To: Yann Droneaud <ydroneaud@opteya.com>, libc-alpha@sourceware.org,
 Florian Weimer <fweimer@redhat.com>, Carlos O'Donell <carlos@redhat.com>
References: <20220728124528.39169-1-adhemerval.zanella@linaro.org>
 <b0a59dc5-fd1b-2095-310e-2a027cbb9b2e@opteya.com>
From: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Organization: Linaro
In-Reply-To: <b0a59dc5-fd1b-2095-310e-2a027cbb9b2e@opteya.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Spam-Status: No, score=-10.5 required=5.0 tests=BAYES_00, BODY_8BITS,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT,
 NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list <libc-alpha.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/libc-alpha>,
 <mailto:libc-alpha-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/libc-alpha>,
 <mailto:libc-alpha-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Thu, 28 Jul 2022 14:58:44 -0000


On 28/07/22 11:46, Yann Droneaud wrote:
> Hi,
> 
> Le 28/07/2022 à 14:45, Adhemerval Zanella a écrit :
>> It uses the bitmask with rejection [1], which calculates a mask
>> being the lowest power of two bounding the request upper bound,
>> successively queries new random values, and rejects values
>> outside the requested range.
>>
>> Performance-wise, there is no much gain in trying to converse
>> bits since arc4random is wrapper on getrandom syscall.  It should
>> be cheaper to just query a uint32_t value.  The algorithm also
>> avoids mudulo and divide operations, which might be costly
>> depending of the architecture.
>>
>> [1] https://www.pcg-random.org/posts/bounded-rands.html
>> ---
>>   stdlib/arc4random_uniform.c | 131 +++++++++---------------------------
>>   1 file changed, 32 insertions(+), 99 deletions(-)
>>
>> diff --git a/stdlib/arc4random_uniform.c b/stdlib/arc4random_uniform.c
>> index 1326dfa593..425282cd15 100644
>> --- a/stdlib/arc4random_uniform.c
>> +++ b/stdlib/arc4random_uniform.c
>> @@ -17,38 +17,19 @@
>>      License along with the GNU C Library; if not, see
>>      <https://www.gnu.org/licenses/>.  */
>>   -#include <endian.h>
>> -#include <libc-lock.h>
>>   #include <stdlib.h>
>>   #include <sys/param.h>
>>   -/* Return the number of bytes which cover values up to the limit.  */
>> -__attribute__ ((const))
>> -static uint32_t
>> -byte_count (uint32_t n)
>> -{
>> -  if (n < (1U << 8))
>> -    return 1;
>> -  else if (n < (1U << 16))
>> -    return 2;
>> -  else if (n < (1U << 24))
>> -    return 3;
>> -  else
>> -    return 4;
>> -}
>> +/* Return a uniformly distributed random number less than N.  The algorithm
>> +   calculates a mask being the lowest power of two bounding the upper bound
>> +   N, successively queries new random values, and rejects values outside of
>> +   the request range.
>>   -/* Fill the lower bits of the result with randomness, according to the
>> -   number of bytes requested.  */
>> -static void
>> -random_bytes (uint32_t *result, uint32_t byte_count)
>> -{
>> -  *result = 0;
>> -  unsigned char *ptr = (unsigned char *) result;
>> -  if (__BYTE_ORDER == __BIG_ENDIAN)
>> -    ptr += 4 - byte_count;
>> -  __arc4random_buf (ptr, byte_count);
>> -}
>> +   For reject values, it also tries if the remaining entropy could fit on
>> +   the asked range after range adjustment.
>>   +   The algorithm avoids modulo and divide operations, which might be costly
>> +   depending on the architecture.  */
>>   uint32_t
>>   __arc4random_uniform (uint32_t n)
>>   {
>> @@ -57,83 +38,35 @@ __arc4random_uniform (uint32_t n)
>>          only possible result for limit 1.  */
>>       return 0;
>>   -  /* The bits variable serves as a source for bits.  Prefetch the
>> -     minimum number of bytes needed.  */
>> -  uint32_t count = byte_count (n);
>> -  uint32_t bits_length = count * CHAR_BIT;
>> -  uint32_t bits;
>> -  random_bytes (&bits, count);
>> -
>>     /* Powers of two are easy.  */
>>     if (powerof2 (n))
>> -    return bits & (n - 1);
>> -
>> -  /* The general case.  This algorithm follows Jérémie Lumbroso,
>> -     Optimal Discrete Uniform Generation from Coin Flips, and
>> -     Applications (2013), who credits Donald E. Knuth and Andrew
>> -     C. Yao, The complexity of nonuniform random number generation
>> -     (1976), for solving the general case.
>> +    return __arc4random () & (n - 1);
>>   -     The implementation below unrolls the initialization stage of the
>> -     loop, where v is less than n.  */
>> +  /* mask is the smallest power of 2 minus 1 number larger than n.  */
>> +  int z = __builtin_clz (n);
>> +  uint32_t mask = ~UINT32_C(0) >> z;
>> +  int bits = CHAR_BIT * sizeof (uint32_t) - z;
>>   -  /* Use 64-bit variables even though the intermediate results are
>> -     never larger than 33 bits.  This ensures the code is easier to
>> -     compile on 64-bit architectures.  */
>> -  uint64_t v;
>> -  uint64_t c;
>> -
>> -  /* Initialize v and c.  v is the smallest power of 2 which is larger
>> -     than n.*/
>> -  {
>> -    uint32_t log2p1 = 32 - __builtin_clz (n);
>> -    v = 1ULL << log2p1;
>> -    c = bits & (v - 1);
>> -    bits >>= log2p1;
>> -    bits_length -= log2p1;
>> -  }
>> -
>> -  /* At the start of the loop, c is uniformly distributed within the
>> -     half-open interval [0, v), and v < 2n < 2**33.  */
>> -  while (true)
>> +  while (1)
>>       {
>> -      if (v >= n)
>> -        {
>> -          /* If the candidate is less than n, accept it.  */
>> -          if (c < n)
>> -            /* c is uniformly distributed on [0, n).  */
>> -            return c;
>> -          else
>> -            {
>> -              /* c is uniformly distributed on [n, v).  */
>> -              v -= n;
>> -              c -= n;
>> -              /* The distribution was shifted, so c is uniformly
>> -                 distributed on [0, v) again.  */
>> -            }
>> -        }
>> -      /* v < n here.  */
>> -
>> -      /* Replenish the bit source if necessary.  */
>> -      if (bits_length == 0)
>> -        {
>> -          /* Overwrite the least significant byte.  */
>> -      random_bytes (&bits, 1);
>> -      bits_length = CHAR_BIT;
>> -        }
>> -
>> -      /* Double the range.  No overflow because v < n < 2**32.  */
>> -      v *= 2;
>> -      /* v < 2n here.  */
>> -
>> -      /* Extract a bit and append it to c.  c remains less than v and
>> -         thus 2**33.  */
>> -      c = (c << 1) | (bits & 1);
>> -      bits >>= 1;
>> -      --bits_length;
>> -
>> -      /* At this point, c is uniformly distributed on [0, v) again,
>> -         and v < 2n < 2**33.  */
>> +      uint32_t value = __arc4random ();
>> +
>> +      /* Return if the lower power of 2 minus 1 satisfy the condition.  */
>> +      uint32_t r = value & mask;
>> +      if (r < n)
>> +    return r;
>> +
>> +      /* Otherwise check if remaining bits of entropy provides fits in the
>> +     bound.  */
>> +      int bits_left = z;
>> +      while (bits_left >= bits)
>> +    {
>> +      value >>= bits;
>> +      r = value & mask;
>> +      if (r < n)
>> +        return r;
>> +      bits_left -= bits;
>> +    }
>>       }
>>   }
>>   libc_hidden_def (__arc4random_uniform)
> 
> 
> It's the algorithm I suggestedin <9b6f8e87-9226-7828-3569-cff0e3575f9a@opteya.com>
> https://sourceware.org/pipermail/libc-alpha/2022-April/138070.html

Indeed I recall you pointed out this method, with current arc4random
being a getrandom wrapper it does make more sense to use a simpler
solution.

> 
> 
> LGTM.
> 
> 
> Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
> 

Thanks.