From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oa1-x2c.google.com (mail-oa1-x2c.google.com [IPv6:2001:4860:4864:20::2c]) by sourceware.org (Postfix) with ESMTPS id 59AA13858D28 for ; Mon, 25 Apr 2022 12:15:15 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 59AA13858D28 Received: by mail-oa1-x2c.google.com with SMTP id 586e51a60fabf-e5e433d66dso15806561fac.5 for ; Mon, 25 Apr 2022 05:15:15 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:cc:references:from:in-reply-to :content-transfer-encoding; bh=nVEcQNLNlkLHnqokyrhydW2z/EUkXeOkE/VeK5He7hs=; b=m5YgD5EqBZ4zITOlfHdZyGKRLEf3HzpbDohLvwh69BsRpJ9VF2HAKfVGD8YKa3KsIs UdYCb32inkhstYbSFcoVNliY/qaMJsauSAo6DM9ebuPPSrkESQILv1Lc6nc3oFY189Ux N/yYURnM0xmT8O7keu/PICg4DShm7Y3WbsEKWgS4MeKv59n222K+e/GwRpntEoIouVu9 aSrPtrMwUC8nSsx8hjxNflfZOoUEIVlYSW0w+iJobTPif6J6NgxZpg6QconXBr4w9fjZ IpmEJDrYh6TU0bRFXx+rOE5Dw9AtCAKynW55/eS/n5Oh4ZpZXpK1CtiKTTEtoc9Boi5U B4pQ== X-Gm-Message-State: AOAM532qnoqc/dAaJrlIaLrW7jh0F0R6BanncZFAQJ6xfJpPpe26v23R O0Qn05TifSq62rWGTyHFzH50Hw== X-Google-Smtp-Source: ABdhPJzb/FM9yVMra/E/wNobp9cxRNkssAU6lqKpYhJiOioKLG4ngSXAX7aIZWhI1SicZHFrKeguKw== X-Received: by 2002:a05:6870:f10e:b0:e6:87d1:d1a8 with SMTP id k14-20020a056870f10e00b000e687d1d1a8mr6846969oac.220.1650888913164; Mon, 25 Apr 2022 05:15:13 -0700 (PDT) Received: from ?IPV6:2804:431:c7ca:4214:b4dd:3339:98d6:1ec0? ([2804:431:c7ca:4214:b4dd:3339:98d6:1ec0]) by smtp.gmail.com with ESMTPSA id m126-20020aca3f84000000b002ef895f4bf8sm3560295oia.24.2022.04.25.05.15.11 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 25 Apr 2022 05:15:12 -0700 (PDT) Message-ID: Date: Mon, 25 Apr 2022 09:15:09 -0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.1 Subject: Re: [PATCH v3 1/9] stdlib: Add arc4random, arc4random_buf, and arc4random_uniform (BZ #4417) Content-Language: en-US To: Yann Droneaud , libc-alpha@sourceware.org Cc: Florian Weimer References: <20220419212812.2688764-1-adhemerval.zanella@linaro.org> <20220419212812.2688764-2-adhemerval.zanella@linaro.org> <9b6f8e87-9226-7828-3569-cff0e3575f9a@opteya.com> From: Adhemerval Zanella In-Reply-To: <9b6f8e87-9226-7828-3569-cff0e3575f9a@opteya.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Apr 2022 12:15:17 -0000 On 22/04/2022 10:54, Yann Droneaud wrote: > Le 19/04/2022 à 23:28, Adhemerval Zanella via Libc-alpha a écrit : >> The implementation is based on scalar Chacha20, with global cache and >> locking.  It uses getrandom or /dev/urandom as fallback to get the >> initial entropy, and reseeds the internal state on every 16MB of >> consumed buffer. >> >> It maintains an internal buffer which consumes at maximum one page on >> most systems (assuming minimum of 4k pages).  The internal buf optimizes >> the cipher encrypt calls, by amortize arc4random calls (where both >> function call and locks cost are the dominating factor). >> >> The ChaCha20 implementation is based on the RFC8439 [1], with last >> step that XOR with the input omited.  Since the input stream will either >> zero bytes (initial state) or the PRNG output itself this step does not >> add any extra entropy. > > > This can also state the implementation is following OpenBSD arc4random current implementation. > > Agree, it is worth to add it. >> + >> +/* Besides the cipher state 'ctx', it keeps two counters: 'have' is the >> +   current valid bytes not yet consumed in 'buf', while 'count' is the maximum >> +   number of bytes until a reseed. >> + >> +   Both the initial seed an reseed tries to obtain entropy from the kernel > > an ->  and Ack. >> + >> +static void >> +arc4random_rekey (uint8_t *rnd, size_t rndlen) >> +{ >> +  memset (state->buf, 0, sizeof state->buf); > > There's no need to clear buf as call to chacha20_crypt() will overwrite it (since it doesn't XOR with it anymore). > > See https://github.com/openbsd/src/blob/master/lib/libc/crypt/arc4random.c#L121 Ack, I have removed it. >> + >> +      /* Extract a bit and append it to c.  c remains less than v and >> +         thus 2**33.  */ >> +      c = (c << 1) | (bits & 1); >> +      bits >>= 1; >> +      --bits_length; >> + >> +      /* At this point, c is uniformly distributed on [0, v) again, >> +         and v < 2n < 2**33.  */ >> +    } > > I'm not familiar with this method. > > It's not one reviewed at https://www.pcg-random.org/posts/bounded-rands.html > > In this patch I used what's called by PCG author,the "Bitmask with Rejection (Unbiased) — Apple's Method" > > https://github.com/Parrot-Developers/libfutils/commit/9dc7243ae2f2059b4590a702be2ca9c03578067f > > I like it because it doesn't uses modulo at all :) > > But the OpenBSD's arc4random_uniform() is even more simple in term of C code. > You can find the reference paper on arxiv [1]. The main advantage of this method is the that the unit of randomness is not the uniform random variable (uint32_t), but a random bit. It optimizes the internal buffer sampling by initially consuming a 32-bit random variable and then sampling byte per byte. Depending of the upper bound requested, it might lead to better CPU utilization. >From the article: But unexpectedly, it turns out that the extra buffering inherent in consuming randomness random-bit-by-random-bit, although time consuming, is more than compensated by the increased efficiency in using random bits compared with most common methods. It is specially true if you consider that both chacha20 block generation and getting kernel entropy (through either getrandom or /dev/urandom) are way more time consuming than bit twiddling. [1] https://arxiv.org/pdf/1304.1916.pdf > >> +} >> + >> +__libc_lock_define (extern , __arc4random_lock attribute_hidden) >> + >> +uint32_t >> +__arc4random_uniform (uint32_t upper_bound) >> +{ >> +  uint32_t r; >> +  __libc_lock_lock (__arc4random_lock); >> +  r = compute_uniform (upper_bound); >> +  __libc_lock_unlock (__arc4random_lock); >> +  return r; >> +} >> +libc_hidden_def (__arc4random_uniform) >> +weak_alias (__arc4random_uniform, arc4random_uniform) > >