Re: [PATCH] Give a useful meaning to arc4random_uniform(0);

public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed

From: Alejandro Colomar <alx.manpages@gmail.com>
To: Theo de Raadt <deraadt@openbsd.org>
Cc: otto@cvs.openbsd.org, djm@cvs.openbsd.org,
	libc-alpha@sourceware.org, "Alejandro Colomar" <alx@kernel.org>,
	"Theo de Raadt" <deraadt@theos.com>,
	"Todd C . Miller" <Todd.Miller@sudo.ws>,
	"Jason A. Donenfeld" <Jason@zx2c4.com>,
	"Cristian Rodríguez" <crrodriguez@opensuse.org>,
	"Adhemerval Zanella" <adhemerval.zanella@linaro.org>,
	"Yann Droneaud" <ydroneaud@opteya.com>,
	"Joseph Myers" <joseph@codesourcery.com>,
	"Serge Hallyn" <serge@hallyn.com>,
	"Iker Pedrosa" <ipedrosa@redhat.com>
Subject: Re: [PATCH] Give a useful meaning to arc4random_uniform(0);
Date: Sat, 31 Dec 2022 15:56:44 +0100	[thread overview]
Message-ID: <ced48d59-2032-0350-916e-7740d1d53f66@gmail.com> (raw)
In-Reply-To: <78070.1672476692@cvs.openbsd.org>

[-- Attachment #1.1: Type: text/plain, Size: 10185 bytes --]

Hello Theo,

I'm going to state it in case it wasn't obvious:

This patch was for glibc, which can be inferred from the fact that the email was 
sent TO libc-alpha@, and all other addresses were just CCd.

Oh, and sorry about the strong words if any.  It's not my intention to attack 
you personally, but it's already been several times that you attacked me 
personally, and I expect that my words will not be the kindest that I'd like to 
write, even if I will try to.  Please, understand.

On 12/31/22 09:51, Theo de Raadt wrote:
> Your proposal failed to update the manual page.

There's still no manual page for arc4random(3); so there's no way I could have 
updated it.  Since the function was added only a few months ago to glibc, it 
makes sense that it's not documented; although, I would have preferred that the 
manual page had been developed together with the function itself.  Since I'm 
subscribed to libc-alpha@, I read some emails discussing the inclussion of the 
function, but I didn't read it entirely, and I don't follow closely the glibc 
development, so I didn't know when the function was officially added and later 
released.  I rely considerably on contributions form glibc developers that send 
new manual pages, because I wouldn't be able to document by myself everything 
that goes new into Linux and glibc.

So, I was waiting for someone to add a manual page for arc4random(3) into the 
Linux man-pages.  However, now that I've investigated myself enough on this 
topic, I feel ready to document it myself.

>  That was my first
> hint that somethign is amiss.
> 
> Right now the manual it is super clear that it returns a value "less
> than upper_bound".  Which by prototype is uint32_t.

Is it super clear?  That's a lie.  For an input of 0, it returns 0.  In which 
world is that a value "less that"?  In C, (0 < 0) always evaluates to false.

> 
> How would you rewrite the manual page to explain this behaviour?

I had struggles yesterday to come up with a description of the function, but 
overnight I came up with an idea that would even solve the lie in the current 
OpenBSD page.

"
This function returns a value in the range [0, upper_bound - 1].
"

That wording is valid for current non-0 input to arc4random_uniform(3), and 
thanks to unsigned integers wrapping around, has the property that for an input 
of 0, it is effectively [0, UINT32_MAX].

> 
> Since your change choose a strange condition where it will return
> a value greater than upper_bound.

0 isn't less than 0 either.  At least, my proposal is consistent with 
wrap-around of unsigned integers.

> 
> Also, right now an (incorrect?) call of arc4random_uniform(0)

The arc4random_uniform(3) manual page in OpenBSD doesn't seem to declare this 
call to be incorrect.

RETURN VALUES
        These functions are always successful, and no return value is  reserved
        to indicate an error.

And there's no ERRORS or CAVEATS sections.  You should specify 0 as an error 
code for the invalid input 0 (in your implementation).

> will return 0, but with your proposal it will return a non-zero
> number.  Have you audited the entire universe of software to
> ensure that your change doesn't introduce a bug in some other
> piece of software?  I doubt you did that.  Very unprofessional
> of you to not study the impact and just wave the issue away.

Since it's impossible, I didn't.  Considering that this was a patch to glibc, 
which only added this function a few months ago, there probably isn't any 
software that relies on this under glibc.  However, it would be odd to add a 
different behavior to glibc than in OpenBSD, since it would surprise porters.

> 
> I think Special-casing the value of 0 to mean something new
> and undocumented behaviour makes no sense.  It is even potentially
> undocumentable.

Actually, you're already special-casing 0 to mean something different than "this 
returns less than upper_bound".  And it's undocumented.  Returning 42 would be 
as bad as 0 with your documentation.  Therefore, I'd call this a fix, rather 
than a regression.  After this patch, it can be documented.

> 
> The decision to return 0 was made very carefully.  It results programs
> passing mathematically calculated erroneous values not going further off
> the rails in a compliant fashion.  If they are using the value to index
> into an array, they will tend to access in-range, rather than out of
> range.

I'll try to turn your own design decisions against you.  Let's bring strlcpy(3) 
into the table, which I think has some nice features.

strlcpy(3) crashes on invalid input, by accessing (reading) the input array out 
of bounds, if the input was invalid.  This allegedly helped find several cases 
of invalid input, didn't it?  IIRC, you told me that.

arc4random_uniform(3), instead, when the input is invalid, it silently returns a 
value that is likely to be confused with a valid index (it's really off-by-one, 
since it's one greater than the maximum valid index), and will allow programs to 
continue with invalid behavior, instead of just crashing.  If it returned a 
random 32-bit integer, the chances are high that any access will just crash, and 
thus it would help uncover erroneous calculations.

>  Do you like introducing memory safety bugs into software you
> did not review but which will be affected by your proposal to change
> a standard well-known API which has been in use since 2008.

Of course not.  I think this was unnecessary of you to ask, and feels to me like 
a personal attack.  At least you didn't mention ego this time.  Please, come 
back to the technical discussion.

> 
> I do not like your proposal at all.  A function like arc4random_range()
> is even more likely to be used wrong by passing backwards points
> and you seem to have a lot of hubris to add a range check to it.

Oh, I just checked hubris in the dictionary and it seems you did mention ego. 
I'll try to rebate you with something useful.

If you run `grep -rn 'arc4random_uniform('` in the OpenBSD tree, there will be 
many cases where you'd really benefit from this.  I'll just pick a few:

sys/net/pf_lb.c:224:
			cut = arc4random_uniform(1 + high - low) + low;
better as:
			cut = arc4random_range(low, high);

sys/kern/kern_fork.c:648:
		pid = 2 + arc4random_uniform(PID_MAX - 1);
better as:
		pid = arc4random_range(2, PID_MAX);

usr.bin/nc/netcat.c:1501:
				cp = arc4random_uniform(x + 1);
better as:
				cp = arc4random_range(0, x);

> Is this really all just "everyone else's big code will be perfect like
> my 12 lines of code"?
> 
> Also, you have not cc'd all authors of this layer.  I am adding them.

Thanks.  I didn't know all of the authors.

Cheers, honestly,
Alex

> 
> 
> Alejandro Colomar <alx.manpages@gmail.com> wrote:
> 
>> Special-casing it in the implementation to return 0 was useless.
>> Instead, considering 0 as the value after UINT32_MAX has the property
>> that it allows implementing the following function without any
>> special cases:
>>
>> uint32_t
>> arc4random_range(uint32_t min, uint32_t max)
>> {
>> 	return arc4random_uniform(max - min + 1) + min;
>> }
>>
>> This works for any values of min and max (as long as min <= max, of
>> course), even for (0, UINT32_MAX).
>>
>> Oh, and the implementation of arc4random_uniform(3) is now 2 lines
>> simpler. :)
>>
>> This will work with the current implementation because powerof2(3) will
>> also consider 0 as a power of 2.  See powerof2(3):
>>
>>   SYNOPSIS
>>         #include <sys/param.h>
>>
>>         int powerof2(x);
>>
>>   DESCRIPTION
>>         This  macro  returns true if x is a power of 2, and false
>>         otherwise.
>>
>>         0 is considered a power of 2.  This can make  sense  con‐
>>         sidering wrapping of unsigned integers, and has interest‐
>>         ing properties.
>>
>> Cc: Theo de Raadt <deraadt@theos.com>
>> Cc: Todd C. Miller <Todd.Miller@sudo.ws>
>> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
>> Cc: Cristian Rodríguez <crrodriguez@opensuse.org>
>> Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>
>> Cc: Yann Droneaud <ydroneaud@opteya.com>
>> Cc: Joseph Myers <joseph@codesourcery.com>
>> Signed-off-by: Alejandro Colomar <alx@kernel.org>
>> ---
>>
>> Hi,
>>
>> I CCd Theo and Todd, because theirs is the original implementation, and
>> while this is a useful feature (IMO), it wouldn't make sense to do it
>> without consensus with other implementations, and especially with the
>> original implementation.
>>
>> I found this useful for shadow, where the existing code had a function
>> that produced a "random" value within a range, but due to the bogus
>> implementation, it had bias for higher values.  Implementing a *_range()
>> variant in terms of *_uniform() made it really simple, but the
>> *_uniform() function needed to do something useful for 0 for that to
>> work.
>>
>> Cheers,
>>
>> Alex
>>
>>   stdlib/arc4random_uniform.c | 10 ++++------
>>   1 file changed, 4 insertions(+), 6 deletions(-)
>>
>> diff --git a/stdlib/arc4random_uniform.c b/stdlib/arc4random_uniform.c
>> index 5aa98d1c13..1cd52c0d1c 100644
>> --- a/stdlib/arc4random_uniform.c
>> +++ b/stdlib/arc4random_uniform.c
>> @@ -29,15 +29,13 @@
>>      the asked range after range adjustment.
>>   
>>      The algorithm avoids modulo and divide operations, which might be costly
>> -   depending on the architecture.  */
>> +   depending on the architecture.
>> +
>> +   0 is treated as if it were UINT32_MAX + 1, and so arc4random_uniform(0)
>> +   is equivalent to arc4random().  */
>>   uint32_t
>>   __arc4random_uniform (uint32_t n)
>>   {
>> -  if (n <= 1)
>> -    /* There is no valid return value for a zero limit, and 0 is the
>> -       only possible result for limit 1.  */
>> -    return 0;
>> -
>>     /* Powers of two are easy.  */
>>     if (powerof2 (n))
>>       return __arc4random () & (n - 1);
>> -- 
>> 2.39.0
>>

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

next prev parent reply	other threads:[~2022-12-31 14:56 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-31  2:36 Alejandro Colomar
2022-12-31  2:48 ` Alejandro Colomar
2022-12-31  2:57 ` Jason A. Donenfeld
2022-12-31 13:39   ` Alejandro Colomar
2022-12-31  8:50 ` Theo de Raadt
2022-12-31  8:51   ` Theo de Raadt
2022-12-31 14:56     ` Alejandro Colomar [this message]
2022-12-31 15:13       ` Alejandro Colomar
2022-12-31 15:17         ` Alejandro Colomar
2022-12-31 15:59         ` Alejandro Colomar
2022-12-31 16:03           ` Alejandro Colomar
2023-01-01  8:41           ` Theo de Raadt
2022-12-31 23:07   ` Damien Miller
2022-12-31 23:58     ` Alejandro Colomar
2023-01-01  7:48       ` Ariadne Conill
2023-01-01  9:21         ` Otto Moerbeek
2023-01-01 14:05         ` Alejandro Colomar
2023-01-01  8:34       ` Theo de Raadt
2023-01-01 21:37 ` Arsen Arsenović
2023-01-01 23:50   ` Alejandro Colomar
2023-01-02  0:02     ` Arsen Arsenović
2023-01-02 11:24       ` Alejandro Colomar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ced48d59-2032-0350-916e-7740d1d53f66@gmail.com \
    --to=alx.manpages@gmail.com \
    --cc=Jason@zx2c4.com \
    --cc=Todd.Miller@sudo.ws \
    --cc=adhemerval.zanella@linaro.org \
    --cc=alx@kernel.org \
    --cc=crrodriguez@opensuse.org \
    --cc=deraadt@openbsd.org \
    --cc=deraadt@theos.com \
    --cc=djm@cvs.openbsd.org \
    --cc=ipedrosa@redhat.com \
    --cc=joseph@codesourcery.com \
    --cc=libc-alpha@sourceware.org \
    --cc=otto@cvs.openbsd.org \
    --cc=serge@hallyn.com \
    --cc=ydroneaud@opteya.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).