Date: Sat, 23 Jul 2022 18:25:21 +0200
From: "Jason A. Donenfeld"
To: libc-alpha@sourceware.org, Adhemerval Zanella Netto, Florian Weimer, Yann Droneaud, jann@thejh.net, Michael@phoronix.com
Subject: arc4random - are you sure we want these?

[Resending to right address.]

Hi glibc developers,

I learned about the addition of the arc4random functions in glibc this morning, thanks to Phoronix. I wish somebody had CC'd me on those discussions before it got committed, but here we are.
I really wonder whether this is a good idea, whether this is something that glibc wants, and whether it's a design worth committing to in the long term.

Firstly, for what use cases does this actually help? As of recent changes to the Linux kernel -- now backported all the way to 4.9! -- getrandom() and /dev/urandom are extremely fast and operate over per-cpu states locklessly. Sure, you avoid a syscall by doing that in userspace, but does it really matter? Who exactly benefits from this?

Seen that way, it seems like a lot of complexity for nothing, and complexity that will lead to bugs and various oversights eventually. For example, the kernel reseeds itself when virtual machines fork, using an identifier passed to the kernel via ACPI. It also reseeds itself on system resume, not only from ordinary S3 sleep but also, more importantly, from hibernation. And in general, being the arbiter of entropy, the kernel is much better poised to determine when it makes sense to reseed. Glibc, on the other hand, can employ some heuristics and make some decisions -- on fork, after 16 MiB, and the like -- but in general these are lacking, compared to the much wider array of information the kernel has.

You miss out on this with arc4random, and if that information _is_ to be exported to userspace somehow in the future, it would be awfully nice to design the userspace interface alongside the kernel one. For that reason, past discussion of having some random number generation in userspace libcs has been geared toward doing this in the vDSO, somehow, where the kernel can be part and parcel of that effort. Seen from this perspective, going with OpenBSD's older paradigm might be rather limiting. Why not work together, between the kernel and libc, to see if we can come up with something better, before settling on an interface with semantics that are hard to walk back later?

As-is, it's hard to recommend that anybody really use these functions.
Just keep using getrandom(2), which has mostly favorable semantics. Yes, I get it: it's fun to make a random number generator, and so lots of projects figure out some way to make yet another one somewhere, somehow. But the tendency to do so feels like a weird computer tinkerer disease rather than something that has ever helped the overall ecosystem.

So I'm wondering: who actually needs this, and why? What's the performance requirement like, and why is getrandom(2) insufficient? And is this really the best approach to take? If this is something needed, how would you feel about working together on a vDSO approach instead? Or maybe nobody actually needs this in the first place?

And secondly, is there any way that glibc can *not* do this, or has that ship fully sailed, and I really missed out by not being part of that discussion whenever it was happening?

Thanks,
Jason