Subject: Re: [PATCH v3] getrandom system call wrapper [BZ #17252]
To: Torvald Riegel
Cc: GNU C Library
From: Florian Weimer
Date: Mon, 12 Sep 2016 11:52:00 -0000
In-Reply-To: <1473673221.30192.39.camel@localhost.localdomain>

On 09/12/2016 11:40 AM, Torvald Riegel wrote:
> On Mon, 2016-09-12 at 09:25 +0200, Florian Weimer wrote:
>> On 09/09/2016 05:23 PM, Torvald Riegel wrote:
>>> On Fri, 2016-09-09 at 16:28 +0200, Florian Weimer wrote:
>>>> On 09/09/2016 04:21 PM, Torvald Riegel wrote:
>>>>> Can't we just let cancellation rot in its corner?
>>>>
>>>> No, we have many customers who use it (and this despite the fact that
>>>> the current implementation has a critical race condition).
>>>
>>> Usage of it doesn't mean that it has to be the default.
>>
>> It's not used by default. Something has to call pthread_cancel.
>
> I do mean the other side. That is, in all the code that may see a
> cancellation request.

Only very little code supports that. Even in glibc, we have many parts
that have cancellation points, but do not install cancellation handlers
to clean up resources. Changing that probably needs some form of tools
support. Doing this right is mostly a matter of looking at the call
graph.

>>> Have we made
>>> other syscall wrappers cancellation points in the past (ie, syscalls
>>> that don't already have a matching POSIX function that is specified to
>>> be a cancellation point too)?
>>
>> I found open_by_handle.
>
> OK, though that's much like open(), which is a cancellation point, so
> making the syscall a cancellation point too would make sense.
> The pseudo-RNG functions are not cancellation points.

I see getrandom more like read, or ioctl. It's a matter of perspective.

arc4random would be more like a PRNG (can't fail, no short reads, no
blocking (although this part may be difficult), no cancellation point,
hundreds of megabytes per second throughput on current machines).
getentropy is somewhere in the middle.

>>> I'm worried about people who just want to use the syscall but don't know
>>> that much about POSIX cancellation. They couldn't use the syscall
>>> safely in a library without also being aware of POSIX cancellation, and
>>> I'm concerned that they might just forget to disable cancellation around
>>> the syscall, thus creating resource leaks, deadlocks (eg, cancellation
>>> handler doesn't release locks), etc. If this is primarily a Linux API
>>> currently (ignoring the Solaris case for a while), then marrying it to
>>> POSIX seems wrong.
>>
>> If we add getentropy, I suggest that it will not be a cancellation point
>> (even if it can still block indefinitely).
>
> Can you elaborate on your reasoning?

getentropy is supposed to be the simple interface cryptographic libraries
should use to obtain a seed for their own PRNG.
Not making it a cancellation point is part of keeping the interface
simple.

>> I looked at quite a few getrandom emulations using /dev/urandom, and not
>> one of them was cancellation-aware (it leaked the file descriptor on
>> cancellation, for example). Based on that, I really doubt getrandom
>> would introduce an unexpected cancellation point that causes actual
>> problems.
>
> Interesting, thanks. That might be one interpretation of the situation
> (ie, that users know that they don't have to worry about cancellation
> requests concurrent or pending while getting a random number).
> However, it might also mean that what I worry about is actually
> realistic (ie, that user code should be cancellation-aware but isn't).

It matters only if the code can run in a thread which is canceled. I
have to admit that references to pthread_cancel are more widespread than
I assumed. Yet we see few bug reports related to cancellation, even
though both glibc's implementation and library awareness of cancellation
leave a lot to be desired.

Anyway, this is probably material for another thread. :)

Florian
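
For illustration only, here is a minimal sketch of what a
cancellation-aware /dev/urandom emulation might look like, as opposed to
the emulations mentioned above that leak the file descriptor when
canceled. This is an assumption made for the sake of the discussion, not
code from the patch or from glibc; the name urandom_fill_nocancel is
hypothetical. It disables cancellation around open/read/close, so the
function as a whole does not act as a cancellation point.

```c
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not a glibc interface): fill BUF with LEN bytes
   from /dev/urandom without acting as a cancellation point.
   Returns 0 on success, -1 on failure with errno set.  */
int
urandom_fill_nocancel (void *buf, size_t len)
{
  int oldstate;
  int ignored;
  int result = -1;

  /* open, read and close are cancellation points.  With cancellation
     disabled, a request stays pending instead of being acted upon, so
     the descriptor cannot leak in the middle of this function.  */
  pthread_setcancelstate (PTHREAD_CANCEL_DISABLE, &oldstate);

  int fd = open ("/dev/urandom", O_RDONLY);
  if (fd >= 0)
    {
      unsigned char *p = buf;
      size_t remaining = len;

      while (remaining > 0)
        {
          ssize_t n = read (fd, p, remaining);
          if (n > 0)
            {
              /* Short read: advance and ask for the rest.  */
              p += n;
              remaining -= (size_t) n;
            }
          else if (n == 0)
            {
              /* Unexpected end of file on the device.  */
              errno = EIO;
              break;
            }
          else if (errno != EINTR)
            break;
        }

      /* Preserve the errno of a failed read across close.  */
      int saved_errno = errno;
      close (fd);
      errno = saved_errno;

      if (remaining == 0)
        result = 0;
    }

  /* Restore the caller's cancellation state.  A request that arrived
     in the meantime is acted upon at the caller's next cancellation
     point.  */
  pthread_setcancelstate (oldstate, &ignored);
  return result;
}
```

A caller would use it as urandom_fill_nocancel (seed, sizeof seed) and
check for a zero return value. The alternative design, keeping the
function a cancellation point and closing the descriptor from a handler
installed with pthread_cleanup_push, is the variant that requires
callers to be cancellation-aware themselves.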