From: Jakub Jelinek
To: Steven Munroe
Cc: GNU libc hacker, Ryan Arnold, Mark Brown
Subject: Re: Timing window in NPTL fork.c causes hangs.
Date: Mon, 26 Feb 2007 16:03:00 -0000
Message-ID: <20070226160524.GH4219@sunsite.mff.cuni.cz>
In-Reply-To: <45DA0AD6.5060107@us.ibm.com>
References: <45DA0AD6.5060107@us.ibm.com>
Mailing-List: contact libc-hacker-help@sourceware.org; run by ezmlm

On Mon, Feb 19, 2007 at 02:38:46PM -0600, Steven Munroe wrote:
> However the code in sysdeps/unix/sysv/linux/fork.c is exposed to signals
> interrupting its operation. If the thread calling fork is interrupted by
> a signal, after it has processed atfork prepare handlers but before it
> has processed the atfork parent handlers, and the signal handler blocks
> for any reason (sigsuspend or attempts IO) the process can hang. For
> example any other thread attempting to call malloc will wait for the
> atfork handlers to release the "list_lock" but the thread processing the
> fork is now blocked and cannot proceed.
> If the forking thread is
> dependent on one of the other threads to wake it (via signal) that
> thread may block on the list_lock first and now we have deadlock.
>
> So is it OK for NPTL's fork implementation to not be atomic relative to
> signals?

If you have an async signal handler that can block the app indefinitely,
then that's to be expected. How is that different from the same signal
handler e.g. interrupting in the middle of malloc or stdio? Some malloc
or stdio lock can be held at that point, so if your async signal handler
waits till some other thread wakes it up and those other threads need
malloc or stdio, you hang exactly the same way.

> From the POSIX spec we see statements like:
>
> 13089 ... Since the fork ( ) call can be considered as atomic
> 13090 from the application's perspective, the set would be initialized
> as empty and such signals would
> 13091 have arrived after the fork ( ); see also .

This IMHO only addresses whether a signal sent to the process is
delivered just to the parent or also to the child. fork() as a whole
can't be considered atomic; you can e.g. block indefinitely in one of
the atfork handlers, using an async-signal-safe function.

> So what should we do about this? One possible solution is to use the
> signal mask and disable async signals for the duration of __libc_fork().
> Or at least from just before atfork prepare processing to after atfork
> parent/child processing.

So you just break different apps (in addition to making fork()
considerably slower)? Apps have every right to expect that the library
hasn't messed with their signal masks; they can very well e.g. sigsuspend
in an atfork handler and expect to be woken up. If you block all signals
before running the atfork handlers, that would never happen. Not to
mention that the atfork handlers can call sigprocmask themselves.

> We have experimented with this in our application (masking signals
> before the fork call and restoring them after in the parent and child).
> And this does seem to eliminate the hang.

Then just do that in your application if you need it.

> But should we change the libc NPTL fork implementation to use signal masks to
> give the application the appearance that fork is atomic?

No.

	Jakub