From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 29489 invoked by alias); 19 Feb 2007 20:24:58 -0000
Received: (qmail 29462 invoked by uid 22791); 19 Feb 2007 20:24:51 -0000
X-Spam-Check-By: sourceware.org
Received: from e31.co.us.ibm.com (HELO e31.co.us.ibm.com) (32.97.110.149)
    by sourceware.org (qpsmtpd/0.31) with ESMTP; Mon, 19 Feb 2007 20:24:45 +0000
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11])
    by e31.co.us.ibm.com (8.13.8/8.13.8) with ESMTP id l1JKOfGC010633
    for ; Mon, 19 Feb 2007 15:24:41 -0500
Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168])
    by westrelay02.boulder.ibm.com (8.13.8/8.13.8/NCO v8.2) with ESMTP id l1JKOfmn427214
    for ; Mon, 19 Feb 2007 13:24:41 -0700
Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1])
    by d03av02.boulder.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l1JKOeKM013178
    for ; Mon, 19 Feb 2007 13:24:41 -0700
Received: from [9.10.86.122] (spokane1.rchland.ibm.com [9.10.86.122])
    by d03av02.boulder.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l1JKOeKl013127;
    Mon, 19 Feb 2007 13:24:40 -0700
Message-ID: <45DA0AD6.5060107@us.ibm.com>
Date: Mon, 19 Feb 2007 20:24:00 -0000
From: Steven Munroe
User-Agent: Mozilla/5.0 (X11; U; Linux ppc64; en-US; rv:1.8.0.9) Gecko/20060906 SUSE/1.8_seamonkey_1.0.7-1.1 SeaMonkey/1.0.7
MIME-Version: 1.0
To: GNU libc hacker , Ryan Arnold , Mark Brown
Subject: Timing window in NPTL fork.c causes hangs.
Content-Type: multipart/mixed; boundary="------------050301030609050102030202"
Mailing-List: contact libc-hacker-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: libc-hacker-owner@sourceware.org
X-SW-Source: 2007-02/txt/msg00009.txt.bz2

This is a multi-part message in MIME format.
--------------050301030609050102030202
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Content-length: 2635

One of our larger applications is experiencing hangs, and we have
tracked this down to an interaction between fork/atfork and the malloc
implementation. We have a simplified test case (attached) that
demonstrates the problem.

Basically, the NPTL fork is not atomic with respect to signals, because
the atfork handlers must run before (atfork prepare) and after (atfork
parent and child) the fork syscall.

The GLIBC runtime uses atfork processing internally to ensure correct
behaviour for the parent and child after the fork. This includes IO and
malloc. For example, calloc contains the following code sequence:

    /* Suspend the thread until the `atfork' handlers have completed.
       By that time, the hooks will have been reset as well, so that
       mALLOc() can be used again. */
    (void)mutex_lock(&list_lock);
    (void)mutex_unlock(&list_lock);
    return public_mALLOc(sz);

This is no problem as long as fork processing continues and calls the
malloc atfork parent/child handler. However, the code in
sysdeps/unix/sysv/linux/fork.c is exposed to signals interrupting its
operation. If the thread calling fork is interrupted by a signal after
it has processed the atfork prepare handlers but before it has processed
the atfork parent handlers, and the signal handler blocks for any reason
(it calls sigsuspend or attempts IO), the process can hang.

For example, any other thread attempting to call malloc will wait for
the atfork handlers to release the "list_lock", but the thread
processing the fork is now blocked and cannot proceed. If the forking
thread depends on one of the other threads to wake it (via a signal),
that thread may block on the list_lock first, and now we have deadlock.

So is it OK for NPTL's fork implementation not to be atomic relative to
signals?

The POSIX spec contains statements like:

13089 ...
Since the fork ( ) call can be considered as atomic
13090 from the application’s perspective, the set would be initialized as empty and such signals would
13091 have arrived after the fork ( ); see also .

In this case fork is definitely not atomic. So what should we do about
this?

One possible solution is to use the signal mask to block asynchronous
signals for the duration of __libc_fork(), or at least from just before
atfork prepare processing until after atfork parent/child processing.
We have experimented with this in our application (masking signals
before the fork call and restoring them afterwards in the parent and
child), and this does seem to eliminate the hang.

But should we change the libc NPTL fork implementation to use signal
masks to give the application the appearance that fork is atomic?

--------------050301030609050102030202
Content-Type: text/x-c; name="calloc-fork-hang.c"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="calloc-fork-hang.c"
Content-length: 2396

/* The header names were lost from the archived copy of this attachment;
 * the ones listed here are those the code below requires. */
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CALLOC_NMEMB 10000

/*
 * VERSION 1.1
 * This testcase has 4 threads. The main thread simply starts the other threads
 * and then sleeps on a pthread_join. The forkingThread repeatedly calls
 * fork. The signalingThread repeatedly signals the forking thread, which
 * causes the forking thread to do sigsuspend. The third thread repeatedly
 * callocs and frees memory. Only when it is done with the calloc does it
 * signal the suspended thread to continue. The theory is that when the forking
 * thread gets suspended in the right place, it is holding a lock that the
 * callocing thread needs to continue, so the calloc thread hangs waiting on
 * that lock, and it cannot signal the forking thread to continue, creating a
 * deadlock.
 */

/* Set by the SIGUSR1 handler, cleared by the signaling thread. */
volatile sig_atomic_t killflag = 1;

pthread_t forkThread;
pthread_t sigThread;
pthread_t calThread;

/* Runs in the forking thread: block until SIGUSR2 arrives. */
void sigusr1Handler(int signum)
{
    sigset_t set1;

    sigfillset(&set1);
    sigdelset(&set1, SIGUSR2);
    sigsuspend(&set1);
    killflag = 1;
}

/* Exists only so that SIGUSR2 can wake the sigsuspend above. */
void sigusr2Handler(int signum)
{
    return;
}

void *callocingThread(void *ptr)
{
    int *memptr;

    while (1) {
        memptr = calloc(CALLOC_NMEMB, 4);
        if (!memptr) {
            fprintf(stderr, "calloc failed\n");
        }
        /* Wake the forking thread only after calloc has returned. */
        pthread_kill(forkThread, SIGUSR2);
        free(memptr);
    }
}

void *signalingThread(void *ptr)
{
    while (1) {
        if (killflag) {
            killflag = 0;
            pthread_kill(forkThread, SIGUSR1);
        }
    }
}

void *forkingThread(void *ptr)
{
    pid_t pid;
    struct sigaction sigusr1_action;
    struct sigaction sigusr2_action;

    sigfillset(&sigusr1_action.sa_mask);
    sigfillset(&sigusr2_action.sa_mask);
    sigusr1_action.sa_flags = 0;   /* was left uninitialized */
    sigusr2_action.sa_flags = 0;
    sigusr1_action.sa_handler = &sigusr1Handler;
    sigusr2_action.sa_handler = &sigusr2Handler;
    sigaction(SIGUSR1, &sigusr1_action, NULL);
    sigaction(SIGUSR2, &sigusr2_action, NULL);

    while (1) {
        pid = fork();
        fprintf(stderr, ".");
        if (pid == 0) {
            /* child */
            exit(0);
        } else if (pid > 0) {
            /* parent: the third argument is an int options value, not a pointer */
            waitpid(pid, NULL, 0);
            continue;
        } else {
            fprintf(stderr, "fork failed\n");
        }
    }
}

int main(int argc, char *argv[])
{
    pthread_create(&forkThread, 0, &forkingThread, 0);
    pthread_create(&calThread, 0, &callocingThread, 0);
    pthread_create(&sigThread, 0, &signalingThread, 0);
    pthread_join(forkThread, NULL);
    return 0;
}

--------------050301030609050102030202--