One of our larger applications is experiencing hangs, and we have tracked this down to an interaction between fork/atfork and the malloc implementation. We have a simplified test case (attached) that illustrates the problem.

Basically, the NPTL fork is not atomic with respect to signals, because of the atfork handling that must run before (atfork prepare) and after (atfork parent and child) the fork syscall. The GLIBC runtime uses atfork processing internally to ensure correct behaviour for the parent and child after the fork. This includes IO and malloc. For example, calloc contains the following code sequence:

    /* Suspend the thread until the `atfork' handlers have completed.
       By that time, the hooks will have been reset as well, so that
       mALLOc() can be used again. */
    (void)mutex_lock(&list_lock);
    (void)mutex_unlock(&list_lock);
    return public_mALLOc(sz);

This is no problem as long as fork processing continues and calls the malloc atfork parent/child handler. However, the code in sysdeps/unix/sysv/linux/fork.c is exposed to signals interrupting its operation. If the thread calling fork is interrupted by a signal after it has processed the atfork prepare handlers but before it has processed the atfork parent handlers, and the signal handler blocks for any reason (sigsuspend, or an attempt at IO), the process can hang.

For example, any other thread attempting to call malloc will wait for the atfork handlers to release the "list_lock", but the thread processing the fork is now blocked and cannot proceed. If the forking thread depends on one of the other threads to wake it (via a signal), that thread may block on the list_lock first, and now we have a deadlock.

So is it OK for NPTL's fork implementation to not be atomic relative to signals? In the POSIX spec we see statements like:

    13089 ... Since the fork() call can be considered as atomic
    13090 from the application’s perspective, the set would be initialized as empty and such signals would
    13091 have arrived after the fork(); see also .
In this case fork is definitely not atomic. So what should we do about this?

One possible solution is to use the signal mask to block asynchronous signals for the duration of __libc_fork(), or at least from just before the atfork prepare processing until after the atfork parent/child processing. We have experimented with this in our application (masking signals before the fork call and restoring them afterwards in both the parent and the child), and this does seem to eliminate the hang.

But should we change the libc NPTL fork implementation to use signal masks, to give the application the appearance that fork is atomic?