From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: [PATCH] endless loop in __libc_fork
From: Martin Schwidefsky
Reply-To: schwidefsky@de.ibm.com
To: Glibc hackers
Date: Thu, 11 Sep 2008 15:07:00 -0000
Message-Id: <1221144185.14064.1.camel@localhost>

Greetings,
Christian Borntraeger found a bug in __libc_fork with a KVM stress
on s390:

pid_t
__libc_fork (void)
{
  ...
  /* Run all the registered preparation handlers.  In reverse order.
     While doing this we build up a list of all the entries.
   */
  struct fork_handler *runp;
  while ((runp = __fork_handlers) != NULL)
    {
      unsigned int oldval = runp->refcntr;

      if (oldval == 0)
	/* This means some other thread removed the list just after
	   the pointer has been loaded.  Try again.  Either the list
	   is empty or we can retry it.  */
	continue;
  ...
}

The (oldval == 0) check with the continue is translated to an endless
loop on s390-64:

  16:	e3 60 10 00 00 04 	lg	%r6,0(%r1)
  1c:	b9 02 00 66       	ltgr	%r6,%r6		# runp != NULL check
  20:	b9 04 00 bf       	lgr	%r11,%r15
  24:	41 40 60 28       	la	%r4,40(%r6)
  28:	a7 74 00 9f       	jne	166 <__libc_fork+0x166>
  ...
 166:	58 30 60 28       	l	%r3,40(%r6)
> 16a:	12 33             	ltr	%r3,%r3		# oldval == 0 check
> 16c:	a7 84 ff ff       	je	16a <__libc_fork+0x16a>	# endless loop

The branch at 16c jumps back to the test at 16a, not to the load at
166, so %r3 is never reloaded. Once you have stumbled over the problem,
it is obvious that a memory barrier is missing here. See patch below.

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.

--
2008-09-11  Martin Schwidefsky

	* sysdeps/unix/sysv/linux/fork.c (__libc_fork): Add memory barrier
	to force runp->refcntr to be read from memory.

diff -urpN libc/nptl/sysdeps/unix/sysv/linux/fork.c libc-s390/nptl/sysdeps/unix/sysv/linux/fork.c
--- libc/nptl/sysdeps/unix/sysv/linux/fork.c	2007-08-07 13:09:01.000000000 +0200
+++ libc-s390/nptl/sysdeps/unix/sysv/linux/fork.c	2008-09-08 13:20:13.000000000 +0200
@@ -64,7 +64,11 @@ __libc_fork (void)
   struct fork_handler *runp;
   while ((runp = __fork_handlers) != NULL)
     {
-      unsigned int oldval = runp->refcntr;
+      unsigned int oldval;
+
+      atomic_full_barrier();
+
+      oldval = runp->refcntr;
 
       if (oldval == 0)
 	/* This means some other thread removed the list just after