From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 8154 invoked by alias); 23 Jun 2004 06:56:56 -0000
Mailing-List: contact libc-hacker-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: libc-hacker-owner@sources.redhat.com
Received: (qmail 8138 invoked from network); 23 Jun 2004 06:56:55 -0000
Received: from unknown (HELO Cantor.suse.de) (195.135.220.2)
  by sourceware.org with SMTP; 23 Jun 2004 06:56:55 -0000
Received: from hermes.suse.de (hermes-ext.suse.de [195.135.221.8])
  (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits))
  (No client certificate requested)
  by Cantor.suse.de (Postfix) with ESMTP id D4498789E98;
  Wed, 23 Jun 2004 08:56:35 +0200 (CEST)
Date: Wed, 23 Jun 2004 06:56:00 -0000
From: Thorsten Kukuk
To: Steve Munroe
Cc: Jakub Jelinek , libc-hacker@sources.redhat.com
Subject: Re: deadlock in signal handler with NPTL
Message-ID: <20040623065635.GA21813@suse.de>
References: <20040623042256.GA6177@suse.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Organization: SuSE Linux AG, Nuernberg, Germany
User-Agent: Mutt/1.5.6i
X-SW-Source: 2004-06/txt/msg00047.txt.bz2

On Tue, Jun 22, Steve Munroe wrote:

> Thorsten Kukuk wrote on 06/22/2004 11:22:56 PM:
>
> > On Tue, Jun 22, Jakub Jelinek wrote:
> >
> > > On Tue, Jun 22, 2004 at 11:50:59PM +0200, Thorsten Kukuk wrote:
> > > >
> > > > Hi,
> > > >
> > > > I got the following test program. I know, it is very ugly and there
> > > > are a lot of things somebody should not do, but this is something
> > > > what programs like sshd are doing.
> > >
> > > Then they should be fixed. Neither syslog, nor printf, nor fflush
> > > are supposed to be async-signal safe, nor they actually are in glibc.
> >
> > Yes, but the problem is: Nearly every daemon on a Linux system is
> > calling syslog() in a signal handler and it seems to be very easy
> > to deadlock them on every Linux system running glibc/NPTL. While
> > there seems to be no other system with the same problem.
> >
>
> Then what has changed from glibc-2.3.3 (RHEL 3) until now? Because I have
> not seen this problem before.

The test case also deadlocks on a RHEL 3 machine very fast.

> I have reviewed all the changes to
> lowlevellock.h since and I do not see any change that would affect this. In
> fact your test case should show that same hang there.

The difference is: glibc compiled with linuxthreads only takes the
locks if the program is linked against libpthread. glibc compiled
with NPTL always takes them (__libc_lock_lock always calls
lll_lock).
Uli, Jakub, is this really necessary? Wouldn't it be better to
add the one extra compare?

> Have the daemons changed recently to add the syslog() call to the signal
> handler?

No, this is very, very old.

  Thorsten

-- 
Thorsten Kukuk       http://www.suse.de/~kukuk/      kukuk@suse.de
SuSE Linux AG        Maxfeldstr. 5        D-90409 Nuernberg
--------------------------------------------------------------------
Key fingerprint = A368 676B 5E1B 3E46  CFCE 2D97 F8FD 4E23 56C6 FB4B