From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 21 Nov 2014 14:43:00 -0000
From: Mikulas Patocka
To: Corinna Vinschen
cc: cygwin@cygwin.com
Subject: Re: Instability with signals and threads
User-Agent: Alpine 2.02 (DEB 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-SW-Source: 2014-11/txt/msg00508.txt.bz2

> Do you use a DLL built with optimization by any chance?  I wouldn't take
> the backtraces too serious in that case.  For debugging it helps a lot
> to use a Cygwin DLL built without -O2.

I use optimization. The stacktrace may contain some other functions that
already finished but left the address on the stack.

> Btw., are you testing on 32 or 64 bit?

On 32-bit.
The rebuild of cygwin1.dll requires a large number of packages to create
the documentation (including TeX and Java), and I haven't bloated the
64-bit Cygwin installation with them yet. I wish it were possible to build
the library without documentation and without such big dependencies.

> I'm testing on 64 bit.  I can't reproduce your backtrace, but I can
> reproduce another one, which is related to thread_exit.  At one point
> after a couple thousand runs through your testcase I have a variable
> number of threads hanging in thread_exit, and a timer thread which is
> unable to send its signal.  the other threads all hang in thread_exit,
> waiting for a muto which is taken by a thread which doesn't exist
> anymore.

So you can, just for debugging, add a counter to thread-local storage that
is incremented when a muto is taken and decremented when a muto is
released. When the thread exits, test the counter; if it is non-zero,
print the backtrace or attach the debugger.

> That's a very serious downside of the muto implementation not
> being able to recognize being abandoned.  I wonder if that shouldn't be
> using a real OS mutex.

That would hide the problem of a thread exiting with a locked muto, but it
would not fix it.

> As a sidenote, the snapshot doesn't work well in
> other scenarios, too, apparently.
> Yaakov reported hangs in KDE :(

Mikulas

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
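
The debugging aid suggested in the message (a per-thread counter that goes
up when a muto is taken, down when it is released, and is checked when the
thread exits) can be sketched roughly as follows. This is an illustration
in Python, not Cygwin code: CountedLock, check_on_exit, run_checked and the
leaks list are hypothetical names, and a plain threading.Lock stands in for
the muto.

```python
import threading
import traceback

_tls = threading.local()   # per-thread storage for the debug counter
leaks = []                 # (thread name, count) records of locks held at exit


class CountedLock:
    """Hypothetical stand-in for a muto that counts holds per thread."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self):
        self._lock.acquire()
        # Incremented when the lock is taken.
        _tls.held = getattr(_tls, "held", 0) + 1

    def release(self):
        # Decremented when the lock is released.
        _tls.held = getattr(_tls, "held", 0) - 1
        self._lock.release()


def check_on_exit():
    """Run at thread exit: record and report a non-zero counter."""
    held = getattr(_tls, "held", 0)
    if held != 0:
        leaks.append((threading.current_thread().name, held))
        traceback.print_stack()   # or break into a debugger here
    return held


def run_checked(target, *args):
    """Thread entry wrapper that runs the check when target returns or raises."""
    try:
        target(*args)
    finally:
        check_on_exit()
```

Starting a thread with target=run_checked and passing the real thread
function as its first argument makes the check run on every exit path. A
thread that returns while still holding a CountedLock then leaves an entry
in leaks and prints a stack trace, which is the point at which one would
attach the debugger.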