From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 5957 invoked by alias); 27 Aug 2011 20:37:47 -0000
Received: (qmail 5927 invoked by uid 22791); 27 Aug 2011 20:37:26 -0000
X-Spam-Check-By: sourceware.org
Received: from aquarius.hirmke.de (HELO calimero.vinschen.de) (217.91.18.234)
    by sourceware.org (qpsmtpd/0.83/v0.83-20-g38e4449) with ESMTP;
    Sat, 27 Aug 2011 20:37:11 +0000
Received: by calimero.vinschen.de (Postfix, from userid 500)
    id DC8492C00F4; Sat, 27 Aug 2011 22:37:06 +0200 (CEST)
Date: Sat, 27 Aug 2011 20:37:00 -0000
From: Corinna Vinschen
To: cygwin@cygwin.com
Subject: Re: STC for libapr1 failure
Message-ID: <20110827203706.GA15411@calimero.vinschen.de>
Reply-To: cygwin@cygwin.com
Mail-Followup-To: cygwin@cygwin.com
References: <4E56EB24.5000505@acm.org> <20110826111509.GH10490@calimero.vinschen.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20110826111509.GH10490@calimero.vinschen.de>
User-Agent: Mutt/1.5.21 (2010-09-15)
Mailing-List: contact cygwin-help@cygwin.com; run by ezmlm
Precedence: bulk
Sender: cygwin-owner@cygwin.com
X-SW-Source: 2011-08/txt/msg00495.txt.bz2

On Aug 26 13:15, Corinna Vinschen wrote:
> On Aug 25 17:39, David Rothenberger wrote:
> > For a while now, the test cases that come with libapr1 have been
> > bombing with this message:
> >
> >   *** fatal error - NtCreateEvent(lock): 0xC0000035
> >
> > I finally took some time to investigate and have extracted a STC
> > that demonstrates the problem.
>
> Thanks a lot for the testcase. In theory, the NtCreateEvent call should
> not have happened at all, since it's called under lock, and the code
> around that should have made sure that the object doesn't exist at the
> time.
>
> After a few hours of extreme puzzlement, I now finally know what happens.
> It's kinda hard to explain.
>
> A lock on a file is represented by an event object. Process A holds the
> lock corresponding to event a. Process B tries to lock, but the lock
> of process A blocks that. So B now waits for event a until it gets
> signalled. Now A unlocks, thus signalling event a and closing the handle
> afterwards. But A's time slice isn't up yet, so it tries again to lock
> the file before B has returned from the wait for a. And here a wrong
> condition fails to recognize the situation. It finds the event object,
> but since it's recognized as "that's me", it doesn't treat the event as
> a blocking factor. This in turn allows A to create its own lock
> event object. However, the object still exists, since B still has an
> open handle to it. So creating the event fails, and rightfully so.
>
> What I don't have is an idea how to fix this problem correctly. I have
> to think about that. Stay tuned.

Please test the latest snapshot. It should fix this problem, as well as
a starvation problem with signals (and, fwiw, thread cancel events) in
flock, lockf, and POSIX fcntl locks.

Thanks again for the testcase. It was very helpful to test both
problems.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
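
The sequence described in the quoted explanation can be replayed with a
toy model. The Python sketch below is only an illustration and not
Cygwin's implementation: the ObjectNamespace class, the event name, and
replay_race() are hypothetical names invented for this example. The one
detail taken from the report is the status value 0xC0000035, which is
STATUS_OBJECT_NAME_COLLISION. The model assumes a named object stays
alive as long as any process holds an open handle to it.

```python
STATUS_SUCCESS = 0x00000000
STATUS_OBJECT_NAME_COLLISION = 0xC0000035  # status value from the report


class ObjectNamespace:
    """Toy model (hypothetical): a named object lives while handles are open."""

    def __init__(self):
        self.open_handles = {}  # object name -> number of open handles

    def create_event(self, name):
        # Creating a name that still exists is reported as a collision.
        if self.open_handles.get(name, 0) > 0:
            return STATUS_OBJECT_NAME_COLLISION
        self.open_handles[name] = 1
        return STATUS_SUCCESS

    def open_event(self, name):
        # A waiter takes its own handle to the already existing event.
        if self.open_handles.get(name, 0) == 0:
            raise KeyError(name)
        self.open_handles[name] += 1

    def close_handle(self, name):
        # The object disappears only when the last handle is closed.
        self.open_handles[name] -= 1
        if self.open_handles[name] == 0:
            del self.open_handles[name]


def replay_race():
    """Replay the interleaving from the explanation, one step at a time."""
    ns = ObjectNamespace()
    name = "lock-event-a"  # hypothetical event name
    results = []

    results.append(ns.create_event(name))  # A locks: event a is created
    ns.open_event(name)                    # B is blocked and waits on event a
    ns.close_handle(name)                  # A unlocks: signals, closes its handle
    results.append(ns.create_event(name))  # A relocks before B returns from wait
    ns.close_handle(name)                  # B finally wakes up and closes its handle
    results.append(ns.create_event(name))  # now the name is free again
    return results


if __name__ == "__main__":
    print([hex(status) for status in replay_race()])
```

The second create_event call is the one that collides: A has closed its
own handle, but B's handle keeps the object alive, so the name is still
taken when A tries to create the event again.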