From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 31398 invoked by alias); 20 Sep 2012 10:28:09 -0000
Received: (qmail 18472 invoked by uid 48); 20 Sep 2012 10:21:44 -0000
From: "mihaylov.mihail at gmail dot com"
To: glibc-bugs@sources.redhat.com
Subject: [Bug nptl/13165] pthread_cond_wait() can consume a signal that was sent before it started waiting
Date: Thu, 20 Sep 2012 10:28:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: nptl
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: mihaylov.mihail at gmail dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: drepper.fsp at gmail dot com
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact glibc-bugs-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Post:
List-Help:
Sender: glibc-bugs-owner@sourceware.org
X-SW-Source: 2012-09/txt/msg00178.txt.bz2

http://sourceware.org/bugzilla/show_bug.cgi?id=13165

--- Comment #19 from Mihail Mihaylov 2012-09-20 10:21:39 UTC ---
(In reply to comment #16)

Sorry for the long reply. Please bear with me, because this issue is very
subtle and I don't know how to explain it more succinctly.

First of all, let me clarify that this is a test that exposes the race, and
not the usage scenario that I claim should be supported. The usage scenario
is described in the bug description. Well, actually, I do claim that the
scenario in the test should be supported too, but the scenario in the
description makes more sense.

> I'm not aware of any requirement that pthread_cond_signal should block until a
> waiter has actually woken up.
> (Your test case relies on it to not block, so
> that it can send out multiple signals while holding the mutex, right?) I'm
> also not aware of any ordering requirement wrt. waiters (i.e., fairness). If
> you combine both, you will see that the behavior you observe is a valid
> execution.

I'm not making any assumptions about the state of the waiters when
pthread_cond_signal returns. All I'm assuming is that, whether the signaling
thread releases and reacquires the mutex after each sent signal or sends all
signals without releasing the mutex, at least as many waiters as the number
of signals will wake (eventually).

But even if this assumption is wrong (and it's not), if you set
releaseMutexBetweenSignals to true, the test will release the mutex after
each sent signal. In this case the test doesn't send multiple signals while
holding the mutex, and the problem still occurs.

As for fairness, this is not about fairness. It is also not about ordering
between the waiters. It's about ordering between waiters and signalers. I'm
getting tired of people jumping to fairness at the first mention of ordering.
You could say that I'm requesting fairness if I wanted the first single
signal to wake the waiter that blocked first. But all I'm requesting is for
the signal to wake at least one of the waiters that started waiting before
the signal was sent. I don't care which one of them.

This is guaranteed by the standard (from the documentation of
pthread_cond_wait and pthread_cond_signal on the Open Group site):

"The pthread_cond_signal() function shall unblock at least one of the threads
that are blocked on the specified condition variable cond (if any threads are
blocked on cond)."
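For illustration only, the assumption stated above can be spelled out as a
minimal C sketch. This is not the test attached to the bug, and it does not
reproduce the race, because no new waiters arrive after the signals are sent;
it only shows the signaling pattern under discussion. NUM_WAITERS,
NUM_SIGNALS, run_scenario() and its release_mutex_between_signals parameter
(modeled on the releaseMutexBetweenSignals flag mentioned above) are
hypothetical names chosen for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

#define NUM_WAITERS 5   /* hypothetical values, for illustration only */
#define NUM_SIGNALS 3

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int waiting;     /* waiters that reached pthread_cond_wait() */
static int woken;       /* waiters that returned from pthread_cond_wait() */

static void sleep_1ms(void)
{
    struct timespec ts = { 0, 1000000L };
    nanosleep(&ts, NULL);
}

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    waiting++;
    /* Deliberately no predicate loop: every return, spurious or not,
       is counted as a wakeup. */
    pthread_cond_wait(&cond, &mutex);
    woken++;
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Returns the number of waiters seen awake after NUM_SIGNALS signals. */
int run_scenario(int release_mutex_between_signals)
{
    pthread_t threads[NUM_WAITERS];
    int i, result = 0;

    waiting = 0;
    woken = 0;
    for (i = 0; i < NUM_WAITERS; i++) {
        int rc = pthread_create(&threads[i], NULL, waiter, NULL);
        assert(rc == 0);
        (void)rc;
    }

    /* Once we hold the mutex and see waiting == NUM_WAITERS, every waiter
       has released the mutex inside pthread_cond_wait(), so all of them
       count as blocked before any signal is sent. */
    pthread_mutex_lock(&mutex);
    while (waiting < NUM_WAITERS) {
        pthread_mutex_unlock(&mutex);
        sleep_1ms();
        pthread_mutex_lock(&mutex);
    }
    for (i = 0; i < NUM_SIGNALS; i++) {
        pthread_cond_signal(&cond);
        if (release_mutex_between_signals) {
            pthread_mutex_unlock(&mutex);
            pthread_mutex_lock(&mutex);
        }
    }
    pthread_mutex_unlock(&mutex);

    /* Poll (bounded) until at least NUM_SIGNALS waiters have woken. */
    for (i = 0; i < 5000; i++) {
        pthread_mutex_lock(&mutex);
        result = woken;
        pthread_mutex_unlock(&mutex);
        if (result >= NUM_SIGNALS)
            break;
        sleep_1ms();
    }

    /* Clean up: release the remaining waiters and join all threads. */
    pthread_mutex_lock(&mutex);
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);
    for (i = 0; i < NUM_WAITERS; i++)
        pthread_join(threads[i], NULL);
    return result;
}
```

Calling run_scenario(0) and run_scenario(1) from main() and checking that
each returns at least NUM_SIGNALS exercises both variants (build with
-pthread).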
And I think the next quote makes it very clear what threads are considered to
be blocked on the condvar at the time of the call to pthread_cond_signal():

"That is, if another thread is able to acquire the mutex after the
about-to-block thread has released it, then a subsequent call to
pthread_cond_broadcast() or pthread_cond_signal() in that thread shall behave
as if it were issued after the about-to-block thread has blocked."

In effect this means that each call to pthread_cond_signal() defines a point
in time and all waiters (or calls to pthread_cond_wait() if you prefer) are
either before this call, or after it. And only the ones that are before it
are allowed to consume the signal sent by this call.

Now, of course in a multiprocessor system it is hard to order events in time,
but that's where the mutex comes in. And if the signaling thread sends
multiple signals while holding the mutex, we can consider all these signals
to be simultaneous. But that doesn't change the validity of the test.

On the other hand, the standard doesn't guarantee that there won't be
spurious wakeups. However, glibc tries to prevent them. But the logic for
this prevention is flawed and causes the race that this bug is about.

So the net result is that glibc chose to provide a feature that is not
required, but dropped a much more important feature which is actually
required. Hence, this bug is not a fairness feature request, it is a
correctness defect report.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.