public inbox for glibc-bugs@sourceware.org
From: "mihaylov.mihail at gmail dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sources.redhat.com
Subject: [Bug nptl/13165] New: pthread_cond_wait() can consume a signal that was sent before it started waiting
Date: Wed, 07 Sep 2011 19:15:00 -0000
Message-ID: <bug-13165-131@http.sourceware.org/bugzilla/>

http://sourceware.org/bugzilla/show_bug.cgi?id=13165

             Bug #: 13165
           Summary: pthread_cond_wait() can consume a signal that was sent
                    before it started waiting
           Product: glibc
           Version: 2.14
            Status: NEW
          Severity: normal
          Priority: P2
         Component: nptl
        AssignedTo: drepper.fsp@gmail.com
        ReportedBy: mihaylov.mihail@gmail.com
    Classification: Unclassified


I was implementing something like a monitor on top of pthread condition
variables and I observed some strange behaviour. I was always holding the mutex
when calling pthread_cond_signal(). My code relied on only two assumptions
about the way pthread_cond_signal() works:

1) A call to pthread_cond_signal() will wake at least one thread which is
blocked on the condition, and the woken threads will start waiting on the
mutex.

2) If the signaling thread holds the mutex when it calls pthread_cond_signal(),
only threads which are already waiting on the condition variable may be woken.
In particular, if the signaling thread releases the mutex and then another
thread acquires the mutex and calls pthread_cond_wait(), the waiting thread
cannot be woken by this signal, no matter what other waiters are present before
or after the signal.
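
For reference, the signaling discipline described above can be sketched as
follows. This is a minimal illustration, not the reporter's code: the names
`struct monitor`, `waiter_main`, `monitor_signal` and `run_handoff` are
hypothetical, and the `ready` predicate is an assumption added so that the
example behaves the same whether or not the waiter blocks before the signal
is sent.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical monitor: one mutex, one condition variable, one predicate. */
struct monitor {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            ready;            /* predicate, protected by lock */
    bool            waiter_saw_ready; /* what the waiter observed on exit */
};

static void *waiter_main(void *arg)
{
    struct monitor *m = arg;

    pthread_mutex_lock(&m->lock);
    while (!m->ready)                 /* re-check: wakeups may be spurious */
        pthread_cond_wait(&m->cond, &m->lock);
    m->waiter_saw_ready = m->ready;
    pthread_mutex_unlock(&m->lock);
    return NULL;
}

/* The signaling thread holds the mutex while it calls pthread_cond_signal(). */
static void monitor_signal(struct monitor *m)
{
    pthread_mutex_lock(&m->lock);
    m->ready = true;
    pthread_cond_signal(&m->cond);
    pthread_mutex_unlock(&m->lock);
}

/* Starts one waiter, signals it, and reports what the waiter observed. */
bool run_handoff(void)
{
    struct monitor m;
    pthread_t t;
    int rc;

    m.ready = false;
    m.waiter_saw_ready = false;
    rc = pthread_mutex_init(&m.lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&m.cond, NULL);
    assert(rc == 0);
    rc = pthread_create(&t, NULL, waiter_main, &m);
    assert(rc == 0);
    (void)rc;

    monitor_signal(&m);
    pthread_join(t, NULL);

    pthread_cond_destroy(&m.cond);
    pthread_mutex_destroy(&m.lock);
    return m.waiter_saw_ready;
}
```

Calling `run_handoff()` from a `main` should return true on every run. Older
toolchains may need `-pthread` when compiling and linking.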

The only explanation that I could find for the observed behaviour was that my
second assumption was wrong. It seemed that I was hitting the following
scenario:

1) We have several threads which are blocked on the condvar in
pthread_cond_wait(). I'll call these threads "group A".

2) We then send N signals from another thread while holding the mutex. We are
releasing the mutex and acquiring it again between the signals.

3) Next we have several more threads (at least two) that acquire the mutex and
enter pthread_cond_wait(). I'll call these threads "group B".

4) Then we acquire the mutex in the signaling thread again and call
pthread_cond_signal() just once, then we release the mutex.

5) Two threads from group B wake up, and only N-1 threads from group A wake up.
In effect, one of the threads from group B has stolen from group A a signal
that was sent before that thread started waiting.

My expectation in this scenario is that at least N threads from group A should
wake up. I don't expect exactly one thread from group B to wake up, because
spurious wakeups are possible. But this is not a spurious wakeup - N+1 signals
were sent and N+1 threads woke, it's just that the order is wrong.

I ran some experiments and they seemed to confirm my theory, so I looked at the
condvar implementation in nptl. I'm new to POSIX and Linux programming, but I
think I see how this can happen:

1) When we send the first N signals, N threads from group A that are waiting on
the cond->__data.__futex are woken and start waiting on cond->__data.__lock.

2) Then while the threads from group B enter pthread_cond_wait, some of the
woken threads from group A may remain waiting on the lock.

3) When we send the last signal, one thread from group B will wake and consume
this signal.

4) But suppose one more thread from group B wakes spuriously from
lll_futex_wait. At this moment it is possible that some of the woken threads
from group A will still be waiting on cond->__data.__lock. In that case the
spuriously woken thread from group B will see that cond->__data.__wakeup_seq
has changed (because of the last signal) and cond->__data.__woken_seq has not
reached cond->__data.__wakeup_seq (because some of the woken threads in group A
are still waiting to acquire cond->__data.__lock), so it will exit the retry
loop and increase cond->__data.__woken_seq. The result is that the thread will
steal the signal.
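
The four steps above can be replayed with a small single-threaded model. This
is only a sketch under stated assumptions, not the glibc source: it keeps just
the two counters named above (`wakeup_seq` and `woken_seq`), takes the exit
condition of the retry loop from the description in step 4, and leaves out the
futex, the internal lock and broadcasts. All type and function names
(`cond_model`, `waiter`, `model_try_leave`, `run_scenario`) are hypothetical.
The scenario uses N = 2 with two threads in each group.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the two sequence counters; NOT the glibc code. */
struct cond_model {
    unsigned wakeup_seq; /* signals sent so far */
    unsigned woken_seq;  /* waiters that have consumed a signal */
};

struct waiter {
    unsigned seq;        /* wakeup_seq observed when the wait started */
    bool     woken;      /* true once the waiter has left the retry loop */
};

static void model_wait_enter(const struct cond_model *c, struct waiter *w)
{
    w->seq = c->wakeup_seq;
    w->woken = false;
}

static void model_signal(struct cond_model *c)
{
    c->wakeup_seq++;
}

/* One pass through the retry loop after a return from the futex wait,
   whether that return was caused by a signal or was spurious. */
static bool model_try_leave(struct cond_model *c, struct waiter *w)
{
    unsigned val = c->wakeup_seq;

    if (val == w->seq || c->woken_seq == val)
        return false;    /* nothing left to consume: wait again */
    c->woken_seq++;      /* consume one signal */
    assert(c->woken_seq <= c->wakeup_seq);
    w->woken = true;
    return true;
}

struct outcome {
    int woken_a;         /* group A threads that left the wait */
    int woken_b;         /* group B threads that left the wait */
};

struct outcome run_scenario(void)
{
    struct cond_model c = { 0, 0 };
    struct waiter a[2], b[2];
    struct outcome o = { 0, 0 };
    int i;

    for (i = 0; i < 2; i++)         /* group A starts waiting */
        model_wait_enter(&c, &a[i]);
    model_signal(&c);               /* step 1: N = 2 signals; group A is  */
    model_signal(&c);               /* woken but stalls on the inner lock */
    for (i = 0; i < 2; i++)         /* step 2: group B starts waiting */
        model_wait_enter(&c, &b[i]);
    model_signal(&c);               /* step 3: the final signal */

    o.woken_b += model_try_leave(&c, &b[0]); /* woken by the final signal */
    o.woken_b += model_try_leave(&c, &b[1]); /* step 4: spurious return */
    for (i = 0; i < 2; i++)         /* group A finally gets the inner lock */
        o.woken_a += model_try_leave(&c, &a[i]);
    return o;
}
```

In this model `run_scenario()` ends with both group B waiters released and only
one of the two group A waiters released; the second group A waiter finds
`woken_seq` equal to `wakeup_seq` and goes back to waiting, which matches the
N-1 outcome described above.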

Is this scenario really possible? And if it is, is this on purpose or is it a
bug?



