public inbox for glibc-bugs@sourceware.org
* [Bug nptl/16892] New: Invalid futex demotion in __lll_timedlock
@ 2014-05-01 13:00 bernie.ogden at linaro dot org
From: bernie.ogden at linaro dot org @ 2014-05-01 13:00 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=16892

            Bug ID: 16892
           Summary: Invalid futex demotion in __lll_timedlock
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: nptl
          Assignee: unassigned at sourceware dot org
          Reporter: bernie.ogden at linaro dot org
                CC: drepper.fsp at gmail dot com

The following description is drawn from Carlos O'Donell's analysis at
https://sourceware.org/ml/libc-ports/2013-02/msg00021.html.

Some platforms (m68k, aarch64, arm, sh/sh4) have an implementation of
__lll_timedlock that unconditionally sets the futex to 1 without first checking
that it is 0. This allows the futex to move from 2 (locked with waiters) to 1
(locked with no waiters) on these platforms.

This does not create a correctness problem, but it does create a pair of
performance problems.

1) Up to N threads can fail to sleep when they ought to have done, where N is
the number of threads expecting futex==2. For example:

* T1 calls __lll_timedlock setting futex to 1 and taking the lock.
* T2 calls __lll_timedlock setting futex to 1 but does not take the lock.
* T2 calls __lll_timedlock_wait and sets the futex to 2 and does not
gain the lock.
* T3 calls __lll_timedlock setting futex to 1 but does not take the lock.
* T2 calls lll_futex_timed_wait but fails with -EWOULDBLOCK because T3 reset
futex to 1.
-> One inflight thread (T2), and one spurious failed futex wait syscall.
* T2 again sets the futex to 2 and does not gain the lock.
* ... T2 and T3 go on to call futex wait syscall and both sleep.
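
The interleaving above can be replayed deterministically in a single thread.
The following is only a sketch: model_futex_wait is a hypothetical stand-in
for the kernel's FUTEX_WAIT value check, replay_demotion is an illustrative
name, and C11 atomics are used in place of glibc's internal macros.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical model of the FUTEX_WAIT value check: the wait only
   proceeds if the futex word still holds the expected value.  */
static int model_futex_wait(atomic_int *futex, int expected)
{
    return atomic_load(futex) == expected ? 0 : -EWOULDBLOCK;
}

/* Replays the T1/T2/T3 sequence with the unconditional-exchange fast
   path and returns the result of T2's first wait attempt.  */
static int replay_demotion(void)
{
    atomic_int futex = 0;

    atomic_exchange(&futex, 1);  /* T1 fast path: 0 -> 1, T1 holds the lock */
    atomic_exchange(&futex, 1);  /* T2 fast path: old value 1, lock not taken */
    atomic_exchange(&futex, 2);  /* T2 slow path: mark "locked with waiters" */
    atomic_exchange(&futex, 1);  /* T3 fast path: demotes 2 -> 1 */

    /* T2 expects 2 but the word is now 1, so the wait fails.  */
    return model_futex_wait(&futex, 2);
}
```

In this model T2's wait is rejected purely because of T3's fast-path store,
which corresponds to the spurious failed syscall noted above.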

2) __lll_unlock only wakes if futex was > 1 prior to release. Thus it can
happen that __lll_timedlock keeps setting futex from 2 to 1 just prior to
__lll_unlock calls, preventing waiters from being awoken. This certainly
affects m68k, arm and aarch64 - sh may also be affected but it's a little
harder to tell as it's written in asm.
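
A minimal sketch of the missed wake-up, assuming the unlock path behaves as
described (release the word, wake only if the old value was > 1).
model_unlock_wakes and demoted_unlock_wakes are illustrative names, not glibc
functions, and the real code would issue a futex wake syscall rather than
return a flag.

```c
#include <stdatomic.h>

/* Models the unlock decision: store 0 and report whether a wake
   would be issued, i.e. whether the old value was > 1.  */
static int model_unlock_wakes(atomic_int *futex)
{
    int old = atomic_exchange_explicit(futex, 0, memory_order_release);
    return old > 1;
}

/* A waiter has set the word to 2; a later fast-path exchange demotes
   it to 1 just before the owner unlocks.  Returns 1 if the unlock
   would wake a waiter, 0 if the wake is skipped.  */
static int demoted_unlock_wakes(void)
{
    atomic_int futex = 2;          /* locked, with a sleeping waiter */
    atomic_exchange(&futex, 1);    /* another thread's fast path: 2 -> 1 */
    return model_unlock_wakes(&futex);
}
```

Here demoted_unlock_wakes returns 0: the unlock sees 1 and skips the wake even
though a waiter is asleep on the futex.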

In both cases, the solution is simply to do an atomic_compare_and_exchange_acq
(as the unaffected platforms already do), rather than an atomic_exchange_acq,
so that __lll_timedlock does not change futex from 2 to 1. It's easy to apply
this fix to at least the targets that are implemented in C. Better still would
be to combine as many as possible of the lowlevellock.h implementations into a
generic implementation that behaves in this way.
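
To illustrate the difference between the two fast paths, here is a sketch
using C11 atomics rather than glibc's internal atomic macros. The function and
enum names are made up for this example; only the exchange versus
compare-and-exchange behaviour is the point.

```c
#include <stdatomic.h>

/* Futex word states as described in this report.  */
enum { UNLOCKED = 0, LOCKED = 1, LOCKED_WAITERS = 2 };

/* Affected fast path: unconditional exchange.  Returns the old value
   (0 means the lock was taken).  Overwrites 2 with 1.  */
static int fastpath_exchange(atomic_int *futex)
{
    return atomic_exchange_explicit(futex, LOCKED, memory_order_acquire);
}

/* Proposed fast path: store 1 only if the word is currently 0.
   Returns the old value; a word holding 2 is left untouched.  */
static int fastpath_cas(atomic_int *futex)
{
    int expected = UNLOCKED;
    atomic_compare_exchange_strong_explicit(futex, &expected, LOCKED,
                                            memory_order_acquire,
                                            memory_order_relaxed);
    /* On failure 'expected' holds the observed value; on success it
       is still 0.  Either way it is the old value of the word.  */
    return expected;
}
```

With the word at 2, fastpath_exchange returns 2 and leaves 1 behind, while
fastpath_cas returns 2 and leaves the word at 2, so the caller still falls
through to the slow path but waiters are not hidden from the unlocker.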


Thread overview: 8+ messages
2014-05-01 13:00 [Bug nptl/16892] New: Invalid futex demotion in __lll_timedlock bernie.ogden at linaro dot org
2014-05-01 13:00 ` [Bug nptl/16892] " bernie.ogden at linaro dot org
2014-05-01 13:01 ` bernie.ogden at linaro dot org
2014-06-12 19:29 ` fweimer at redhat dot com
2014-08-12 11:58 ` cvs-commit at gcc dot gnu.org
2014-08-12 12:02 ` cvs-commit at gcc dot gnu.org
2014-08-12 12:03 ` will.newton at gmail dot com
2014-09-02 10:16 ` bernie.ogden at linaro dot org
