public inbox for glibc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug nptl/18463] New: pthread_cond_broadcast bug when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
@ 2015-05-29 11:09 mpa at pengutronix dot de
  2015-05-29 11:12 ` [Bug nptl/18463] pthread_cond_broadcast issue " mpa at pengutronix dot de
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: mpa at pengutronix dot de @ 2015-05-29 11:09 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18463

            Bug ID: 18463
           Summary: pthread_cond_broadcast bug when surrounded by
                    PTHREAD_PRIO_INHERIT mutex on ARM
           Product: glibc
           Version: 2.20
            Status: NEW
          Severity: normal
          Priority: P2
         Component: nptl
          Assignee: unassigned at sourceware dot org
          Reporter: mpa at pengutronix dot de
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

Created attachment 8335
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8335&action=edit
test program that triggers this issue

The attached program uses pthread_cond_broadcast() to wake up two threads
waiting on a condition variable which is surrounded by a mutex with
PTHREAD_PRIO_INHERIT set. The pthread_mutex_unlock() after
pthread_cond_broadcast() returns EPERM in the main thread.

This does not happen when using pthread_cond_signal(). Also it does not happen
when using just one thread that is waiting or when PTHREAD_PRIO_INHERIT is not
set.

This issue is reproducable with the attached program on the following setups:
 - Kernel 4.0 with OSELAS Toolchain 2014.12.0 for cortex a8, glibc-2.20
 - Kernel 4.0 with OSELAS Toolchain 2014.12.0 for armv7a without NEON,
glibc-2.20
 - Kernel 4.0.0-1-armmp with native Debian experimental toolchain for armhf,
glibc-2.21, gcc-5
 - Kernel 3.18.0-trunk-kirkwood native Debian experimental toolchain for armel,
glibc-2.21, gcc-5

This issue does not exist on:
 - Kernel 3.16 with Debian toolchain for x86_64 , glibc-2.19, gcc-4.9.2
 - Kernel 4.0.4 (vanilla, not debian) with Debian experimental toolchain for
x86_64, glibc-2.21, gcc-5
 - Kernel 4.0.0-1-armmp with native Debian testing toolchain for armhf,
glibc-2.19, gcc-4.9.2
 - Kernel 4.0.0-1-armmp crosscompile Debian unstable toolchain for armhf,
glibc-2.19, gcc-4.9.2
 - Kernel 3.18.0-trunk-kirkwood native Debian testing toolchain for armel,
glibc-2.19, gcc-4.9.2

I found bug 17013 which seems related. But 17013 is marked as fixed in
glibc-2.20 which is where the issue described above was introduced.

We tried reverting
        8f630cca5c36941db1cb48726016bbed80ec1041 Fix lll_unlock twice in
pthread_cond_broadcast
it does not solve the issue.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug nptl/18463] pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
  2015-05-29 11:09 [Bug nptl/18463] New: pthread_cond_broadcast bug when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM mpa at pengutronix dot de
@ 2015-05-29 11:12 ` mpa at pengutronix dot de
  2015-05-29 11:20 ` mkl at pengutronix dot de
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: mpa at pengutronix dot de @ 2015-05-29 11:12 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18463

mpa at pengutronix dot de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|pthread_cond_broadcast bug  |pthread_cond_broadcast
                   |when surrounded by          |issue when surrounded by
                   |PTHREAD_PRIO_INHERIT mutex  |PTHREAD_PRIO_INHERIT mutex
                   |on ARM                      |on ARM

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug nptl/18463] pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
  2015-05-29 11:09 [Bug nptl/18463] New: pthread_cond_broadcast bug when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM mpa at pengutronix dot de
  2015-05-29 11:12 ` [Bug nptl/18463] pthread_cond_broadcast issue " mpa at pengutronix dot de
@ 2015-05-29 11:20 ` mkl at pengutronix dot de
  2015-05-29 12:04 ` r.schwebel at pengutronix dot de
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: mkl at pengutronix dot de @ 2015-05-29 11:20 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18463

Marc Kleine-Budde <mkl at pengutronix dot de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mkl at pengutronix dot de

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug nptl/18463] pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
  2015-05-29 11:09 [Bug nptl/18463] New: pthread_cond_broadcast bug when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM mpa at pengutronix dot de
  2015-05-29 11:12 ` [Bug nptl/18463] pthread_cond_broadcast issue " mpa at pengutronix dot de
  2015-05-29 11:20 ` mkl at pengutronix dot de
@ 2015-05-29 12:04 ` r.schwebel at pengutronix dot de
  2015-06-13 17:06 ` mkl at pengutronix dot de
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: r.schwebel at pengutronix dot de @ 2015-05-29 12:04 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18463

Robert Schwebel <r.schwebel at pengutronix dot de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |r.schwebel at pengutronix dot de

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug nptl/18463] pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
  2015-05-29 11:09 [Bug nptl/18463] New: pthread_cond_broadcast bug when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM mpa at pengutronix dot de
                   ` (2 preceding siblings ...)
  2015-05-29 12:04 ` r.schwebel at pengutronix dot de
@ 2015-06-13 17:06 ` mkl at pengutronix dot de
  2015-06-13 22:09 ` mkl at pengutronix dot de
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: mkl at pengutronix dot de @ 2015-06-13 17:06 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18463

--- Comment #1 from Marc Kleine-Budde <mkl at pengutronix dot de> ---
I bisected the problem to the following comit:

    47c5adebd2c8 Correct robust mutex / PI futex kernel assumptions (bug 9894).

Applying this HACK to glibc-2.20 works around to problem:

index e755741de60b..a56290ce4ca4 100644
--- a/sysdeps/unix/sysv/linux/arm/kernel-features.h
+++ b/sysdeps/unix/sysv/linux/arm/kernel-features.h
@@ -37,6 +37,6 @@
 /* The ARM kernel may or may not support
    futex_atomic_cmpxchg_inatomic, depending on kernel
    configuration.  */
-#undef __ASSUME_FUTEX_LOCK_PI
-#undef __ASSUME_REQUEUE_PI
-#undef __ASSUME_SET_ROBUST_LIST
+//#undef __ASSUME_FUTEX_LOCK_PI
+//#undef __ASSUME_REQUEUE_PI
+//#undef __ASSUME_SET_ROBUST_LIST

The glibc-2.20 is build with "--enable-kernel=3.0" using linux-3.16 header
generated by "make headers_install".

Marc

-------->8-------->8-------->8-------->8-------->8-------->8--------

47c5adebd2c864a098c3af66e61e1147dc3cf0b4 is the first bad commit
commit 47c5adebd2c864a098c3af66e61e1147dc3cf0b4
Author: Joseph Myers <joseph@codesourcery.com>
Date:   Mon Mar 31 12:55:18 2014 +0000

    Correct robust mutex / PI futex kernel assumptions (bug 9894).

    This patch continues fixing __ASSUME_* issues in preparation for
    moving to a 2.6.32 minimum kernel version by addressing assumptions on
    robust mutex and PI futex support availability.  Those assumptions are
    bug 9894, but to be clear this patch does not address all the issues
    from that bug about wrong version assumptions, only those still
    applicable for --enable-kernel=2.6.32 or later (with the expectation
    that the move to that minimum kernel will obsolete the other parts of
    the bug).  The patch is independent of
    <https://sourceware.org/ml/libc-alpha/2014-03/msg00585.html>, my other
    pending-review patch preparing for the kernel version change; the two
    together complete all the changes I believe are needed in preparation
    regarding any macro in sysdeps/unix/sysv/linux/kernel-features.h that
    would be affected by such a change.  (I have not checked the
    correctness of macros whose conditions are unaffected by such a
    change, or macros only defined in other kernel-features.h files.)

    As discussed in that bug, robust mutexes and PI futexes need
    futex_atomic_cmpxchg_inatomic to be implemented, in addition to
    certain syscalls needed for robust mutexes (and
    architecture-independent kernel pieces for all the features in
    question).  That is, as I understand it, they need
    futex_atomic_cmpxchg_inatomic to *work* (not return an ENOSYS error).

    The issues identified in my analysis relate to ARM, M68K, MicroBlaze,
    MIPS and SPARC.

    On ARM, whether futex_atomic_cmpxchg_inatomic works depends on the
    kernel configuration.  As of 3.13, the condition for *not* working is
    CONFIG_CPU_USE_DOMAINS && CONFIG_SMP.  As of 2.6.32 it was simply
    CONFIG_SMP that meant the feature was not implemented.  I don't know
    if there are any circumstances in which we can say "we can assume a
    userspace glibc binary built with these options will never run on a
    kernel with the problematic configuration", but at least for now I'm
    just undefining the relevant __ASSUME_* macros for ARM.

    On M68K, two of the three macros are undefined for kernels before
    3.10, but as far as I can see __ASSUME_FUTEX_LOCK_PI is in the same
    group needing futex_atomic_cmpxchg_inatomic support and so should be
    undefined as well.

    On MicroBlaze the required support was added in 2.6.33.

    On MIPS, the support depends on cpu_has_llsc in the kernel - that is,
    actual hardware LL/SC support (GCC and glibc for MIPS GNU/Linux rely
    on the instructions being supported in some way, but it may be kernel
    emulation; futex_atomic_cmpxchg_inatomic doesn't work with that
    emulation).  The same condition as in GCC for indicating LL/SC support
    may not be available is used for undefining the macros in glibc,
    __mips == 1 || defined _MIPS_ARCH_R5900.  (Maybe we could in fact
    desupport MIPS processors without the hardware support in glibc.)

    On SPARC, 32-bit kernels don't support futex_atomic_cmpxchg_inatomic;
    __arch64__ || __sparc_v9__ is used as the condition for binaries that
    won't run on 32-bit kernels.

    This patch is not tested beyond the sanity check of an x86_64 build.
        [BZ #9894]
        * sysdeps/unix/sysv/linux/kernel-features.h
        [__sparc__ && !__arch64__ && !__sparc_v9__]
        (__ASSUME_SET_ROBUST_LIST): Do not define.
        [__sparc__ && !__arch64__ && !__sparc_v9__]
        (__ASSUME_FUTEX_LOCK_PI): Likewise.
        [__sparc__ && !__arch64__ && !__sparc_v9__] (__ASSUME_REQUEUE_PI):
        Likewise.
        * sysdeps/unix/sysv/linux/arm/kernel-features.h
        (__ASSUME_FUTEX_LOCK_PI): Undefine.
        (__ASSUME_REQUEUE_PI): Likewise.
        (__ASSUME_SET_ROBUST_LIST): Likewise.
        * sysdeps/unix/sysv/linux/m68k/kernel-features.h
        [__LINUX_KERNEL_VERSION < 0x030a00] (__ASSUME_FUTEX_LOCK_PI):
        Undefine.
        * sysdeps/unix/sysv/linux/microblaze/kernel-features.h
        [__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_FUTEX_LOCK_PI):
        Likewise.
        [__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_REQUEUE_PI):
        Likewise.
        [__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_SET_ROBUST_LIST):
        Likewise.
        * sysdeps/unix/sysv/linux/mips/kernel-features.h
        [__mips == 1 || _MIPS_ARCH_R5900] (__ASSUME_FUTEX_LOCK_PI):
        Undefine.
        [__mips == 1 || _MIPS_ARCH_R5900] (__ASSUME_REQUEUE_PI): Likewise.
        [__mips == 1 || _MIPS_ARCH_R5900] (__ASSUME_SET_ROBUST_LIST):
        Likewise.

:100644 100644 f33dd23ea0b30c71c4b96993b011a0ac02f174cf
ee4ede675ac9f30aeda61917312114242f174c9e M      ChangeLog
:040000 040000 08bd195d28b8e094591eafc08616c99a707eef25
030ca0573138769c7de36870b31c431a925cc7c3 M      sysdeps

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug nptl/18463] pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
  2015-05-29 11:09 [Bug nptl/18463] New: pthread_cond_broadcast bug when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM mpa at pengutronix dot de
                   ` (3 preceding siblings ...)
  2015-06-13 17:06 ` mkl at pengutronix dot de
@ 2015-06-13 22:09 ` mkl at pengutronix dot de
  2015-06-14 14:07 ` joseph at codesourcery dot com
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: mkl at pengutronix dot de @ 2015-06-13 22:09 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18463

--- Comment #2 from Marc Kleine-Budde <mkl at pengutronix dot de> ---
Just checked, the problem exists on current glibc master (3164bf09f529 Fix
syslog fputs_unlocked namespace (bug 18530).), too.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug nptl/18463] pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
  2015-05-29 11:09 [Bug nptl/18463] New: pthread_cond_broadcast bug when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM mpa at pengutronix dot de
                   ` (4 preceding siblings ...)
  2015-06-13 22:09 ` mkl at pengutronix dot de
@ 2015-06-14 14:07 ` joseph at codesourcery dot com
  2015-06-14 15:30 ` mkl at pengutronix dot de
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: joseph at codesourcery dot com @ 2015-06-14 14:07 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18463

--- Comment #3 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
Since pthread_cond_* are being rewritten, it might be worth seeing if 
Torvald's new implementation 
<https://sourceware.org/ml/libc-alpha/2015-02/msg00578.html> helps.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug nptl/18463] pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
  2015-05-29 11:09 [Bug nptl/18463] New: pthread_cond_broadcast bug when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM mpa at pengutronix dot de
                   ` (5 preceding siblings ...)
  2015-06-14 14:07 ` joseph at codesourcery dot com
@ 2015-06-14 15:30 ` mkl at pengutronix dot de
  2015-06-14 16:31 ` mkl at pengutronix dot de
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: mkl at pengutronix dot de @ 2015-06-14 15:30 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18463

--- Comment #4 from Marc Kleine-Budde <mkl at pengutronix dot de> ---
I dug a bit deeper:

Due to commit:

    03d41216fe09 arm: Re-enable PI futex support for ARM kernels >= 3.14.3

(a.k.a. glibc-2.21~456) and compiling glibc with --enable-kernel=3.15 it PI
futex support is working again.

This means PI futex support is deliberately broken^wdisabled for _all_ ARMs on
glibc-2.20.

Marc

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug nptl/18463] pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
  2015-05-29 11:09 [Bug nptl/18463] New: pthread_cond_broadcast bug when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM mpa at pengutronix dot de
                   ` (6 preceding siblings ...)
  2015-06-14 15:30 ` mkl at pengutronix dot de
@ 2015-06-14 16:31 ` mkl at pengutronix dot de
  2015-06-20 16:12 ` mkl at pengutronix dot de
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: mkl at pengutronix dot de @ 2015-06-14 16:31 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18463

--- Comment #5 from Marc Kleine-Budde <mkl at pengutronix dot de> ---
(In reply to joseph@codesourcery.com from comment #3)
> Since pthread_cond_* are being rewritten, it might be worth seeing if 
> Torvald's new implementation 
> <https://sourceware.org/ml/libc-alpha/2015-02/msg00578.html> helps.

With Torvald Riegel's patch ported to current master, the test case works for
both pre and post 3.14.3 kernel versions supplied to --enable-kernel=. I've
tested --enable-kernel=3.0 and --enable-kernel=3.15.

This makes sense, as AFAICS all the __ASSUME defines in question[1] have been
removed from the pthread_cond_ code.

Marc

[1] __ASSUME_FUTEX_LOCK_PI, __ASSUME_REQUEUE_PI, __ASSUME_SET_ROBUST_LIST

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug nptl/18463] pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
  2015-05-29 11:09 [Bug nptl/18463] New: pthread_cond_broadcast bug when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM mpa at pengutronix dot de
                   ` (7 preceding siblings ...)
  2015-06-14 16:31 ` mkl at pengutronix dot de
@ 2015-06-20 16:12 ` mkl at pengutronix dot de
  2015-06-26 16:17 ` mkl at pengutronix dot de
  2015-07-17 10:17 ` mkl at pengutronix dot de
  10 siblings, 0 replies; 12+ messages in thread
From: mkl at pengutronix dot de @ 2015-06-20 16:12 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18463

--- Comment #6 from Marc Kleine-Budde <mkl at pengutronix dot de> ---
We had another, deeper look at the glibc and kernel code.

For PI and robust mutexes there is runtime detection, but no
fallback code. But that's not possible without kernel support. Instead a
proper error code is returned:

For example PI mutex:

> int
> __pthread_mutex_init (mutex, mutexattr)
>      pthread_mutex_t *mutex;
>      const pthread_mutexattr_t *mutexattr;
> {
...
>     case PTHREAD_PRIO_INHERIT << PTHREAD_MUTEXATTR_PROTOCOL_SHIFT:
>       if (__glibc_unlikely (prio_inherit_missing ()))
>         return ENOTSUP;
>       break;

and prio_inherit_missing() is:

> static bool
> prio_inherit_missing (void)
> {
> #ifdef __NR_futex
> # ifndef __ASSUME_FUTEX_LOCK_PI
>   static int tpi_supported;
>   if (__glibc_unlikely (tpi_supported == 0))
>     {
>       int lock = 0;
>       INTERNAL_SYSCALL_DECL (err);
>       int ret = INTERNAL_SYSCALL (futex, err, 4, &lock, FUTEX_UNLOCK_PI, 0, 0);

This is the runtime detection, if the syscall return ENOSYS, then the kernel
has no support for PI mutexes.

>       assert (INTERNAL_SYSCALL_ERROR_P (ret, err));
>       tpi_supported = INTERNAL_SYSCALL_ERRNO (ret, err) == ENOSYS ? -1 : 1;
>     }
>   return __glibc_unlikely (tpi_supported < 0);
> # endif
>   return false;
> #endif
>   return true;
> }

So far so good. We have runtime detection for FUTEX_LOCK_PI and
SET_ROBUST_LIST (implemented likewise), but the problem is REQUEUE_PI.

The way PI mutexes are implemented you _have_ to use REQUEUE_PI, quoting
the Kernel's ./Documentation/futex-requeue-pi.txt:

> In order to ensure the rt_mutex has an owner if it has waiters, it
> is necessary for both the requeue code, as well as the waiting code,
> to be able to acquire the rt_mutex before returning to user space.
> The requeue code cannot simply wake the waiter and leave it to
> acquire the rt_mutex as it would open a race window between the
                                          ^^^^^^^^^^^
> requeue call returning to user space and the waiter waking and
> starting to run.  This is especially true in the uncontended case.
> 
> The solution involves two new rt_mutex helper routines,
> rt_mutex_start_proxy_lock() and rt_mutex_finish_proxy_lock(), which
> allow the requeue code to acquire an uncontended rt_mutex on behalf
> of the waiter and to enqueue the waiter on a contended rt_mutex.
> Two new system calls provide the kernel<->user interface to
> requeue_pi: FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI.
              ^^^^^^^^^^^^^^^^^^^^^
> 
> FUTEX_WAIT_REQUEUE_PI is called by the waiter (pthread_cond_wait()
  ^^^^^^^^^^^^^^^^^^^^^
> and pthread_cond_timedwait()) to block on the initial futex and wait
> to be requeued to a PI-aware futex.  The implementation is the
> result of a high-speed collision between futex_wait() and
> futex_lock_pi(), with some extra logic to check for the additional
> wake-up scenarios.

Looking at pthread_cond_wait:

> int
> __pthread_cond_wait (pthread_cond_t *cond, pthread_mutex_t *mutex)
> {
[...]
> #if (defined lll_futex_wait_requeue_pi \
>      && defined __ASSUME_REQUEUE_PI)
                  ^^^^^^^^^^^^^^^^^^^
>       /* If pi_flag remained 1 then it means that we had the lock and the mutex                                                                                                                                    
>          but a spurious waker raced ahead of us.  Give back the mutex before                                                                                                                                       
>          going into wait again.  */
>       if (pi_flag)
>         {
>           __pthread_mutex_cond_lock_adjust (mutex);
>           __pthread_mutex_unlock_usercnt (mutex, 0);
>         }
>       pi_flag = USE_REQUEUE_PI (mutex);
> 
>       if (pi_flag)
>         {
>           err = lll_futex_wait_requeue_pi (&cond->__data.__futex,
>                                            futex_val, &mutex->__data.__lock,
>                                            pshared);

Here's the requeue_pi syscall, but no runtime check for it. See
__ASSUME_REQUEUE_PI above. If __ASSUME_REQUEUE_PI is not present this
code is not compiled in, resulting in the breakage we're seeing.

> 
>           pi_flag = (err == 0);
>         }
>       else
> #endif
>           /* Wait until woken by signal or broadcast.  */
>         lll_futex_wait (&cond->__data.__futex, futex_val, pshared);


Looking a the kernel's kernel/futex.c:

> long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
>                 u32 __user *uaddr2, u32 val2, u32 val3)
> {
...
>         switch (cmd) {
>         case FUTEX_LOCK_PI:
>         case FUTEX_UNLOCK_PI:
>         case FUTEX_TRYLOCK_PI:
>         case FUTEX_WAIT_REQUEUE_PI:
>         case FUTEX_CMP_REQUEUE_PI:
>                 if (!futex_cmpxchg_enabled)
>                         return -ENOSYS;
>         }

This is the corresponding code for the runtime checks in the glibc. The
runtime check for:

>         case FUTEX_WAIT_REQUEUE_PI:
>         case FUTEX_CMP_REQUEUE_PI:

has been added in v3.4-rc1~195^2~1. The glibc however only uses
lll_futex_wait_requeue_pi on PI mutexes, which are runtime checked
during pthread_mutex_init. So from my point of view the correct fix in
the glibc is to get rid of __ASSUME_REQUEUE_PI completely - or - for ARM
remove the undef __ASSUME_REQUEUE_PI.


=======================================

diff --git a/sysdeps/unix/sysv/linux/arm/kernel-features.h
b/sysdeps/unix/sysv/linux/arm/kernel-features.h
index cb407dbc300c..65046b16974b 100644

--- a/sysdeps/unix/sysv/linux/arm/kernel-features.h

+++ b/sysdeps/unix/sysv/linux/arm/kernel-features.h

@@ -39,6 +39,6 @@

    configuration.  */ 
 #if __LINUX_KERNEL_VERSION < 0x030E03 
 # undef __ASSUME_FUTEX_LOCK_PI 
-# undef __ASSUME_REQUEUE_PI 
+//# undef __ASSUME_REQUEUE_PI 
 # undef __ASSUME_SET_ROBUST_LIST 
 #endif 

Marc

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug nptl/18463] pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
  2015-05-29 11:09 [Bug nptl/18463] New: pthread_cond_broadcast bug when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM mpa at pengutronix dot de
                   ` (8 preceding siblings ...)
  2015-06-20 16:12 ` mkl at pengutronix dot de
@ 2015-06-26 16:17 ` mkl at pengutronix dot de
  2015-07-17 10:17 ` mkl at pengutronix dot de
  10 siblings, 0 replies; 12+ messages in thread
From: mkl at pengutronix dot de @ 2015-06-26 16:17 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18463

--- Comment #7 from Marc Kleine-Budde <mkl at pengutronix dot de> ---
Created attachment 8400
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8400&action=edit
Proposed fix for ARM

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug nptl/18463] pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM
  2015-05-29 11:09 [Bug nptl/18463] New: pthread_cond_broadcast bug when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM mpa at pengutronix dot de
                   ` (9 preceding siblings ...)
  2015-06-26 16:17 ` mkl at pengutronix dot de
@ 2015-07-17 10:17 ` mkl at pengutronix dot de
  10 siblings, 0 replies; 12+ messages in thread
From: mkl at pengutronix dot de @ 2015-07-17 10:17 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18463

--- Comment #8 from Marc Kleine-Budde <mkl at pengutronix dot de> ---
On 06/14/2015 04:07 PM, joseph at codesourcery dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=18463
> 
> --- Comment #3 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
> Since pthread_cond_* are being rewritten, it might be worth seeing if 
> Torvald's new implementation 
> <https://sourceware.org/ml/libc-alpha/2015-02/msg00578.html> helps.

I've created a detailed analysis of the problem, created a patch and
attached it to the bugreport. Can someone have a look at the issue.

Marc

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2015-07-17 10:17 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-29 11:09 [Bug nptl/18463] New: pthread_cond_broadcast bug when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM mpa at pengutronix dot de
2015-05-29 11:12 ` [Bug nptl/18463] pthread_cond_broadcast issue " mpa at pengutronix dot de
2015-05-29 11:20 ` mkl at pengutronix dot de
2015-05-29 12:04 ` r.schwebel at pengutronix dot de
2015-06-13 17:06 ` mkl at pengutronix dot de
2015-06-13 22:09 ` mkl at pengutronix dot de
2015-06-14 14:07 ` joseph at codesourcery dot com
2015-06-14 15:30 ` mkl at pengutronix dot de
2015-06-14 16:31 ` mkl at pengutronix dot de
2015-06-20 16:12 ` mkl at pengutronix dot de
2015-06-26 16:17 ` mkl at pengutronix dot de
2015-07-17 10:17 ` mkl at pengutronix dot de

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).