public inbox for glibc-bugs@sourceware.org
* [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access
@ 2012-02-14 14:28 anemo at mba dot ocn.ne.jp
  2012-02-14 14:29 ` [Bug nptl/13690] " anemo at mba dot ocn.ne.jp
                   ` (63 more replies)
  0 siblings, 64 replies; 65+ messages in thread
From: anemo at mba dot ocn.ne.jp @ 2012-02-14 14:28 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

             Bug #: 13690
           Summary: pthread_mutex_unlock potentially cause invalid access
           Product: glibc
           Version: 2.15
            Status: NEW
          Severity: normal
          Priority: P2
         Component: nptl
        AssignedTo: drepper.fsp@gmail.com
        ReportedBy: anemo@mba.ocn.ne.jp
    Classification: Unclassified


It seems pthread_mutex_unlock() can potentially cause an invalid access
on most platforms (except i386 and x86_64).

In nptl/pthread_mutex_unlock.c, lll_unlock() is called like this:
      lll_unlock (mutex->__data.__lock, PTHREAD_MUTEX_PSHARED (mutex));

And PTHREAD_MUTEX_PSHARED() is defined like this:
# define PTHREAD_MUTEX_PSHARED(m) \
  ((m)->__data.__kind & 128)

On most platforms, lll_unlock() is defined as a macro like this:
#define lll_unlock(lock, private) \
  ((void) ({                              \
    int *__futex = &(lock);                      \
    int __val = atomic_exchange_rel (__futex, 0);          \
    if (__builtin_expect (__val > 1, 0))              \
      lll_futex_wake (__futex, 1, private);              \
  }))

Thus, the lll_unlock() call in pthread_mutex_unlock.c will be expanded as:
    int *__futex = &(mutex->__data.__lock);
    int __val = atomic_exchange_rel (__futex, 0);
    if (__builtin_expect (__val > 1, 0))        /* A */
      lll_futex_wake (__futex, 1, ((mutex)->__data.__kind & 128)); /* B */

At point "A" the mutex is already unlocked, so other threads can lock
it, unlock it, destroy it and free it.  If the mutex has been destroyed
and freed by another thread, reading '__kind' at point "B" is not valid.

This can happen with the following example from the pthread_mutex_destroy
manual page.

http://pubs.opengroup.org/onlinepubs/007904875/functions/pthread_mutex_destroy.html
------------------------------------------------------------------------
    Destroying Mutexes

    A mutex can be destroyed immediately after it is unlocked. For
    example, consider the following code:

    struct obj {
    pthread_mutex_t om;
    int refcnt;
    ...
    };

    obj_done(struct obj *op)
    {
    pthread_mutex_lock(&op->om);
    if (--op->refcnt == 0) {
        pthread_mutex_unlock(&op->om);
    (A)     pthread_mutex_destroy(&op->om);
    (B)     free(op);
    } else
    (C)     pthread_mutex_unlock(&op->om);
    }

    In this case obj is reference counted and obj_done() is called
    whenever a reference to the object is dropped. Implementations are
    required to allow an object to be destroyed and freed and potentially
    unmapped (for example, lines A and B) immediately after the object is
    unlocked (line C).
------------------------------------------------------------------------

In this example, (A) and (B) can be executed in the middle of the
execution of (C).

It can happen in this way (explanation by KOSAKI-san):
1) Thread-1: atomic_exchange_rel(0)
2) preempt
3) Thread-2: calls mutex_lock() (it succeeds)
4) Thread-2: calls mutex_unlock()
5) Thread-2: calls mutex_destroy()
6) Thread-2: free(mutex)
7) preempt
8) Thread-3: reuses the memory of the mutex
9) preempt
10) Thread-1: dereferences (mutex)->__data.__kind

Copying __kind to a local variable before atomic_exchange_rel
will fix this.
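
The ordering problem and the suggested fix can be sketched in isolation.
What follows is a minimal single-threaded sketch, not glibc code.  The
names toy_mutex, simulate_reuse, unlock_racy, unlock_fixed and
last_wake_flag are hypothetical, C11 atomic_exchange_explicit stands in
for atomic_exchange_rel, and the "other threads" of steps 3 to 8 are
simulated by a plain function call that overwrites the field.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the two pthread_mutex_t fields involved. */
struct toy_mutex {
    atomic_int lock;  /* 0 unlocked, 1 locked, 2 locked with waiters */
    int kind;         /* bit 128 set means process-shared */
};

/* Records the flag that would be handed to the futex wake call. */
static int last_wake_flag = -1;

/* Simulates steps 3-8: once the lock word is released, other threads
   lock, unlock, destroy and free the mutex, and its storage is reused
   for unrelated data.  Here the storage is only overwritten, never
   actually freed, so the sketch itself stays well defined. */
static void simulate_reuse(struct toy_mutex *m)
{
    m->kind = 0;
}

/* Order produced by the macro expansion: kind is read after release. */
static void unlock_racy(struct toy_mutex *m)
{
    int val = atomic_exchange_explicit(&m->lock, 0, memory_order_release);
    assert(val > 0);                     /* caller must hold the lock */
    simulate_reuse(m);
    if (val > 1)
        last_wake_flag = m->kind & 128;  /* stale read of reused memory */
}

/* Suggested order: copy the flag while the lock is still held. */
static void unlock_fixed(struct toy_mutex *m)
{
    int private_flag = m->kind & 128;    /* read before the release */
    int val = atomic_exchange_explicit(&m->lock, 0, memory_order_release);
    assert(val > 0);                     /* caller must hold the lock */
    simulate_reuse(m);
    if (val > 1)
        last_wake_flag = private_flag;   /* m is not dereferenced again */
}
```

With a mutex initialised as { 2, 128 }, unlock_racy leaves
last_wake_flag at 0 (the wrong flag), while unlock_fixed leaves it at
128.  The sketch only covers the read of the kind field; the address of
the lock word is still needed for the wake call in both variants.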

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.



* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
@ 2012-02-14 14:29 ` anemo at mba dot ocn.ne.jp
  2012-02-14 15:39 ` carlos at systemhalted dot org
                   ` (62 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: anemo at mba dot ocn.ne.jp @ 2012-02-14 14:29 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

Atsushi Nemoto <anemo at mba dot ocn.ne.jp> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |glibc_2.15


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
  2012-02-14 14:29 ` [Bug nptl/13690] " anemo at mba dot ocn.ne.jp
@ 2012-02-14 15:39 ` carlos at systemhalted dot org
  2012-02-14 15:41 ` carlos at systemhalted dot org
                   ` (61 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at systemhalted dot org @ 2012-02-14 15:39 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

Carlos O'Donell <carlos at systemhalted dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carlos at systemhalted dot
                   |                            |org
         AssignedTo|drepper.fsp at gmail dot    |carlos at systemhalted dot
                   |com                         |org


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
  2012-02-14 14:29 ` [Bug nptl/13690] " anemo at mba dot ocn.ne.jp
  2012-02-14 15:39 ` carlos at systemhalted dot org
@ 2012-02-14 15:41 ` carlos at systemhalted dot org
  2012-02-14 15:42 ` carlos at systemhalted dot org
                   ` (60 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at systemhalted dot org @ 2012-02-14 15:41 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #1 from Carlos O'Donell <carlos at systemhalted dot org> 2012-02-14 15:41:13 UTC ---
Nemoto-san,

Do you think you could come up with a patch to fix this?

I think we need to adjust nptl/pthread_mutex_unlock.c to pass down private as a
copy held in a local variable.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (2 preceding siblings ...)
  2012-02-14 15:41 ` carlos at systemhalted dot org
@ 2012-02-14 15:42 ` carlos at systemhalted dot org
  2012-02-15  6:47 ` ppluzhnikov at google dot com
                   ` (59 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at systemhalted dot org @ 2012-02-14 15:42 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

Carlos O'Donell <carlos at systemhalted dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (3 preceding siblings ...)
  2012-02-14 15:42 ` carlos at systemhalted dot org
@ 2012-02-15  6:47 ` ppluzhnikov at google dot com
  2012-02-15 13:18 ` anemo at mba dot ocn.ne.jp
                   ` (58 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: ppluzhnikov at google dot com @ 2012-02-15  6:47 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

Paul Pluzhnikov <ppluzhnikov at google dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ppluzhnikov at google dot
                   |                            |com


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (4 preceding siblings ...)
  2012-02-15  6:47 ` ppluzhnikov at google dot com
@ 2012-02-15 13:18 ` anemo at mba dot ocn.ne.jp
  2012-02-15 14:35 ` carlos at systemhalted dot org
                   ` (57 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: anemo at mba dot ocn.ne.jp @ 2012-02-15 13:18 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #2 from Atsushi Nemoto <anemo at mba dot ocn.ne.jp> 2012-02-15 13:17:17 UTC ---
(In reply to comment #1)
> Do you think you could come up with a patch to fix this?
> 
> I think we need to adjust nptl/pthread_mutex_unlock.c to pass down private as a
> copy held in a local variable.

Although fixing just the one lll_unlock call in pthread_mutex_unlock.c seems
easy, I suspect there are other places that need fixing, but I am not sure.

For example:
* lll_futex_wake call in __pthread_mutex_unlock_full (PP mutex case)
* lll_unlock call in __pthread_rwlock_unlock

So I hope nptl experts fix this properly.
Thank you.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (5 preceding siblings ...)
  2012-02-15 13:18 ` anemo at mba dot ocn.ne.jp
@ 2012-02-15 14:35 ` carlos at systemhalted dot org
  2012-02-16  5:09 ` bugdal at aerifal dot cx
                   ` (56 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at systemhalted dot org @ 2012-02-15 14:35 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #3 from Carlos O'Donell <carlos at systemhalted dot org> 2012-02-15 14:33:47 UTC ---
(In reply to comment #2)
> For example:
> * lll_futex_wake call in __pthread_mutex_unlock_full (PP mutex case)
> * lll_unlock call in __pthread_rwlock_unlock
> 
> So I hope nptl experts fix this properly.
> Thank you.

Start slowly. Fix the first known problem. Then move on to the next.

Do you think you could come up with a patch to *only* fix the lll_unlock usage
problem?


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (6 preceding siblings ...)
  2012-02-15 14:35 ` carlos at systemhalted dot org
@ 2012-02-16  5:09 ` bugdal at aerifal dot cx
  2012-02-16 14:43 ` anemo at mba dot ocn.ne.jp
                   ` (55 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2012-02-16  5:09 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

Rich Felker <bugdal at aerifal dot cx> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugdal at aerifal dot cx

--- Comment #4 from Rich Felker <bugdal at aerifal dot cx> 2012-02-16 05:07:32 UTC ---
Analogous bugs are ENDEMIC in glibc/NPTL and so far there's been a complete
unwillingness to fix them or even acknowledge that they exist. See
http://sourceware.org/bugzilla/show_bug.cgi?id=12674

Fixing the issue to ensure that a synchronization object's memory is not
touched whatsoever after it's unlocked/posted is non-trivial, but once you
figure out the solution, it's rather general-purpose. A while back I audited
all my synchronization primitives in musl libc for similar bugs and fixed them,
so it might be a useful source for ideas to fix glibc/NPTL.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (7 preceding siblings ...)
  2012-02-16  5:09 ` bugdal at aerifal dot cx
@ 2012-02-16 14:43 ` anemo at mba dot ocn.ne.jp
  2012-02-16 14:47 ` anemo at mba dot ocn.ne.jp
                   ` (54 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: anemo at mba dot ocn.ne.jp @ 2012-02-16 14:43 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #5 from Atsushi Nemoto <anemo at mba dot ocn.ne.jp> 2012-02-16 14:39:24 UTC ---
Created attachment 6222
  --> http://sourceware.org/bugzilla/attachment.cgi?id=6222
a patch to fix only lll_unlock,lll_robust_unlock call in pthread_mutex_unlock


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (8 preceding siblings ...)
  2012-02-16 14:43 ` anemo at mba dot ocn.ne.jp
@ 2012-02-16 14:47 ` anemo at mba dot ocn.ne.jp
  2012-02-16 15:37 ` carlos at systemhalted dot org
                   ` (53 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: anemo at mba dot ocn.ne.jp @ 2012-02-16 14:47 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #6 from Atsushi Nemoto <anemo at mba dot ocn.ne.jp> 2012-02-16 14:45:15 UTC ---
(In reply to comment #3)
> Start slowly. Fix the first known problem. Then move on to the next.
> 
> Do you think you could come up with a patch to *only* fix the lll_unlock usage
> problem?

OK, I have just uploaded a patch with the obvious fixes only.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (9 preceding siblings ...)
  2012-02-16 14:47 ` anemo at mba dot ocn.ne.jp
@ 2012-02-16 15:37 ` carlos at systemhalted dot org
  2012-02-16 15:41 ` carlos at systemhalted dot org
                   ` (52 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at systemhalted dot org @ 2012-02-16 15:37 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #8 from Carlos O'Donell <carlos at systemhalted dot org> 2012-02-16 15:32:46 UTC ---
(In reply to comment #6)
> (In reply to comment #3)
> > Start slowly. Fix the first known problem. Then move on to the next.
> > 
> > Do you think you could come up with a patch to *only* fix the lll_unlock usage
> > problem?
> 
> OK, I have just uploaded a patch with the obvious fixes only.

Nemoto-san,

Thank you very much for posting the patch!

What kind of testing have you done with the patch?

Do you have a small test case that can trigger the failure even sporadically?

It would be nice to get a test case added that documents the class of failure
we tried to fix.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (10 preceding siblings ...)
  2012-02-16 15:37 ` carlos at systemhalted dot org
@ 2012-02-16 15:41 ` carlos at systemhalted dot org
  2012-02-16 16:22 ` bugdal at aerifal dot cx
                   ` (51 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at systemhalted dot org @ 2012-02-16 15:41 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #7 from Carlos O'Donell <carlos at systemhalted dot org> 2012-02-16 15:26:24 UTC ---
(In reply to comment #4)
> Analogous bugs are ENDEMIC in glibc/NPTL and so far there's been a complete
> unwillingness to fix them or even acknowledge that they exist. See
> http://sourceware.org/bugzilla/show_bug.cgi?id=12674
> 
> Fixing the issue to ensure that a synchronization object's memory is not
> touched whatsoever after it's unlocked/posted is non-trivial, but once you
> figure out the solution, it's rather general-purpose. A while back I audited
> all my synchronization primitives in musl libc for similar bugs and fixed them,
> so it might be a useful source for ideas to fix glibc/NPTL.

I've assigned myself to BZ#12674 and I'll review the issue. Thank you for
bringing up the issue.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (11 preceding siblings ...)
  2012-02-16 15:41 ` carlos at systemhalted dot org
@ 2012-02-16 16:22 ` bugdal at aerifal dot cx
  2012-02-16 16:35 ` carlos at systemhalted dot org
                   ` (50 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2012-02-16 16:22 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #9 from Rich Felker <bugdal at aerifal dot cx> 2012-02-16 16:21:08 UTC ---
For all bugs like this, I suspect hitting the race condition will require
running the test case for months or years. That's part of why bugs like this
are so frustrating: imagine your mission-critical system crashing just a couple
times a year and the crash not being reproducible. It might be easier to hit
the race with some extreme usage of scheduling priorities to prevent the
unlocking thread from executing for a long time.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (12 preceding siblings ...)
  2012-02-16 16:22 ` bugdal at aerifal dot cx
@ 2012-02-16 16:35 ` carlos at systemhalted dot org
  2012-02-17  5:11 ` bugdal at aerifal dot cx
                   ` (49 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at systemhalted dot org @ 2012-02-16 16:35 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #10 from Carlos O'Donell <carlos at systemhalted dot org> 2012-02-16 16:32:43 UTC ---
(In reply to comment #9)
> For all bugs like this, I suspect hitting the race condition will require
> running the test case for months or years. That's part of why bugs like this
> are so frustrating: imagine your mission-critical system crashing just a couple
> times a year and the crash not being reproducible. It might be easier to hit
> the race with some extreme usage of scheduling priorities to prevent the
> unlocking thread from executing for a long time.

I agree, but a test case need not be an exact representation of your
application under test.

The trick I normally use is to preload an auditing library and use the
plt_enter and plt_exit stubs to slow down a thread, widening the race window or
in some cases making it a 100% reliable reproducer.

Such a trick is a perfectly acceptable way to make a test case, but might be
hard to use in this situation.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (13 preceding siblings ...)
  2012-02-16 16:35 ` carlos at systemhalted dot org
@ 2012-02-17  5:11 ` bugdal at aerifal dot cx
  2012-02-17 13:27 ` anemo at mba dot ocn.ne.jp
                   ` (48 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2012-02-17  5:11 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #11 from Rich Felker <bugdal at aerifal dot cx> 2012-02-17 05:10:16 UTC ---
If that's acceptable, you could just make the test case either a gdb script or
a dedicated ptrace-using parent process that puts a breakpoint at the right
location to hit the race...


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (14 preceding siblings ...)
  2012-02-17  5:11 ` bugdal at aerifal dot cx
@ 2012-02-17 13:27 ` anemo at mba dot ocn.ne.jp
  2012-02-17 16:18 ` carlos at systemhalted dot org
                   ` (47 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: anemo at mba dot ocn.ne.jp @ 2012-02-17 13:27 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #12 from Atsushi Nemoto <anemo at mba dot ocn.ne.jp> 2012-02-17 13:26:59 UTC ---
(In reply to comment #8)
> What kind of testing have you done with the patch?
> 
> Do you have a small test case that can trigger the failure even sporadically?
> 
> It would be nice to get a test case added that documents the class of failure
> we tried to fix.

Unfortunately I could not reproduce the problem and do not have any test case.

Actually, this problem was discovered through code analysis prompted by a
report that the futex_wake syscall was called with the wrong private flag.

So my patch is just a theoretical fix.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (15 preceding siblings ...)
  2012-02-17 13:27 ` anemo at mba dot ocn.ne.jp
@ 2012-02-17 16:18 ` carlos at systemhalted dot org
  2012-02-17 16:37 ` carlos at systemhalted dot org
                   ` (46 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at systemhalted dot org @ 2012-02-17 16:18 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #13 from Carlos O'Donell <carlos at systemhalted dot org> 2012-02-17 16:17:25 UTC ---
(In reply to comment #11)
> If that's acceptable, you could just make the test case either a gdb script or
> a dedicated ptrace-using parent process that puts a breakpoint at the right
> location to hit the race...

I like the idea of a ptrace-using parent process to trigger the race condition,
but it's fragile. I like your suggestion though.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (16 preceding siblings ...)
  2012-02-17 16:18 ` carlos at systemhalted dot org
@ 2012-02-17 16:37 ` carlos at systemhalted dot org
  2012-02-20 11:42 ` anemo at mba dot ocn.ne.jp
                   ` (45 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at systemhalted dot org @ 2012-02-17 16:37 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

Carlos O'Donell <carlos at systemhalted dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at redhat dot com

--- Comment #14 from Carlos O'Donell <carlos at systemhalted dot org> 2012-02-17 16:36:54 UTC ---

Nemoto-san,

The patch looks good to me.

My only worry is that the performance of the robust unlock fast path is
impacted.

I notice that PTHREAD_ROBUST_MUTEX_PSHARED always returns LLL_SHARED.

Therefore it would be optimal if instead of a temporary we just passed down
LLL_SHARED (a constant).

I don't know why glibc has the PTHREAD_ROBUST_MUTEX_PSHARED macro.

The code comment says:
~~~
/* The kernel when waking robust mutexes on exit never uses
   FUTEX_PRIVATE_FLAG FUTEX_WAKE.  */
#define PTHREAD_ROBUST_MUTEX_PSHARED(m) LLL_SHARED
~~~

Which appears to imply that at some point in the future it might not always
return LLL_SHARED.

Therefore your patch is correct.

Can you verify that the instruction sequences generated are identical given the
optimization level -O2 used to compile glibc?


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (17 preceding siblings ...)
  2012-02-17 16:37 ` carlos at systemhalted dot org
@ 2012-02-20 11:42 ` anemo at mba dot ocn.ne.jp
  2012-02-22 14:57 ` carlos at systemhalted dot org
                   ` (44 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: anemo at mba dot ocn.ne.jp @ 2012-02-20 11:42 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #15 from Atsushi Nemoto <anemo at mba dot ocn.ne.jp> 2012-02-20 11:38:58 UTC ---
(In reply to comment #14)
> Can you verify that the instruction sequences generated are identical given the
> optimization level -O2 used to compile glibc?

Yes, I verified __pthread_mutex_unlock_full code sequences are identical.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (18 preceding siblings ...)
  2012-02-20 11:42 ` anemo at mba dot ocn.ne.jp
@ 2012-02-22 14:57 ` carlos at systemhalted dot org
  2012-02-29 16:54 ` carlos at systemhalted dot org
                   ` (43 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at systemhalted dot org @ 2012-02-22 14:57 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #16 from Carlos O'Donell <carlos at systemhalted dot org> 2012-02-22 14:54:30 UTC ---
(In reply to comment #15)
> (In reply to comment #14)
> > Can you verify that the instruction sequences generated are identical given the
> > optimization level -O2 used to compile glibc?
> 
> Yes, I verified __pthread_mutex_unlock_full code sequences are identical.

Nemoto-san,

Thank you for checking.

I've gone through the patch and I think it's good, but I'd like someone else to
also review the change.

Given that Jakub commented on the patch on libc-alpha I'll ask him to review
the patch attached to this issue.

Jakub,

Could you please review this patch?


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (19 preceding siblings ...)
  2012-02-22 14:57 ` carlos at systemhalted dot org
@ 2012-02-29 16:54 ` carlos at systemhalted dot org
  2012-03-07 10:30 ` drepper.fsp at gmail dot com
                   ` (42 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at systemhalted dot org @ 2012-02-29 16:54 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

Carlos O'Donell <carlos at systemhalted dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |review?(jakub at redhat dot
                   |                            |com)


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (20 preceding siblings ...)
  2012-02-29 16:54 ` carlos at systemhalted dot org
@ 2012-03-07 10:30 ` drepper.fsp at gmail dot com
  2012-03-07 17:53 ` bugdal at aerifal dot cx
                   ` (41 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: drepper.fsp at gmail dot com @ 2012-03-07 10:30 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

Ulrich Drepper <drepper.fsp at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |WAITING
                 CC|                            |drepper.fsp at gmail dot
                   |                            |com

--- Comment #17 from Ulrich Drepper <drepper.fsp at gmail dot com> 2012-03-07 10:29:21 UTC ---
(In reply to comment #0)
> Thus, the lll_unlock() call in pthread_mutex_unlock.c will be expanded as:
>     int *__futex = &(mutex->__data.__lock);
>     int __val = atomic_exchange_rel (__futex, 0);
>     if (__builtin_expect (__val > 1, 0))        /* A */
>       lll_futex_wake (__futex, 1, ((mutex)->__data.__kind & 128)); /* B */
> 
> On point "A", the mutex is actually unlocked, so other threads can
> lock the mutex, unlock, destroy and free.  If the mutex was destroyed
> and freed by other thread, reading '__kind' on point "B" is not valid.

You read the code incorrectly.

If B is reached there must be another thread using the mutex.  It is currently
waiting.  In that case it is invalid to destroy the mutex.  In any case there
would be another memory access, from the thread that is woken by the
lll_futex_wake call.

The same applies to whatever you try to change with your patch.

Again, as long as a thread is waiting on a mutex you cannot destroy it legally.
Show me a place where the code is accessing the futex after the unlock when
there is no locker.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (21 preceding siblings ...)
  2012-03-07 10:30 ` drepper.fsp at gmail dot com
@ 2012-03-07 17:53 ` bugdal at aerifal dot cx
  2012-03-08  3:23 ` carlos at systemhalted dot org
                   ` (40 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2012-03-07 17:53 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #18 from Rich Felker <bugdal at aerifal dot cx> 2012-03-07 17:52:03 UTC ---
You misunderstand the race.

Suppose thread A is unlocking the mutex and gets descheduled after the
atomic_exchange_rel but before the lll_futex_wake, and thread B is waiting to
lock the mutex. At this point, as far as thread B can observe, A is no longer a
user of the mutex. Thread B obtains the mutex, performs some operations,
unlocks the mutex, and assuming (correctly) that it's now the last user of the
mutex, destroys it and frees the memory it occupied.

Now at some later point, thread A gets scheduled again and crashes accessing
freed memory.

If you're wondering how thread B could wake up without thread A calling
lll_futex_wake, here are several reasons:

1. Never going to sleep due to value mismatch on the original futex wait call.
2. Receipt of a signal, and value mismatch when the signal handler returns and
futex wait is called again.
3. Spurious wakes that look like successful returns from wait. These do exist
in Linux, and I have not been able to determine the reason, but I have a test
program which can successfully produce them in one case (unrelated to mutexes).
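
The interleaving described above can be replayed deterministically in a single
thread. The following is only a sketch under stated assumptions: struct
toy_mutex, its 'freed' flag (which stands in for free()), and all function
names are hypothetical and are not glibc code. It shows that once A has
released the lock word, B can lock, unlock and tear down the mutex without any
wake, so A's later read lands on dead memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mutex: a futex-style lock word plus a kind
   field.  'freed' models free(); real code has no way to detect this. */
struct toy_mutex {
  atomic_int lock;   /* 0 = unlocked, 1 = locked, 2 = locked with waiters */
  int kind;
  bool freed;
};

/* Thread A, first half of unlock: release the lock word. */
static int a_release(struct toy_mutex *m) {
  return atomic_exchange_explicit(&m->lock, 0, memory_order_release);
}

/* Thread B: acquire the lock without ever being woken, then unlock and
   "free" the mutex.  Returns true if B obtained the lock. */
static bool b_lock_unlock_free(struct toy_mutex *m) {
  int expected = 0;
  if (!atomic_compare_exchange_strong(&m->lock, &expected, 1))
    return false;
  atomic_exchange_explicit(&m->lock, 0, memory_order_release);
  m->freed = true;   /* models pthread_mutex_destroy + free */
  return true;
}

/* Thread A, second half of unlock: reports whether the access that the real
   code makes here (reading the kind field) would hit freed memory. */
static bool a_wake_touches_freed(struct toy_mutex *m, int old) {
  if (old > 1)
    return m->freed;
  return false;
}

/* Replays A(release) -> B(lock, unlock, free) -> A(wake path). */
bool replay_race(void) {
  struct toy_mutex m;
  atomic_init(&m.lock, 2);   /* locked by A, with B recorded as a waiter */
  m.kind = 0;
  m.freed = false;
  int old = a_release(&m);
  bool b_ok = b_lock_unlock_free(&m);
  return b_ok && a_wake_touches_freed(&m, old);
}
```

Calling replay_race() from a test harness returns true, meaning the wake path
ran after the object was marked freed.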


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (22 preceding siblings ...)
  2012-03-07 17:53 ` bugdal at aerifal dot cx
@ 2012-03-08  3:23 ` carlos at systemhalted dot org
  2012-03-08  5:13 ` bugdal at aerifal dot cx
                   ` (39 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at systemhalted dot org @ 2012-03-08  3:23 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

Carlos O'Donell <carlos at systemhalted dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |2.16

--- Comment #19 from Carlos O'Donell <carlos at systemhalted dot org> 2012-03-08 03:21:39 UTC ---
Given Ulrich's comments I'm going back to review the code.

I'd like this fixed before the 2.16 release, and therefore I'm marking this
with a target milestone of 2.16.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (23 preceding siblings ...)
  2012-03-08  3:23 ` carlos at systemhalted dot org
@ 2012-03-08  5:13 ` bugdal at aerifal dot cx
  2012-04-28  9:57 ` coolhair24 at verizon dot net
                   ` (38 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2012-03-08  5:13 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #20 from Rich Felker <bugdal at aerifal dot cx> 2012-03-08 05:12:48 UTC ---
Please also examine the corresponding bug for sem_post, #12674, which is
exactly the same type of bug.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (24 preceding siblings ...)
  2012-03-08  5:13 ` bugdal at aerifal dot cx
@ 2012-04-28  9:57 ` coolhair24 at verizon dot net
  2012-06-27 22:32 ` jsm28 at gcc dot gnu.org
                   ` (37 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: coolhair24 at verizon dot net @ 2012-04-28  9:57 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

Sherri <coolhair24 at verizon dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |coolhair24 at verizon dot
                   |                            |net


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (25 preceding siblings ...)
  2012-04-28  9:57 ` coolhair24 at verizon dot net
@ 2012-06-27 22:32 ` jsm28 at gcc dot gnu.org
  2012-11-29 15:55 ` carlos_odonell at mentor dot com
                   ` (36 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: jsm28 at gcc dot gnu.org @ 2012-06-27 22:32 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

Joseph Myers <jsm28 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|glibc_2.15                  |

--- Comment #21 from Joseph Myers <jsm28 at gcc dot gnu.org> 2012-06-27 22:32:24 UTC ---
Removing glibc_2.15 keyword as backport suitability can't be judged until there
is a fix on master.  After this is fixed on master, anyone wanting a backport
should feel free to attach a tested backport patch to this bug (or a new bug,
if this one is closed as fixed).


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (26 preceding siblings ...)
  2012-06-27 22:32 ` jsm28 at gcc dot gnu.org
@ 2012-11-29 15:55 ` carlos_odonell at mentor dot com
  2012-12-01 16:43 ` aj at suse dot de
                   ` (35 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos_odonell at mentor dot com @ 2012-11-29 15:55 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

Carlos O'Donell <carlos_odonell at mentor dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carlos_odonell at mentor
                   |                            |dot com
   Target Milestone|2.16                        |2.18


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (27 preceding siblings ...)
  2012-11-29 15:55 ` carlos_odonell at mentor dot com
@ 2012-12-01 16:43 ` aj at suse dot de
  2012-12-03 23:57 ` carlos at systemhalted dot org
                   ` (34 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: aj at suse dot de @ 2012-12-01 16:43 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #22 from Andreas Jaeger <aj at suse dot de> 2012-12-01 16:43:34 UTC ---
Jakub, could you review this one, please?


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (28 preceding siblings ...)
  2012-12-01 16:43 ` aj at suse dot de
@ 2012-12-03 23:57 ` carlos at systemhalted dot org
  2013-10-09 20:14 ` neleai at seznam dot cz
                   ` (33 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at systemhalted dot org @ 2012-12-03 23:57 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13690

Carlos O'Donell <carlos at systemhalted dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|carlos_odonell at mentor    |
                   |dot com                     |


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (29 preceding siblings ...)
  2012-12-03 23:57 ` carlos at systemhalted dot org
@ 2013-10-09 20:14 ` neleai at seznam dot cz
  2013-12-18 20:13 ` triegel at redhat dot com
                   ` (32 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: neleai at seznam dot cz @ 2013-10-09 20:14 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

Ondrej Bilka <neleai at seznam dot cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |neleai at seznam dot cz

--- Comment #23 from Ondrej Bilka <neleai at seznam dot cz> ---
Ping: this has been waiting almost a year for a second review.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (30 preceding siblings ...)
  2013-10-09 20:14 ` neleai at seznam dot cz
@ 2013-12-18 20:13 ` triegel at redhat dot com
  2013-12-18 20:33 ` bugdal at aerifal dot cx
                   ` (31 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: triegel at redhat dot com @ 2013-12-18 20:13 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

Torvald Riegel <triegel at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |ASSIGNED
                 CC|                            |triegel at redhat dot com

--- Comment #24 from Torvald Riegel <triegel at redhat dot com> ---
I agree that after release of a mutex (i.e., atomic_exchange_rel (__futex, 0)),
we have both (1) a pending load from mutex->__data.__kind and (2) a pending
futex_wake call.

However, I think it is an open question whether POSIX actually allows
destruction of the mutex just based on having obtained ownership of the lock. 
The example given in the standard and reproduced in Comment #1 is in an
informative section, and it conflicts with a statement in the normative
section: "Attempting to destroy a locked mutex or a mutex that is referenced
[...] by another thread results in undefined behavior."

Arguably, a mutex could still be considered "referenced" as long as a call to
mutex_unlock has not yet returned.  This would make the example in the
informative text incorrect.  OTOH, the intended semantics could also be that if a
program ensures that (1) a thread is the last one to lock a mutex, and (2) this
thread is able to lock and unlock a mutex, then this thread is also allowed to
destroy the mutex; IOW, being able to do the last lock and unlock of the
mutex could be the defining constraint for when destruction is allowed.

(That is what C++11 seems to require based on a quick read.  C11 isn't very
verbose but requires just that no thread is blocked on the mutex when it is
destructed; nonetheless, it also mentions that all resources of the mutex are
claimed, which could be understood to mean the same as the "referenced"
constraint in POSIX.)

I've asked the Austin Group for clarification:
http://austingroupbugs.net/view.php?id=811
Depending on how they decide, this is either not a bug, or we'll have to avoid
the pending load and futex_wake call, or make them harmless.  The proposed
patch should be right for the pending load, but the futex_wake needs more
investigation: A futex_wake to a futex without waiters (or even to a futex not
mapped anymore) should be harmless, but it could be different with PI futexes.
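
As a rough illustration of what avoiding the pending load could look like, here
is a sketch using C11 atomics. It is not the patch attached to this bug:
struct toy_mutex, TOY_PSHARED_BIT, toy_futex_wake and demo_unlock are
hypothetical names, and the wake is a stub that records its arguments instead
of making a system call. The point is only the ordering: everything needed from
the mutex is copied into locals before the releasing exchange. The wake call
after the release, which still receives the futex address, remains.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_PSHARED_BIT 128  /* mirrors the "& 128" test quoted in comment #0 */

/* Hypothetical stand-in for the mutex layout. */
struct toy_mutex {
  atomic_int lock;  /* 0 = unlocked, 1 = locked, 2 = locked with waiters */
  int kind;
};

/* Stub for lll_futex_wake: records the call instead of issuing a syscall. */
static int wake_calls;
static int last_wake_private;

static void toy_futex_wake(atomic_int *addr, int nr, int private_flag) {
  (void) addr;
  (void) nr;
  wake_calls++;
  last_wake_private = private_flag;
}

void toy_unlock(struct toy_mutex *m) {
  /* Read everything needed from the mutex while we still own it. */
  int private_flag = m->kind & TOY_PSHARED_BIT;
  atomic_int *futex = &m->lock;

  int val = atomic_exchange_explicit(futex, 0, memory_order_release);

  /* From here on *m may already be destroyed; only locals are used.  The
     address is still handed to the wake call but is not dereferenced here. */
  if (val > 1)
    toy_futex_wake(futex, 1, private_flag);
}

/* Demo helper: unlocks a mutex in the given state and returns the flag passed
   to the wake stub, or -1 if no wake was issued. */
int demo_unlock(int kind, int lock_state) {
  struct toy_mutex m;
  atomic_init(&m.lock, lock_state);
  m.kind = kind;
  wake_calls = 0;
  last_wake_private = -1;
  toy_unlock(&m);
  assert(atomic_load(&m.lock) == 0);
  return wake_calls ? last_wake_private : -1;
}
```

With a waiter recorded (lock_state 2) the stub sees the flag captured before the
release; with no waiter (lock_state 1) no wake is issued at all.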


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (31 preceding siblings ...)
  2013-12-18 20:13 ` triegel at redhat dot com
@ 2013-12-18 20:33 ` bugdal at aerifal dot cx
  2013-12-18 20:49 ` bugdal at aerifal dot cx
                   ` (30 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2013-12-18 20:33 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #25 from Rich Felker <bugdal at aerifal dot cx> ---
Torvald, while I agree there is *some* room for arguing about the semantics
intended by the standard, the fact of the matter is that it's really hard to
use mutexes that reside in dynamically-allocated memory if you can't rely on
the self-synchronized destruction semantics. Some common cases are easy, for
example, if you're freeing an object that's part of a larger data structure,
acquiring the lock on the larger data structure before locking the individual
object assures that another thread cannot still be in the tail part of the
pthread_mutex_unlock call for the individual object when you obtain the mutex.
But in general it's not so easy; at the very least, it requires non-trivial,
non-intuitive, error-prone reasoning to determine if any particular usage is
safe. And this is not good.
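
The "easy" case mentioned above might look like the following sketch. The
struct table and struct item types and all function names are hypothetical, and
the table has a single slot to keep things short. Every user takes the table
lock first and drops the item lock before dropping the table lock, so whoever
holds the table lock knows no thread is still inside pthread_mutex_unlock for
the item.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct item {
  pthread_mutex_t lock;
  int value;
};

struct table {
  pthread_mutex_t lock;   /* lock on the larger data structure */
  struct item *slot;      /* a single slot keeps the sketch short */
};

/* Users take the table lock first and finish with the item lock before
   releasing the table lock.  Returns 1 if an item was present. */
int table_update(struct table *t, int v) {
  int found = 0;
  pthread_mutex_lock(&t->lock);
  if (t->slot != NULL) {
    pthread_mutex_lock(&t->slot->lock);
    t->slot->value = v;
    pthread_mutex_unlock(&t->slot->lock);
    found = 1;
  }
  pthread_mutex_unlock(&t->lock);
  return found;
}

/* While the table lock is held, no other thread can be in the tail of
   pthread_mutex_unlock for the item, so it can be destroyed and freed.
   Returns 1 if an item was removed. */
int table_remove(struct table *t) {
  pthread_mutex_lock(&t->lock);
  struct item *it = t->slot;
  int removed = (it != NULL);
  t->slot = NULL;
  if (removed) {
    pthread_mutex_destroy(&it->lock);
    free(it);
  }
  pthread_mutex_unlock(&t->lock);
  return removed;
}

/* Single-threaded walk through the API; returns 1 on the expected results. */
int demo_table(void) {
  struct table t;
  pthread_mutex_init(&t.lock, NULL);
  t.slot = malloc(sizeof *t.slot);
  assert(t.slot != NULL);
  pthread_mutex_init(&t.slot->lock, NULL);
  t.slot->value = 0;

  int updated = table_update(&t, 42);  /* item present */
  int removed = table_remove(&t);      /* item torn down under table lock */
  int after = table_update(&t, 7);     /* nothing left to update */

  pthread_mutex_destroy(&t.lock);
  return updated == 1 && removed == 1 && after == 0;
}
```

The cost of this pattern is that the outer lock serializes all access to the
items, which is part of why it does not generalize.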

Fixing the bug, on the other hand, is not hard, so I think it should just be
fixed, even if the Austin Group chooses to relax this requirement. It's a
quality of implementation issue because, however it's resolved, many apps will
continue to have this kind of race condition on implementations where
self-synchronized destruction does not work, resulting in random
near-impossible-to-reproduce crashes.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (32 preceding siblings ...)
  2013-12-18 20:33 ` bugdal at aerifal dot cx
@ 2013-12-18 20:49 ` bugdal at aerifal dot cx
  2013-12-20 19:08 ` lopresti at gmail dot com
                   ` (29 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2013-12-18 20:49 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #26 from Rich Felker <bugdal at aerifal dot cx> ---
BTW I posted on the Austin Group ticket what's perhaps the most important usage
case for self-synchronized destruction: refcounted objects.
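
For readers who have not seen the Austin Group ticket, the refcount pattern
looks roughly like this. This is a minimal sketch, not the text posted there:
struct obj, obj_new, obj_unref and demo_refcount are hypothetical names. The
thread that drops the last reference destroys the mutex immediately after its
own unlock, which is only safe if an earlier unlock by another thread is
guaranteed not to touch the mutex any more.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct obj {
  pthread_mutex_t lock;
  int refs;               /* protected by lock */
};

/* Allocates an object holding 'refs' references; returns NULL on failure. */
struct obj *obj_new(int refs) {
  struct obj *o = malloc(sizeof *o);
  if (o == NULL)
    return NULL;
  if (pthread_mutex_init(&o->lock, NULL) != 0) {
    free(o);
    return NULL;
  }
  o->refs = refs;
  return o;
}

/* Drops one reference.  Returns 1 if this call released the last reference
   and freed the object, 0 otherwise. */
int obj_unref(struct obj *o) {
  pthread_mutex_lock(&o->lock);
  int last = (--o->refs == 0);
  pthread_mutex_unlock(&o->lock);

  if (last) {
    /* Relies on self-synchronized destruction: a previous holder that has
       released the lock must not access o->lock again. */
    pthread_mutex_destroy(&o->lock);
    free(o);
  }
  return last;
}

/* Single-threaded walk through the API; returns 1 on the expected results. */
int demo_refcount(void) {
  struct obj *o = obj_new(2);
  assert(o != NULL);
  int first = obj_unref(o);    /* one reference remains */
  int second = obj_unref(o);   /* last reference: object is freed */
  return first == 0 && second == 1;
}
```

In a multithreaded program the two obj_unref calls would come from different
threads, and the second caller can reach free() while the first is still
returning from pthread_mutex_unlock.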


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (33 preceding siblings ...)
  2013-12-18 20:49 ` bugdal at aerifal dot cx
@ 2013-12-20 19:08 ` lopresti at gmail dot com
  2013-12-20 19:38 ` carlos at redhat dot com
                   ` (28 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: lopresti at gmail dot com @ 2013-12-20 19:08 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

Pat <lopresti at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lopresti at gmail dot com

--- Comment #27 from Pat <lopresti at gmail dot com> ---
If you have to wait for all calls to mutex_unlock to return before you can
destroy the mutex, how are you supposed to guarantee that, exactly?

You can only do so by synchronizing the threads in some other way. So every
mutex has to be guarded by another mutex which has to be guarded by another
mutex...

...except for the last mutex, which is global or static or whatever and never
gets destroyed. Problem solved!

Seriously, think about the kind of code you would have to write to deal with
these semantics. Is that what you think POSIX wanted to put people through
(despite the actual example they give)? Is that what glibc wants to put people
through?

You do not need "clarification" from anybody to recognize that these are
serious bugs. The only interesting question is how many years it's going to
take.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (34 preceding siblings ...)
  2013-12-20 19:08 ` lopresti at gmail dot com
@ 2013-12-20 19:38 ` carlos at redhat dot com
  2013-12-20 20:25 ` triegel at redhat dot com
                   ` (27 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at redhat dot com @ 2013-12-20 19:38 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #28 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Pat from comment #27)
> If you have to wait for all calls to mutex_unlock to return before you can
> destroy the mutex, how are you supposed to guarantee that, exactly?
> 
> You can only do so by synchronizing the threads in some other way. So every
> mutex has to be guarded by another mutex which has to be guarded by another
> mutex...
> 
> ...except for the last mutex, which is global or static or whatever and
> never gets destroyed. Problem solved!
> 
> Seriously, think about the kind of code you would have to write to deal with
> these semantics. Is that what you think POSIX wanted to put people through
> (despite the actual example they give)? Is that what glibc wants to put
> people through?
> 
> You do not need "clarification" from anybody to recognize that these are
> serious bugs. The only interesting question is how many years it's going to
> take.

That may be the case for normal software projects, but this is glibc. We are a
conservative project and we work through a standards process and collaborate
with the Austin Group and the ISO group on POSIX and ISO C. I understand that
this is sometimes frustratingly slow, but it ensures a clarity and quality that
we desire to achieve with the project.

I don't disagree that it seems ridiculous to require such complexity, but we
want to gather consensus from the Austin Group to ensure that we understand all
the implications before we make a change.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (35 preceding siblings ...)
  2013-12-20 19:38 ` carlos at redhat dot com
@ 2013-12-20 20:25 ` triegel at redhat dot com
  2013-12-20 22:51 ` bugdal at aerifal dot cx
                   ` (26 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: triegel at redhat dot com @ 2013-12-20 20:25 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #29 from Torvald Riegel <triegel at redhat dot com> ---
(In reply to Pat from comment #27)
> If you have to wait for all calls to mutex_unlock to return before you can
> destroy the mutex, how are you supposed to guarantee that, exactly?
> 
> You can only do so by synchronizing the threads in some other way. So every
> mutex has to be guarded by another mutex which has to be guarded by another
> mutex...
> 
> ...except for the last mutex, which is global or static or whatever and
> never gets destroyed. Problem solved!

There are other ways to do garbage collection / end-of-lifetime-detection than
doing so via mutex-protected reference counts (and I don't mean you should use
garbage collection for memory management...).

If your mutexes are part of other data structures that you have to destroy, you
can destroy the mutexes when destructing the data structure; unless every
access to the data structure is protected by the mutex, you'll likely need
another mechanism anyway to decide when it's safe to destruct.

> Seriously, think about the kind of code you would have to write to deal with
> these semantics.

I have.  This is a trade-off between implementation constraints and guarantees
given to users, as I pointed out in the POSIX clarification request.  If you
have anything to say about that, then please comment on the POSIX issue.

> Is that what you think POSIX wanted to put people through
> (despite the actual example they give)?

That's what the clarification request is for.

> Is that what glibc wants to put
> people through?

As Carlos said, we're approaching issues like this with the care that is
needed.  This involves making sure that the standards and implementations are
in sync.

> You do not need "clarification" from anybody to recognize that these are
> serious bugs. The only interesting question is how many years it's going to
> take.

It would be a bug if it is not adhering to the specification of the function. 
Arguably, the normative text in the standard can be understood to not allow the
use you seem to want to have.  The POSIX reply will clarify this.

If POSIX clarifies that it wants the strong guarantees re destruction, this is
a bug compared to a clarified standard.  Otherwise, this bug is an enhancement
request, which we would then evaluate as well.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (36 preceding siblings ...)
  2013-12-20 20:25 ` triegel at redhat dot com
@ 2013-12-20 22:51 ` bugdal at aerifal dot cx
  2014-01-03  9:10 ` kevin.dempsey at aculab dot com
                   ` (25 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2013-12-20 22:51 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #30 from Rich Felker <bugdal at aerifal dot cx> ---
Carlos, there's nothing "conservative" about giving applications broken
semantics on the basis that you're not sure the standard _requires_ the correct
semantics. And there's no reason to wait for a response from the Austin Group
to fix this. Even if, by some fluke, they removed the requirement that ending
the "reference" to a mutex is atomic with unlocking it, you would still want to
have the correct, safe semantics just for the sake of applications using it.
THIS is the conservative behavior.

Torvald, regarding your point that there are "other ways" to do end-of-lifetime
detection, the only way to do so in a strictly conforming application, if you
don't have the self-synchronized destruction property for at least one of
mutexes, semaphores, or spinlocks, is with a _global_ lock ensuring that no two
threads can be attempting to release a reference to the same type of
reference-counted object at the same time. This is obviously not practical from
a performance standpoint, and it's also hideous from a "global state considered
harmful" standpoint. Obviously with other tools that will be available in
future editions of POSIX (e.g. C11 atomics) and that are available now as
extensions, you can work around the problem by refraining from using mutexes,
but that's not a good solution.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (37 preceding siblings ...)
  2013-12-20 22:51 ` bugdal at aerifal dot cx
@ 2014-01-03  9:10 ` kevin.dempsey at aculab dot com
  2014-01-06 16:58 ` triegel at redhat dot com
                   ` (24 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: kevin.dempsey at aculab dot com @ 2014-01-03  9:10 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

Kevin Dempsey <kevin.dempsey at aculab dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kevin.dempsey at aculab dot com


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (38 preceding siblings ...)
  2014-01-03  9:10 ` kevin.dempsey at aculab dot com
@ 2014-01-06 16:58 ` triegel at redhat dot com
  2014-01-06 17:46 ` lopresti at gmail dot com
                   ` (23 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: triegel at redhat dot com @ 2014-01-06 16:58 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #31 from Torvald Riegel <triegel at redhat dot com> ---
(In reply to Rich Felker from comment #30)
> Carlos, there's nothing "conservative" about giving applications broken
> semantics on the basis that you're not sure the standard _requires_ the
> correct semantics. And there's no reason to wait for a response from the
> Austin Group to fix this. Even if, by some fluke, they removed the
> requirement that ending the "reference" to a mutex is atomic with unlocking
> it, you would still want to have the correct, safe semantics just for the
> sake of applications using it. THIS is the conservative behavior.

You can't claim that just one of the semantics is "correct".  They, obviously,
can both be used in meaningful ways.  We can certainly argue about the utility
of both of the semantics, but that's different from correctness.

Also, because you mentioned conforming applications: those won't be helped by
glibc implementing something (incl. something stronger) that's not guaranteed
by POSIX.

> Torvald, in regards that there are "other ways" to do end-of-lifetime
> detection, the only way to do so in a strictly conforming application, if
> you don't have the self-synchronized destruction property for at least one
> of mutexes, semaphores, or spinlocks, is with a _global_ lock ensuring that
> no two threads can be attempting to release a reference to the same type of
> reference-counted object at the same time.

No, it does not need to be a global lock.  You just make sure that all threads
that use the resource you want to destruct have quiesced.  For example, if
you've spawned a set of threads to do work concurrently on a resource, and join
them after they've done the job, then pthread_join does exactly this for you;
afterwards, whichever thread spawned those threads can initiate destruction. 
If you've used a task queue or similar on a thread pool, the task queue can do
the same.
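
A minimal sketch of the join-then-destroy pattern described above, assuming a
fixed set of workers sharing one mutex-protected counter. The names
(struct shared, worker, run_and_destroy) are illustrative only and do not come
from glibc or from this thread:

```c
#include <assert.h>
#include <pthread.h>

enum { NWORKERS = 4, ITERATIONS = 1000 };

/* Hypothetical shared resource: a counter protected by a mutex. */
struct shared {
    pthread_mutex_t lock;
    long counter;
};

static void *worker(void *arg)
{
    struct shared *s = arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&s->lock);
        s->counter++;
        pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Spawn the workers, wait for them, then destroy the mutex.
   Returns the final counter value. */
long run_and_destroy(void)
{
    struct shared s = { .counter = 0 };
    pthread_t tids[NWORKERS];
    int rc;

    rc = pthread_mutex_init(&s.lock, NULL);
    assert(rc == 0);

    for (int i = 0; i < NWORKERS; i++) {
        rc = pthread_create(&tids[i], NULL, worker, &s);
        assert(rc == 0);
    }

    /* pthread_join is the quiescence point: once it returns, the joined
       thread has terminated, so none of its unlock calls is still running. */
    for (int i = 0; i < NWORKERS; i++) {
        rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
    }

    rc = pthread_mutex_destroy(&s.lock);
    assert(rc == 0);
    return s.counter;
}
```

Here the destroying thread never relies on the mutex itself to learn that the
other threads are done; it relies on thread termination.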

> This is obviously not practical
> from a performance standpoint,

I don't see how the thread pool, for example, is bad in terms of performance.

> and it's also hideous from a "global state
> considered harmful" standpoint.

This is not about global vs. non-global state, but instead about how to ensure
quiescence of concurrent threads: you initiate concurrent execution, and once
that's done and there's no concurrency anymore, you destruct.  The main thread
might be the thread that's doing that, and it might use global state for that,
but that's not necessarily so.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (39 preceding siblings ...)
  2014-01-06 16:58 ` triegel at redhat dot com
@ 2014-01-06 17:46 ` lopresti at gmail dot com
  2014-01-06 20:38 ` triegel at redhat dot com
                   ` (22 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: lopresti at gmail dot com @ 2014-01-06 17:46 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #32 from Pat <lopresti at gmail dot com> ---
(In reply to Torvald Riegel from comment #31)
> 
> You can't claim that just one of the semantics is "correct".

Um, actually, you can... Based on (a) the only sane interpretation and (b) THE
ACTUAL EXAMPLE USAGE GIVEN IN THE POSIX SPEC.
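
For reference, the usage in question appears to be the reference-counted object
from the EXAMPLES section of pthread_mutex_destroy(). The following is a sketch
modeled on that example, not a verbatim copy; obj_new and obj_release are
hypothetical names, and the return value of obj_release is added here only to
make the behaviour observable:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* An object whose lifetime is controlled by a mutex-protected refcount. */
struct obj {
    pthread_mutex_t om;
    int refcnt;
};

/* Allocate an object holding `refs` references; NULL on failure. */
struct obj *obj_new(int refs)
{
    struct obj *op = malloc(sizeof *op);
    if (op == NULL)
        return NULL;
    if (pthread_mutex_init(&op->om, NULL) != 0) {
        free(op);
        return NULL;
    }
    op->refcnt = refs;
    return op;
}

/* Drop one reference. Returns 1 if this call freed the object, else 0. */
int obj_release(struct obj *op)
{
    pthread_mutex_lock(&op->om);
    assert(op->refcnt > 0);
    if (--op->refcnt == 0) {
        /* Last reference: unlock, then destroy and free immediately. */
        pthread_mutex_unlock(&op->om);
        pthread_mutex_destroy(&op->om);
        free(op);
        return 1;
    }
    /* Not the last reference. If this unlock touches the mutex after
       releasing it, it can race with the destroy/free done by the thread
       that drops the last reference. */
    pthread_mutex_unlock(&op->om);
    return 0;
}
```

The pattern is only safe if the unlock in the non-final branch makes no access
to the mutex once another thread is able to acquire it, which is the property
this bug is about.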

> Also, because you mentioned conforming applications: those won't be helped
> by glibc implementing something (incl. something stronger) that's not
> guaranteed by POSIX.

"Hey, POSIX authors, did you actually mean the example code you gave in the,
you know, spec?"

"Yes, we actually meant it."

Is that what you are waiting to hear before you fix this bug? Seriously?

Most people writing "conforming applications" are going to expect the examples
in the spec to... let's see... I dunno... work?

> No, it does not need to be a global lock.  You just make sure that all
> threads that use the resource you want to destruct have quiesced.

To "make sure that all threads ... have quiesced", you must do one of two
things: (a) Rely on some synchronization mechanism, all of which are currently
broken due to this bug; or (b) wait for all threads to exit.

You are arguing for (b): To destroy a mutex -- any mutex -- you must first wait
for every thread that ever touched that mutex to exit.

Is it possible to code against these semantics? Of course. It is also possible
to code without using threads. Or function calls. Or multiplication.

But the whole point of a primitive is to provide useful semantics across a
variety of applications. And by "a variety", I mean more than one (1).

There is one nice thing about this bug, though. It provides a quintessential
(and hilarious) example of why people laugh and roll their eyes when they hear
the phrase "glibc maintainer".

I wonder, what's the over/under on whether this bug gets fixed before 2017?


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (40 preceding siblings ...)
  2014-01-06 17:46 ` lopresti at gmail dot com
@ 2014-01-06 20:38 ` triegel at redhat dot com
  2014-01-06 20:47 ` bugdal at aerifal dot cx
                   ` (21 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: triegel at redhat dot com @ 2014-01-06 20:38 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #33 from Torvald Riegel <triegel at redhat dot com> ---
(In reply to Pat from comment #32)
> (In reply to Torvald Riegel from comment #31)
> > 
> > You can't claim that just one of the semantics is "correct".
> 
> Um, actually, you can... Based on (a) the only sane interpretation and (b)
> THE ACTUAL EXAMPLE USAGE GIVEN IN THE POSIX SPEC.

You're still saying that you like one semantics better than the other.  That
doesn't make another *semantics* incorrect.

> > Also, because you mentioned conforming applications: those won't be helped
> > by glibc implementing something (incl. something stronger) that's not
> > guaranteed by POSIX.
> 
> "Hey, POSIX authors, did you actually mean the example code you gave in the,
> you know, spec?"
> 
> "Yes, we actually meant it."
> 
> Is that what you are waiting to hear before you fix this bug? Seriously?

If you want to know what I'd like to get clarified by the Austin Group, please
read the Austin group issue.  It should be easy to understand.

> Most people writing "conforming applications" are going to expect the
> examples in the spec to... let's see... I dunno... work?

You can say the same thing about the normative text.  Which brings us right
back to the clarification request...

> > No, it does not need to be a global lock.  You just make sure that all
> > threads that use the resource you want to destruct have quiesced.
> 
> To "make sure that all threads ... have quiesced", you must do one of two
> things: (a) Rely on some synchronization mechanism, all of which are
> currently broken due to this bug;

No, they aren't "broken".  See the examples I gave.

> or (b) wait for all threads to exit.

Precisely, wait for all threads that use the particular resource to not use it
anymore.  That's different from "wait[ing] for all threads to exit".

> You are arguing for (b): To destroy a mutex -- any mutex -- you must first
> wait for every thread that ever touched that mutex to exit.

This could be a reasonable semantics.

> Is it possible to code against these semantics? Of course.

And often, programs will do just that.  All that don't do reference counting or
similar, for example.

> It is also
> possible to code without using threads. Or function calls. Or multiplication.
> 
> But the whole point of a primitive is to provide useful semantics across a
> variety of applications. And by "a variety", I mean more than one (1).
> 
> There is one nice thing about this bug, though. It provides a quintessential
> (and hilarious) example of why people laugh and roll their eyes when they
> hear the phrase "glibc maintainer".
> 
> I wonder, what's the over/under on whether this bug gets fixed before 2017?

This bug is about a technical issue.  I'm not going to respond to off-topic
statements like this.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (41 preceding siblings ...)
  2014-01-06 20:38 ` triegel at redhat dot com
@ 2014-01-06 20:47 ` bugdal at aerifal dot cx
  2014-01-06 21:20 ` triegel at redhat dot com
                   ` (20 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2014-01-06 20:47 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #34 from Rich Felker <bugdal at aerifal dot cx> ---
> > or (b) wait for all threads to exit.
> 
> Precisely, wait for all threads that use the particular resource to not use it
> anymore.  That's different from "wait[ing] for all threads to exit".

That requires a mutex. So you just moved the mutex issue to a different mutex;
you didn't solve it.

> > You are arguing for (b): To destroy a mutex -- any mutex -- you must first
> > wait for every thread that ever touched that mutex to exit.
> 
> This could be a reasonable semantics.

No, you stopped being reasonable several posts back in this thread. I'm not
sure what your emotional attachment to glibc's current brokenness with respect
to this issue is, but it's completely clouding your judgement and making you
look like a fool. It's really sad to see this kind of response to bugs again
when glibc was just recovering from the madness of the former maintainer and
his attitude towards bug reports...


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (42 preceding siblings ...)
  2014-01-06 20:47 ` bugdal at aerifal dot cx
@ 2014-01-06 21:20 ` triegel at redhat dot com
  2014-01-06 21:24 ` bugdal at aerifal dot cx
                   ` (19 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: triegel at redhat dot com @ 2014-01-06 21:20 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #35 from Torvald Riegel <triegel at redhat dot com> ---
(In reply to Rich Felker from comment #34)
> > > or (b) wait for all threads to exit.
> > 
> > Precisely, wait for all threads that use the particular resource to not use it
> > anymore.  That's different from "wait[ing] for all threads to exit".
> 
> That requires a mutex. So you just moved the mutex issue to a different
> mutex; you didn't solve it.

Consider a thread pool, or something else that manages a set of threads.  The
lifetime of this thing will be longer than the lifetime of the threads
themselves.  You will want to do pthread_join on the threads eventually, if
you're interested in safe destruction.  Once you do that, you can safely
destruct the thread pool at that time, including any mutexes in it.  If you
want to keep the threads around for longer (e.g., so that they can work on more
than one task), you can easily let them signal the thread pool once they've
finished the task.  For that, you can use a mutex in the thread pool for
example.
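
A rough sketch of that arrangement, under the assumption that tasks report
completion through a mutex and condition variable owned by the pool. All names
(struct pool, struct task, run_task, pool_demo) are hypothetical, and for
brevity each "pool thread" here runs a single task and then exits:

```c
#include <assert.h>
#include <pthread.h>

enum { NTASKS = 3 };

/* Long-lived pool state; its mutex outlives every task. */
struct pool {
    pthread_mutex_t lock;
    pthread_cond_t done_cv;
    int pending;                 /* tasks that have not yet reported in */
};

/* One unit of work on a short-lived, mutex-protected resource. */
struct task {
    struct pool *pool;
    pthread_mutex_t *res_lock;
    int *res_value;
};

static void *run_task(void *arg)
{
    struct task *t = arg;

    pthread_mutex_lock(t->res_lock);
    ++*t->res_value;
    pthread_mutex_unlock(t->res_lock);   /* last use of the resource */

    /* Report completion through the pool's own mutex. */
    pthread_mutex_lock(&t->pool->lock);
    if (--t->pool->pending == 0)
        pthread_cond_signal(&t->pool->done_cv);
    pthread_mutex_unlock(&t->pool->lock);
    return NULL;
}

/* Returns the final value of the resource. */
int pool_demo(void)
{
    struct pool p = { .pending = NTASKS };
    pthread_mutex_t res_lock;
    int res_value = 0;
    struct task tasks[NTASKS];
    pthread_t tids[NTASKS];
    int rc;

    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.done_cv, NULL);
    pthread_mutex_init(&res_lock, NULL);

    for (int i = 0; i < NTASKS; i++) {
        tasks[i] = (struct task){ &p, &res_lock, &res_value };
        rc = pthread_create(&tids[i], NULL, run_task, &tasks[i]);
        assert(rc == 0);
    }

    /* Wait until every task has reported in. */
    pthread_mutex_lock(&p.lock);
    while (p.pending > 0)
        pthread_cond_wait(&p.done_cv, &p.lock);
    pthread_mutex_unlock(&p.lock);

    /* Each task's unlock of res_lock returned before it reported in. */
    pthread_mutex_destroy(&res_lock);

    /* The pool's own mutex is destroyed only after pthread_join. */
    for (int i = 0; i < NTASKS; i++)
        pthread_join(tids[i], NULL);
    pthread_cond_destroy(&p.done_cv);
    pthread_mutex_destroy(&p.lock);
    return res_value;
}
```

The resource mutex is destroyed while the threads may still be running, but
only after each of them has returned from its final unlock of it; the pool
mutex itself is torn down after the joins.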

Thus, there is a straightforward way to do it without reference counting. 
Having a mutex or similar on the thread pool is not something that's bad.  You
will have the thread pool (or a pthread_t at the very least anyway).

If we didn't have pthread_join, or if one had to implement its
functionality with a pthread_mutex_t, then we would have a problem.  But that's
not the case, we do have pthread_join() to eventually break the chain you seem
to be concerned about.

> > > You are arguing for (b): To destroy a mutex -- any mutex -- you must first
> > > wait for every thread that ever touched that mutex to exit.
> > 
> > This could be a reasonable semantics.
> 
> No, you stopped being reasonable several posts back in this thread. I'm not
> sure what your emotional attachment to glibc's current brokenness with
> respect to this issue is, but it's completely clouding your judgement and
> making you look like a fool. It's really sad to see this kind of response to
> bugs again when glibc was just recovering from the madness of the former
> maintainer and his attitude towards bug reports...

I have no emotional attachment to anything here.  That includes the stronger
semantics you want to have, your assumptions about my judgement, etc.

You haven't made a convincing argument why the semantics as targeted by the
current glibc implementation would be incorrect (if the issue above is what
worries you, let's keep discussing that one).  I understood that you'd want
something stronger, and I appreciate that you have an opinion on this, but
ultimately I think glibc should implement what POSIX wants, thus the
clarification request.

Also, if you (and Pat) want the glibc community to be a place where technical
issues are solved in a constructive manner, then you should probably remind
yourselves that you and your actions are very much a part of this.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (43 preceding siblings ...)
  2014-01-06 21:20 ` triegel at redhat dot com
@ 2014-01-06 21:24 ` bugdal at aerifal dot cx
  2014-03-28  1:27 ` dancol at dancol dot org
                   ` (18 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2014-01-06 21:24 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #36 from Rich Felker <bugdal at aerifal dot cx> ---
My comment about the previous maintainership had nothing to do with
"constructive manner" (whatever that is supposed to mean) but the attitude of
making excuses and arguments against fixing bugs instead of just fixing them.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (44 preceding siblings ...)
  2014-01-06 21:24 ` bugdal at aerifal dot cx
@ 2014-03-28  1:27 ` dancol at dancol dot org
  2014-03-28 20:07 ` tudorb at gmail dot com
                   ` (17 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: dancol at dancol dot org @ 2014-03-28  1:27 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

Daniel Colascione <dancol at dancol dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dancol at dancol dot org


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (45 preceding siblings ...)
  2014-03-28  1:27 ` dancol at dancol dot org
@ 2014-03-28 20:07 ` tudorb at gmail dot com
  2014-06-20 12:23 ` kevin.dempsey at aculab dot com
                   ` (16 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: tudorb at gmail dot com @ 2014-03-28 20:07 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

Tudor Bosman <tudorb at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tudorb at gmail dot com


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (46 preceding siblings ...)
  2014-03-28 20:07 ` tudorb at gmail dot com
@ 2014-06-20 12:23 ` kevin.dempsey at aculab dot com
  2014-06-20 18:29 ` triegel at redhat dot com
                   ` (15 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: kevin.dempsey at aculab dot com @ 2014-06-20 12:23 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #37 from Kevin Dempsey <kevin.dempsey at aculab dot com> ---
Now that the Austin Group have clarified the expected behaviour
(http://austingroupbugs.net/view.php?id=811) can progress be made on fixing
this?


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (47 preceding siblings ...)
  2014-06-20 12:23 ` kevin.dempsey at aculab dot com
@ 2014-06-20 18:29 ` triegel at redhat dot com
  2014-06-20 19:02 ` bugdal at aerifal dot cx
                   ` (14 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: triegel at redhat dot com @ 2014-06-20 18:29 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #38 from Torvald Riegel <triegel at redhat dot com> ---
We have already started looking at the implications and into possible fixes.
The accesses to the mutex's memory are the easy part of this problem. 
The fact that the futex_wake call can now hit reused memory (i.e., after
destruction of the mutex) is the trickier issue IMO.  More details on the
latter can be found in an email I sent a while ago to libc-alpha.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (48 preceding siblings ...)
  2014-06-20 18:29 ` triegel at redhat dot com
@ 2014-06-20 19:02 ` bugdal at aerifal dot cx
  2014-06-20 19:10 ` bugdal at aerifal dot cx
                   ` (13 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2014-06-20 19:02 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #39 from Rich Felker <bugdal at aerifal dot cx> ---
On Fri, Jun 20, 2014 at 06:28:59PM +0000, triegel at redhat dot com wrote:
> The fact that the futex_wake call can now hit reused memory (i.e., after
> destruction of the mutex) is the trickier issue IMO.  More details on the
> latter can be found in an email I sent a while ago to libc-alpha.

I'm not convinced that the resulting spurious futex wakes are a
serious problem. As I've mentioned before, I have observed spurious
futex wakes coming from the kernel, so applications need to be
prepared to deal with them anyway. I'll see if I can dig up and post
my test case demonstrating them.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (49 preceding siblings ...)
  2014-06-20 19:02 ` bugdal at aerifal dot cx
@ 2014-06-20 19:10 ` bugdal at aerifal dot cx
  2014-06-23  3:06 ` bugdal at aerifal dot cx
                   ` (12 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2014-06-20 19:10 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #40 from Rich Felker <bugdal at aerifal dot cx> ---
Created attachment 7653
  --> https://sourceware.org/bugzilla/attachment.cgi?id=7653&action=edit
demonstration of spurious futex wakes from kernel

The attached file demonstrates spurious futex wakes coming from the kernel; it
was designed to mimic the situation I was experiencing where they were breaking
my implementation of pthread_join when it failed to retry the wait. I'm not
sure if they happen or not in situations not connected to CLONE_CHILD_CLEARTID.
I experienced the issue on 3.2 and 3.5 kernels (the latter re-checked just now)
but have not been able to reproduce it on 3.15.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (50 preceding siblings ...)
  2014-06-20 19:10 ` bugdal at aerifal dot cx
@ 2014-06-23  3:06 ` bugdal at aerifal dot cx
  2014-06-25 14:34 ` triegel at redhat dot com
                   ` (11 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2014-06-23  3:06 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #41 from Rich Felker <bugdal at aerifal dot cx> ---
By the way I just ran across FUTEX_WAKE_OP, which might be able to perform the
desired atomic-unlock-and-wake if that's deemed necessary. I haven't worked out
the details for how it would work, but it seems promising.
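
A very rough sketch of what such a call could look like on Linux, assuming the
usual three-state lock word (0 = unlocked, 1 = locked, 2 = locked with
waiters). This is not glibc's code and not a worked-out design; the function
name unlock_and_wake is made up, and a real unlock path would presumably try a
plain user-space release first and only enter the kernel when waiters may
exist:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Release the lock word and wake one waiter in a single futex operation.
   Returns the number of waiters woken, or -1 on error. */
long unlock_and_wake(int *lock)
{
    /* Operation applied to uaddr2 inside the kernel: *lock = 0.
       The attached comparison (old value < 0) is never true for the
       assumed lock states, so the conditional second wake never fires. */
    int op = FUTEX_OP(FUTEX_OP_SET, 0, FUTEX_OP_CMP_LT, 0);

    /* Arguments: uaddr = lock, val = 1 (wake at most one waiter on lock),
       val2 = 0 (passed in the timeout slot), uaddr2 = lock, val3 = op. */
    return syscall(SYS_futex, lock, FUTEX_WAKE_OP, 1, NULL, lock, op);
}
```

With this shape the store that releases the lock happens in the kernel as part
of the same futex operation that wakes the waiter, rather than in user space
before the syscall.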


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (51 preceding siblings ...)
  2014-06-23  3:06 ` bugdal at aerifal dot cx
@ 2014-06-25 14:34 ` triegel at redhat dot com
  2014-06-25 16:01 ` bugdal at aerifal dot cx
                   ` (10 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: triegel at redhat dot com @ 2014-06-25 14:34 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #42 from Torvald Riegel <triegel at redhat dot com> ---
Rich, thanks for the test case, but the code doesn't check what the futex
syscall returns.  The issue I'm concerned about is not whether there are
spurious wake-ups, but that a spurious wake-up can incorrectly appear to be a
non-spurious one due to the return value of the futex syscall reporting
success.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (52 preceding siblings ...)
  2014-06-25 14:34 ` triegel at redhat dot com
@ 2014-06-25 16:01 ` bugdal at aerifal dot cx
  2014-06-25 17:40 ` triegel at redhat dot com
                   ` (9 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2014-06-25 16:01 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #43 from Rich Felker <bugdal at aerifal dot cx> ---
Indeed, it's not testing that, but the syscall does return success. Just change
the futex syscall line to if (!syscall(...)) and remove the final semicolon and
it still eventually ends with Killed.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (53 preceding siblings ...)
  2014-06-25 16:01 ` bugdal at aerifal dot cx
@ 2014-06-25 17:40 ` triegel at redhat dot com
  2014-06-25 18:03 ` bugdal at aerifal dot cx
                   ` (8 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: triegel at redhat dot com @ 2014-06-25 17:40 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #44 from Torvald Riegel <triegel at redhat dot com> ---
Looking closely, the wakeup seems to come from the CLONE_CHILD_CLEARTID that
you used, which is specified as:
       CLONE_CHILD_CLEARTID (since Linux 2.5.49)
              Erase  child thread ID at location ctid in child memory when the
              child exits, and do a wakeup on the futex at that address.

Without this flag, I don't see non-spurious wakeups anymore.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (54 preceding siblings ...)
  2014-06-25 17:40 ` triegel at redhat dot com
@ 2014-06-25 18:03 ` bugdal at aerifal dot cx
  2014-06-27  7:26 ` fweimer at redhat dot com
                   ` (7 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2014-06-25 18:03 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #45 from Rich Felker <bugdal at aerifal dot cx> ---
But the child provably hasn't exited at the point where the wake occurs, since
the subsequent tgkill syscall succeeds.

However if the issue is just that the kernel is doing these operations in the
wrong order, it probably doesn't cause other spurious wakes to be observed, and
it's probably not relevant to this PR.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (55 preceding siblings ...)
  2014-06-25 18:03 ` bugdal at aerifal dot cx
@ 2014-06-27  7:26 ` fweimer at redhat dot com
  2014-08-09 20:38 ` triegel at redhat dot com
                   ` (6 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: fweimer at redhat dot com @ 2014-06-27  7:26 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fweimer at redhat dot com


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (56 preceding siblings ...)
  2014-06-27  7:26 ` fweimer at redhat dot com
@ 2014-08-09 20:38 ` triegel at redhat dot com
  2014-08-12  2:29 ` bugdal at aerifal dot cx
                   ` (5 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: triegel at redhat dot com @ 2014-08-09 20:38 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #47 from Torvald Riegel <triegel at redhat dot com> ---
This would increase the unlock latency whenever there is any waiter (because we
let the kernel do it, and after it has found and acquired the futex lock).  I
don't have numbers for this increase, but if there's a non-negligible increase in
latency, then I wouldn't want to see this in glibc.

I still think that the only thing we need to fix is to make sure that no
program can interpret a spurious wake-up (by a pending futex_wake) as a real
wake-up.


* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (57 preceding siblings ...)
  2014-08-09 20:38 ` triegel at redhat dot com
@ 2014-08-12  2:29 ` bugdal at aerifal dot cx
  2015-01-15  8:45 ` mtk.manpages at gmail dot com
                   ` (4 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: bugdal at aerifal dot cx @ 2014-08-12  2:29 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #48 from Rich Felker <bugdal at aerifal dot cx> ---
> This would increase the unlock latency whenever there is any waiter (because
> we let the kernel do it, and after it has found and acquired the futex lock). 
> I don't have numbers for this increase, but if there's a non-negligible increase
> in latency, then I wouldn't want to see this in glibc.

Torvald, I agree you have a legitimate concern (unlock latency), but while I
don't have evidence to back this up (just high-level reasoning), I think the
difference in the time at which the atomic store happens actually works in favor
of performance with FUTEX_WAKE_OP. I'll try to explain:

In the case where there is no waiter at the time of unlock, no wake occurs,
neither by FUTEX_WAKE nor FUTEX_WAKE_OP. There's only an atomic operation (CAS,
if we want to fix the bug this whole issue tracker thread is about). So for the
sake of comparing performance, we need to consider the case where there is at
least one waiter.

Right now (with FUTEX_WAKE), there's a great deal of latency between the atomic
operation that releases the lock and the FUTEX_WAKE being dispatched, due to
kernel entry overhead, futex hash overhead, etc. During that window, a thread
which is not a waiter can race for the lock and acquire it first, despite there
being waiters. This acquisition inherently happens with very low latency, but I
think it's actually likely to be bad for performance:

If the thread which "stole" the lock has not released it by the time the thread
woken by FUTEX_WAKE gets scheduled, the latter thread will uselessly contend
for the lock again, imposing additional cache synchronization overhead and an
additional syscall to wait on the futex again. It will also wrongly get moved
to the end of the wait queue.

If on the other hand, the thread which "stole" the lock immediately releases
it, before the woken thread gets scheduled, my understanding is that it will
see that there are waiters and issue an additional FUTEX_WAKE at unlock time.
At the very least this is a wasted syscall. If there actually are two or more
waiters, it's a lot more expensive, since an extra thread wakes up only to
contend for the lock and re-wait.

As both of these situations seem undesirable to me, I think the optimal
behavior should be to minimize the latency between the atomic-release operation
that makes the lock available to other threads and the futex wake. And the only
way to make this latency small is to perform the atomic release in kernel
space.
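
The window described above, between the user-space release and the later
wake, can be illustrated with a minimal sketch. It assumes the three-state
lock word implied by the lll_unlock macro in the original report (0 =
unlocked, 1 = locked, 2 = locked with possible waiters). The names
sketch_release and sketch_trylock are hypothetical and are not glibc
functions; the sketch only models the user-space half of the unlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 = unlocked, 1 = locked,
   2 = locked with possible waiters. */
typedef atomic_int sketch_lock_t;

/* First half of a FUTEX_WAKE-style unlock: release the lock in user space.
   Returns true if a wake syscall still has to be issued afterwards. */
static bool sketch_release(sketch_lock_t *lock)
{
    int old = atomic_exchange_explicit(lock, 0, memory_order_release);
    assert(old != 0);   /* releasing an unlocked lock is a caller bug */
    return old > 1;
}

/* Fast-path acquisition by a thread that never slept on the futex. */
static bool sketch_trylock(sketch_lock_t *lock)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        lock, &expected, 1, memory_order_acquire, memory_order_relaxed);
}
```

Once sketch_release has returned true, the lock word is already 0 although
no waiter has been woken yet, so a sketch_trylock from any other thread
succeeds in that interval. Doing the release inside the kernel, as proposed
above, is meant to shrink exactly this interval.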

> I still think that the only thing we need to fix is to make sure that no
> program can interpret a spurious wake-up (by a pending futex_wake) as a real
> wake-up.

As I understand it, all of the current code treats futex wakes much like POSIX
condition variable waits: as an indication to re-check an external predicate
rather than as the bearer of notification about state. If not, things would
already be a lot more broken than they are now with regard to this issue.
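
A minimal sketch of that "re-check the predicate" pattern, assuming Linux
and the raw futex syscall. This is not the glibc code; the sketch_* names
are hypothetical and the lock word uses the same 0/1/2 convention as the
lll_unlock macro in the original report.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrappers around the raw futex syscall (Linux only). */
static long sketch_futex_wait(atomic_int *word, int expected)
{
    return syscall(SYS_futex, word, FUTEX_WAIT_PRIVATE, expected,
                   NULL, NULL, 0);
}

static long sketch_futex_wake(atomic_int *word, int count)
{
    return syscall(SYS_futex, word, FUTEX_WAKE_PRIVATE, count,
                   NULL, NULL, 0);
}

/* 0 = unlocked, 1 = locked, 2 = locked with possible waiters. */
static void sketch_lock(atomic_int *word)
{
    int expected = 0;
    if (atomic_compare_exchange_strong(word, &expected, 1))
        return;                     /* uncontended fast path */

    /* Slow path: mark the lock contended and sleep.  The return value of
       the wait is deliberately ignored: a real wake, a stale wake and a
       value mismatch are all handled the same way, by retrying the
       exchange on the lock word. */
    while (atomic_exchange(word, 2) != 0)
        sketch_futex_wait(word, 2);
}

static void sketch_unlock(atomic_int *word)
{
    int old = atomic_exchange(word, 0);
    assert(old != 0);               /* unlocking an unlocked lock is a bug */
    if (old > 1)
        sketch_futex_wake(word, 1);
}
```

Because ownership is decided only by the exchange on the lock word, a wake
that arrives late or for an unrelated reason costs one extra loop iteration
but cannot hand the lock to the wrong thread.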

On the other hand, if you eliminate all sources of spurious wakes, I think it's
possible to achieve better behavior; in particular I think it may be possible
to prevent "stealing" of locks entirely and ensure that the next futex waiter
always gets the lock on unlock. Whether this behavior is desirable for glibc or
not, I'm not sure. I'm going to do research on it as a future possibility for
musl.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (58 preceding siblings ...)
  2014-08-12  2:29 ` bugdal at aerifal dot cx
@ 2015-01-15  8:45 ` mtk.manpages at gmail dot com
  2015-05-30 18:25 ` dancol at dancol dot org
                   ` (3 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: mtk.manpages at gmail dot com @ 2015-01-15  8:45 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

Michael Kerrisk <mtk.manpages at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mtk.manpages at gmail dot com

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (59 preceding siblings ...)
  2015-01-15  8:45 ` mtk.manpages at gmail dot com
@ 2015-05-30 18:25 ` dancol at dancol dot org
  2015-06-03  4:08 ` carlos at redhat dot com
                   ` (2 subsequent siblings)
  63 siblings, 0 replies; 65+ messages in thread
From: dancol at dancol dot org @ 2015-05-30 18:25 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

--- Comment #49 from Daniel Colascione <dancol at dancol dot org> ---
What's blocking the application of the fix? I doubt spurious futex wakeups
cause problems in practice.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (60 preceding siblings ...)
  2015-05-30 18:25 ` dancol at dancol dot org
@ 2015-06-03  4:08 ` carlos at redhat dot com
  2015-06-03  4:09 ` carlos at redhat dot com
  2015-07-14 20:23 ` triegel at redhat dot com
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at redhat dot com @ 2015-06-03  4:08 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|carlos at redhat dot com           |unassigned at sourceware dot org
   Target Milestone|2.18                        |2.22

--- Comment #50 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Daniel Colascione from comment #49)
> What's blocking the application of the fix? I doubt spurious futex wakeups
> cause problems in practice.

It needs senior developers to review it, collate a final consensus, test the
patch, and commit it. It's best to ping on libc-alpha and summarize the current
status so that people like Rich and Torvald can help review.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (61 preceding siblings ...)
  2015-06-03  4:08 ` carlos at redhat dot com
@ 2015-06-03  4:09 ` carlos at redhat dot com
  2015-07-14 20:23 ` triegel at redhat dot com
  63 siblings, 0 replies; 65+ messages in thread
From: carlos at redhat dot com @ 2015-06-03  4:09 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 65+ messages in thread

* [Bug nptl/13690] pthread_mutex_unlock potentially cause invalid access
  2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
                   ` (62 preceding siblings ...)
  2015-06-03  4:09 ` carlos at redhat dot com
@ 2015-07-14 20:23 ` triegel at redhat dot com
  63 siblings, 0 replies; 65+ messages in thread
From: triegel at redhat dot com @ 2015-07-14 20:23 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13690

Torvald Riegel <triegel at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
   Target Milestone|2.22                        |2.18

--- Comment #51 from Torvald Riegel <triegel at redhat dot com> ---
The spurious wake-up issue has been discussed with the kernel community, and
the conclusion was essentially to document this behavior.  A current draft of
an updated futex manpage has the following wording:

       FUTEX_WAIT
              Returns 0 if the caller was woken up.  Note that a wake-up can
              also be caused by common futex usage patterns in unrelated code
              that happened to have previously used the futex word's memory
              location (e.g., typical futex-based implementations of Pthreads
              mutexes can cause this under some conditions).  Therefore,
              callers should always conservatively assume that a return value
              of 0 can mean a spurious wake-up, and use the futex word's value
              (i.e., the user space synchronization scheme) to decide whether
              to continue to block or not.

I've sent an updated patch for review:
https://sourceware.org/ml/libc-alpha/2015-07/msg00411.html
Compared to the patch posted here, it fixes the problem in __lll_unlock and
__lll_robust_unlock, and fixes a similar problem in
__pthread_mutex_unlock_full.
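
For readers who have not followed the whole thread, here is a minimal sketch
of the ordering problem from the original report and one way to avoid it:
copy everything that is needed out of the mutex before the releasing atomic
operation. Whether the linked patch takes exactly this shape is not shown in
this message. The struct layout, the sketch_* names and the recording stub
are assumptions made for illustration; the stub stands in for the real
futex wake so that the ordering can be observed without a syscall.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the mutex layout; only the two fields the
   report talks about are modelled. */
struct sketch_mutex {
    atomic_int lock;    /* 0 = unlocked, 1 = locked, 2 = locked + waiters */
    int kind;           /* bit 128 marks a process-shared mutex */
};

/* Recording stub: a real implementation would issue a futex wake here. */
static atomic_int *sketch_last_wake_addr;
static int sketch_last_wake_private = -1;

static void sketch_futex_wake(atomic_int *addr, int private_flag)
{
    sketch_last_wake_addr = addr;
    sketch_last_wake_private = private_flag;
}

static void sketch_mutex_unlock(struct sketch_mutex *m)
{
    /* Read everything needed from the mutex while we still own it. */
    int private_flag = m->kind & 128;
    atomic_int *futex = &m->lock;

    int old = atomic_exchange_explicit(futex, 0, memory_order_release);
    assert(old != 0);   /* unlocking an unlocked mutex is a caller bug */

    /* From here on *m may already have been destroyed and freed by another
       thread, so only the local copies are used.  The wake hands the
       address to the kernel without dereferencing it in user space. */
    if (old > 1)
        sketch_futex_wake(futex, private_flag);
}
```

The wake may still reach an address that has since been reused for another
futex, which is the spurious wake-up that the manpage wording above asks
callers to tolerate.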

-- 
You are receiving this mail because:
You are on the CC list for the bug.
From: "dcb314 at hotmail dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug ports/18674] New: trunk/sysdeps/i386/tst-auditmod3b.c:84: possible missing break ?
Date: Tue, 14 Jul 2015 20:38:00 -0000

https://sourceware.org/bugzilla/show_bug.cgi?id=18674

            Bug ID: 18674
           Summary: trunk/sysdeps/i386/tst-auditmod3b.c:84: possible
                    missing break ?
           Product: glibc
           Version: 2.21
            Status: NEW
          Severity: normal
          Priority: P2
         Component: ports
          Assignee: unassigned at sourceware dot org
          Reporter: dcb314 at hotmail dot com
                CC: carlos at redhat dot com, roland at gnu dot org
  Target Milestone: ---

[trunk/sysdeps/i386/tst-auditmod3b.c:84] ->
[trunk/sysdeps/i386/tst-auditmod3b.c:86]: (warning) Variable 'flagstr' is
reassigned a value before the old one has been used. 'break;' missing?

Source code is

    case LA_SER_DEFAULT:
      flagstr = "LA_SER_DEFAULT";
    case LA_SER_SECURE:
      flagstr = "LA_SER_SECURE";
      break;

--
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2015-07-14 20:23 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-14 14:28 [Bug nptl/13690] New: pthread_mutex_unlock potentially cause invalid access anemo at mba dot ocn.ne.jp
2012-02-14 14:29 ` [Bug nptl/13690] " anemo at mba dot ocn.ne.jp
2012-02-14 15:39 ` carlos at systemhalted dot org
2012-02-14 15:41 ` carlos at systemhalted dot org
2012-02-14 15:42 ` carlos at systemhalted dot org
2012-02-15  6:47 ` ppluzhnikov at google dot com
2012-02-15 13:18 ` anemo at mba dot ocn.ne.jp
2012-02-15 14:35 ` carlos at systemhalted dot org
2012-02-16  5:09 ` bugdal at aerifal dot cx
2012-02-16 14:43 ` anemo at mba dot ocn.ne.jp
2012-02-16 14:47 ` anemo at mba dot ocn.ne.jp
2012-02-16 15:37 ` carlos at systemhalted dot org
2012-02-16 15:41 ` carlos at systemhalted dot org
2012-02-16 16:22 ` bugdal at aerifal dot cx
2012-02-16 16:35 ` carlos at systemhalted dot org
2012-02-17  5:11 ` bugdal at aerifal dot cx
2012-02-17 13:27 ` anemo at mba dot ocn.ne.jp
2012-02-17 16:18 ` carlos at systemhalted dot org
2012-02-17 16:37 ` carlos at systemhalted dot org
2012-02-20 11:42 ` anemo at mba dot ocn.ne.jp
2012-02-22 14:57 ` carlos at systemhalted dot org
2012-02-29 16:54 ` carlos at systemhalted dot org
2012-03-07 10:30 ` drepper.fsp at gmail dot com
2012-03-07 17:53 ` bugdal at aerifal dot cx
2012-03-08  3:23 ` carlos at systemhalted dot org
2012-03-08  5:13 ` bugdal at aerifal dot cx
2012-04-28  9:57 ` coolhair24 at verizon dot net
2012-06-27 22:32 ` jsm28 at gcc dot gnu.org
2012-11-29 15:55 ` carlos_odonell at mentor dot com
2012-12-01 16:43 ` aj at suse dot de
2012-12-03 23:57 ` carlos at systemhalted dot org
2013-10-09 20:14 ` neleai at seznam dot cz
2013-12-18 20:13 ` triegel at redhat dot com
2013-12-18 20:33 ` bugdal at aerifal dot cx
2013-12-18 20:49 ` bugdal at aerifal dot cx
2013-12-20 19:08 ` lopresti at gmail dot com
2013-12-20 19:38 ` carlos at redhat dot com
2013-12-20 20:25 ` triegel at redhat dot com
2013-12-20 22:51 ` bugdal at aerifal dot cx
2014-01-03  9:10 ` kevin.dempsey at aculab dot com
2014-01-06 16:58 ` triegel at redhat dot com
2014-01-06 17:46 ` lopresti at gmail dot com
2014-01-06 20:38 ` triegel at redhat dot com
2014-01-06 20:47 ` bugdal at aerifal dot cx
2014-01-06 21:20 ` triegel at redhat dot com
2014-01-06 21:24 ` bugdal at aerifal dot cx
2014-03-28  1:27 ` dancol at dancol dot org
2014-03-28 20:07 ` tudorb at gmail dot com
2014-06-20 12:23 ` kevin.dempsey at aculab dot com
2014-06-20 18:29 ` triegel at redhat dot com
2014-06-20 19:02 ` bugdal at aerifal dot cx
2014-06-20 19:10 ` bugdal at aerifal dot cx
2014-06-23  3:06 ` bugdal at aerifal dot cx
2014-06-25 14:34 ` triegel at redhat dot com
2014-06-25 16:01 ` bugdal at aerifal dot cx
2014-06-25 17:40 ` triegel at redhat dot com
2014-06-25 18:03 ` bugdal at aerifal dot cx
2014-06-27  7:26 ` fweimer at redhat dot com
2014-08-09 20:38 ` triegel at redhat dot com
2014-08-12  2:29 ` bugdal at aerifal dot cx
2015-01-15  8:45 ` mtk.manpages at gmail dot com
2015-05-30 18:25 ` dancol at dancol dot org
2015-06-03  4:08 ` carlos at redhat dot com
2015-06-03  4:09 ` carlos at redhat dot com
2015-07-14 20:23 ` triegel at redhat dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).