* [Bug nptl/14485] File corruption race condition in robust mutex unlocking
2012-08-17 18:52 [Bug nptl/14485] New: File corruption race condition in robust mutex unlocking bugdal at aerifal dot cx
@ 2012-08-17 22:34 ` bugdal at aerifal dot cx
2014-06-17 18:35 ` fweimer at redhat dot com
` (11 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: bugdal at aerifal dot cx @ 2012-08-17 22:34 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=14485
--- Comment #1 from Rich Felker <bugdal at aerifal dot cx> 2012-08-17 22:34:24 UTC ---
It seems this bug has been known (but not reported as a bug) since 2010 or
earlier:
http://lists.freebsd.org/pipermail/svn-src-user/2010-November/003668.html
Keep in mind that the thread I'm linking has some other complaints about NPTL's
robust mutexes that are orthogonal to this bug report, such as the fact that
you can maliciously mess up other processes that map the same mutex you have
access to. These other complaints are perhaps quality-of-implementation (QoI)
issues, but not major bugs; an application has no basis to assume it's safe to
let untrusted processes map its synchronization objects.
--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
* [Bug nptl/14485] File corruption race condition in robust mutex unlocking
From: fweimer at redhat dot com @ 2014-06-17 18:35 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=14485
Florian Weimer <fweimer at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |fweimer at redhat dot com
* [Bug nptl/14485] File corruption race condition in robust mutex unlocking
From: fweimer at redhat dot com @ 2014-06-25 10:47 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=14485
Florian Weimer <fweimer at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Flags| |security-
--- Comment #2 from Florian Weimer <fweimer at redhat dot com> ---
What causes the corruption? Can you really unmap a page which is in use in a
futex system call? Do we have a test case?
* [Bug nptl/14485] File corruption race condition in robust mutex unlocking
From: bugdal at aerifal dot cx @ 2014-06-25 15:47 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=14485
--- Comment #3 from Rich Felker <bugdal at aerifal dot cx> ---
The corruption is performed by the kernel when it walks the robust list. The
basic situation is the same as in PR #13690, except that here there's actually
a potential write to the memory rather than just a read.
The sequence of events leading to corruption goes like this:
1. Thread A unlocks the process-shared, robust mutex and is preempted after the
mutex is removed from the robust list and atomically unlocked, but before it's
removed from the list_op_pending field of the robust list header.
2. Thread B locks the mutex, and, knowing by program logic that it's the last
user of the mutex, unlocks and unmaps it, allocates/maps something else that
gets assigned the same address as the shared mutex mapping, and then exits.
3. The kernel destroys the process, which involves walking each thread's robust
list and processing each thread's list_op_pending field of the robust list
header. Since thread A has a list_op_pending pointing at the address previously
occupied by the mutex, the kernel obliviously "unlocks the mutex" by writing a
0 to the address and futex-waking it. However, the kernel has instead
overwritten part of whatever mapping thread B created. If this is private
memory it (probably) doesn't matter since the process is ending anyway (but are
there race conditions where this can be seen?). If this is shared memory or a
shared file mapping, however, the kernel corrupts it.
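For illustration, the three steps above can be replayed in a small
single-threaded model. This is only a sketch of the sequence as described in
this comment, not glibc or kernel code: the struct and function names
(model_robust_head, model_unlock_preempted, model_exit_cleanup, model_race)
are hypothetical, the "mapping" is a single 4-byte slot, and the model ignores
any checks the real kernel may perform on the word's contents before writing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a thread's robust list header.
   Only the field that matters for this race is modelled. */
struct model_robust_head {
    uint32_t *list_op_pending;  /* lock word being operated on, or NULL */
};

/* Step 1: thread A's unlock, stopped just before it would clear
   list_op_pending (the "preempted" state). */
static void model_unlock_preempted(struct model_robust_head *head,
                                   uint32_t *lock_word)
{
    assert(lock_word != NULL);
    head->list_op_pending = lock_word;  /* announce the pending operation */
    *lock_word = 0;                     /* mutex released; others may lock it */
    /* Preempted here: head->list_op_pending = NULL never runs. */
}

/* Step 3: exit-time cleanup as described above. It "unlocks" whatever
   list_op_pending still points at by storing 0 there. */
static void model_exit_cleanup(struct model_robust_head *head)
{
    if (head->list_op_pending != NULL) {
        *head->list_op_pending = 0;
        head->list_op_pending = NULL;
    }
}

/* Replays steps 1 to 3 on one slot that first holds the lock word and is
   then reused for unrelated data. Returns what the slot holds afterwards. */
static uint32_t model_race(uint32_t reused_value)
{
    uint32_t slot = 1234;  /* stands for the lock word owned by thread A */
    struct model_robust_head head_a = { NULL };

    model_unlock_preempted(&head_a, &slot);  /* step 1 */
    slot = reused_value;                     /* step 2: address is reused */
    model_exit_cleanup(&head_a);             /* step 3 */
    return slot;
}
```

In this model, whatever value the reused slot held is replaced by 0 once the
cleanup runs, which is the corruption described in step 3.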
I suspect the race is difficult to hit since thread A has to get preempted at
exactly the wrong time AND thread B has to do a fair amount of work without
thread A getting scheduled again. So I'm not sure how much luck we'd have
getting a test case.
* [Bug nptl/14485] File corruption race condition in robust mutex unlocking
From: mail at nh2 dot me @ 2015-02-09 0:28 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=14485
mail at nh2 dot me changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |mail at nh2 dot me
--- Comment #4 from mail at nh2 dot me ---
@maintainers, do you acknowledge this as a bug?
I'd like to use robust mutexes in a shared memory setup, but I'm worried about
hitting this case.
* [Bug nptl/14485] File corruption race condition in robust mutex unlocking
From: carlos at redhat dot com @ 2015-02-09 20:41 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=14485
Carlos O'Donell <carlos at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |carlos at redhat dot com
--- Comment #5 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Rich Felker from comment #3)
> 1. Thread A unlocks the process-shared, robust mutex and is preempted after
> the mutex is removed from the robust list and atomically unlocked, but
> before it's removed from the list_op_pending field of the robust list header.
>
> 2. Thread B locks the mutex, and, knowing by program logic that it's the
> last user of the mutex, unlocks and unmaps it, allocates/maps something else
> that gets assigned the same address as the shared mutex mapping, and then
> exits.
Isn't this undefined behaviour? You have not specified how you established a
happens-after relationship between the destruction of the mutex by Thread B and
the last use by Thread A. In the description you give, it seems to me that
Thread A is still not done, and that the "program logic" in Thread B is
destroying an in-use mutex, which results in undefined behaviour in Thread A.
Thread B fails to establish that the destruction happens after Thread A's use
of the mutex. If Thread B truly establishes that it happens after the unlock
from Thread A, is there a problem? I don't think there is.
Did I get something wrong, Rich?
* [Bug nptl/14485] File corruption race condition in robust mutex unlocking
From: carlos at redhat dot com @ 2015-02-09 21:13 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=14485
--- Comment #6 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Carlos O'Donell from comment #5)
> (In reply to Rich Felker from comment #3)
> > 1. Thread A unlocks the process-shared, robust mutex and is preempted after
> > the mutex is removed from the robust list and atomically unlocked, but
> > before it's removed from the list_op_pending field of the robust list header.
> >
> > 2. Thread B locks the mutex, and, knowing by program logic that it's the
> > last user of the mutex, unlocks and unmaps it, allocates/maps something else
> > that gets assigned the same address as the shared mutex mapping, and then
> > exits.
>
> Isn't this undefined behaviour? You have not specified how you established a
> happens-after relationship between the destruction of the mutex by Thread B
> and the last use by Thread A. In the description you give, it seems to me
> that Thread A is still not done, and that the "program logic" in Thread B is
> destroying an in-use mutex, which results in undefined behaviour in Thread A.
> Thread B fails to establish that the destruction happens after Thread A's
> use of the mutex. If Thread B truly establishes that it happens after the
> unlock from Thread A, is there a problem? I don't think there is.
>
> Did I get something wrong, Rich?
OK, I see what's wrong.
This issue is about self-synchronizing vs. not-self-synchronizing.
http://austingroupbugs.net/view.php?id=811
Given that issue 811 has been accepted, I withdraw my complaint.
Your example is valid, and we do have a problem.
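For illustration, a "self-synchronizing" pattern can look like the sketch
below: the mutex inside an object is the only thing that tells a thread it is
the last user, and that thread destroys the mutex right after unlocking it.
This is a minimal example under stated assumptions, not code from glibc or
from issue 811; the names refobj, refobj_new and refobj_put are hypothetical,
and a plain (non-robust, process-private) mutex is used to keep it short.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical reference-counted object whose lifetime is guarded
   only by its own mutex. */
struct refobj {
    pthread_mutex_t lock;
    int refs;
};

/* Allocates an object with the given initial reference count.
   Returns NULL on failure. */
static struct refobj *refobj_new(int refs)
{
    struct refobj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    if (pthread_mutex_init(&o->lock, NULL) != 0) {
        free(o);
        return NULL;
    }
    o->refs = refs;
    return o;
}

/* Drops one reference. Returns 1 if this call destroyed the object. */
static int refobj_put(struct refobj *o)
{
    pthread_mutex_lock(&o->lock);
    assert(o->refs > 0);
    int last = (--o->refs == 0);
    pthread_mutex_unlock(&o->lock);

    if (last) {
        /* No other synchronization is used: the mutex itself told this
           thread that every other user has already dropped its reference. */
        pthread_mutex_destroy(&o->lock);
        free(o);
    }
    return last;
}
```

The question in this bug is whether an earlier caller of refobj_put() may still
be inside pthread_mutex_unlock() (or have left state behind that refers to the
mutex) at the moment the last caller destroys and frees it.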
* [Bug nptl/14485] File corruption race condition in robust mutex unlocking
From: bugdal at aerifal dot cx @ 2015-02-09 22:51 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=14485
--- Comment #7 from Rich Felker <bugdal at aerifal dot cx> ---
Carlos, there's actually a still-open related Austin Group issue, number 864:
http://austingroupbugs.net/view.php?id=864
If there's a desire from the glibc side that implementations not be required to
handle the case of self-synchronized unmapping, please have someone make that
case. I'd be interested in hearing some arguments on both sides, as I haven't
really formed my own opinion on which way it should be resolved.
* [Bug nptl/14485] File corruption race condition in robust mutex unlocking
From: bugdal at aerifal dot cx @ 2015-02-10 0:18 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=14485
--- Comment #8 from Rich Felker <bugdal at aerifal dot cx> ---
In reply to comment 4, this issue can be avoided by applications in at least
two ways:
1. Use a separate mapping of the shared synchronization object for each
user/thread that might want to unmap it.
2. Use a separate synchronization object local to the process to synchronize
unmapping of the shared mutex.
Since the only way you'd have multiple threads in the same process accessing
the shared synchronization object is by storing the pointer to the (mapping
containing the) shared mutex in some process-local object that's shared between
threads, it seems natural that you would already be synchronizing access to
this memory with another mutex (or other synchronization object) stored with
the pointer. So approach 2 seems like it's always practical, probably doesn't
involve any new synchronization, and likely makes it unnecessary/useless to
support self-synchronized unmapping. On the other hand, it may not actually be
any harder to support self-synchronized unmapping than to support
self-synchronized destruction+unmapping, which almost certainly needs to be
supported.
* [Bug nptl/14485] File corruption race condition in robust mutex unlocking
From: triegel at redhat dot com @ 2015-02-10 21:57 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=14485
Torvald Riegel <triegel at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
CC| |triegel at redhat dot com
Assignee|unassigned at sourceware dot org |triegel at redhat dot com
--- Comment #9 from Torvald Riegel <triegel at redhat dot com> ---
I agree that there is an issue if we claim that a robust mutex can be destroyed
as soon as the thread that wants to destroy it can acquire it and there is no
other thread or process trying to acquire it anymore. I don't think that
whether we consider destruction or unmap without destruction makes a
significant difference, except regarding performance of potential solutions.
* [Bug nptl/14485] File corruption race condition in robust mutex unlocking
From: bugdal at aerifal dot cx @ 2015-02-10 22:17 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=14485
--- Comment #10 from Rich Felker <bugdal at aerifal dot cx> ---
Torvald, the distinction between unmap and destroy+unmap may be significant in
that the costly synchronization could be tucked away in pthread_mutex_destroy
to deal with the latter case but not the former. So I think this realistically
comes into any performance-based argument of what the standard should mandate.
* [Bug nptl/14485] File corruption race condition in robust mutex unlocking
From: mail at nh2 dot me @ 2015-08-09 12:29 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=14485
--- Comment #11 from mail at nh2 dot me ---
Could somebody summarise, for someone not familiar with the glibc internals,
what the status of this bug is, and in which cases it is safe to use a robust
mutex in a shared memory setup? Thanks!
* [Bug nptl/14485] File corruption race condition in robust mutex unlocking
From: fweimer at redhat dot com @ 2021-10-21 15:42 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=14485
Florian Weimer <fweimer at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC|triegel at redhat dot com |