public inbox for glibc-bugs@sourceware.org
* [Bug nptl/2644] New: Race condition during unwind code after thread cancellation
@ 2006-05-07 13:56 batneil at thebatcave dot org dot uk
2006-05-07 14:02 ` [Bug nptl/2644] " batneil at thebatcave dot org dot uk
` (10 more replies)
0 siblings, 11 replies; 12+ messages in thread
From: batneil at thebatcave dot org dot uk @ 2006-05-07 13:56 UTC (permalink / raw)
To: glibc-bugs
I think there is a race condition in
nptl/sysdeps/pthread/unwind-forcedunwind.c which can lead to a segfault. I
found this in a Red Hat build, but it exists in CVS glibc too (as of May 6 2006).
Consider a call to _Unwind_ForcedUnwind when the libgcc_s_forcedunwind pointer
has not yet been set. _Unwind_ForcedUnwind checks the pointer against null and
jumps to pthread_cancel_init. Meanwhile another thread comes in and initialises
all these pointers, so the first check in pthread_cancel_init shows that
libgcc_s_getcfa is non-null, and we return to _Unwind_ForcedUnwind and call
through libgcc_s_forcedunwind. As the function pointer libgcc_s_forcedunwind has
not been marked volatile, the compiler does not need to reload its value, and
the code attempts to call the address it previously loaded, i.e. 0.
I have a test case which shows the problem and a patch which I believe fixes it.
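The sequence described above can be sketched in C roughly as follows. This is a simplified model and not the glibc source: forcedunwind_ptr, getcfa_ptr, cancel_init and forced_unwind are hypothetical stand-ins for the names in the report, and the numbered comments mark where a second thread would have to interleave for the stale null to be called. Run single-threaded the sketch behaves correctly; the problem only appears under that interleaving.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for libgcc_s_forcedunwind and libgcc_s_getcfa. */
static int (*forcedunwind_ptr)(int);
static int (*getcfa_ptr)(int);

static int real_forcedunwind(int x) { return x + 1; }
static int real_getcfa(int x) { return x; }

/* Models pthread_cancel_init: returns at once if the pointers are set. */
static inline void cancel_init(void)
{
    if (getcfa_ptr != NULL)
        return;   /* another thread has already initialised everything */
    forcedunwind_ptr = real_forcedunwind;
    getcfa_ptr = real_getcfa;
    /* On the slow path both pointers are now set. */
    assert(forcedunwind_ptr != NULL && getcfa_ptr != NULL);
}

/* Models the shape of _Unwind_ForcedUnwind described in the report. */
int forced_unwind(int arg)
{
    if (forcedunwind_ptr == NULL)   /* (1) pointer loaded and found null */
        cancel_init();              /* (2) once inlined, the early-return path
                                           contains no store to the pointer */
    /* (3) nothing forces a reload here, so on the early-return path the
           compiler may reuse the null value it loaded at (1) */
    return forcedunwind_ptr(arg);
}
```

With no second thread involved, forced_unwind(1) takes the slow path through cancel_init and returns 2.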
--
Summary: Race condition during unwind code after thread
cancellation
Product: glibc
Version: unspecified
Status: NEW
Severity: normal
Priority: P2
Component: nptl
AssignedTo: drepper at redhat dot com
ReportedBy: batneil at thebatcave dot org dot uk
CC: glibc-bugs at sources dot redhat dot com
GCC build triplet: i686-redhat-linux
GCC host triplet: i686-redhat-linux
GCC target triplet: i686-redhat-linux
http://sourceware.org/bugzilla/show_bug.cgi?id=2644
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
* [Bug nptl/2644] Race condition during unwind code after thread cancellation
From: batneil at thebatcave dot org dot uk @ 2006-05-07 14:02 UTC (permalink / raw)
To: glibc-bugs
------- Additional Comments From batneil at thebatcave dot org dot uk 2006-05-07 14:01 -------
Created an attachment (id=1004)
--> (http://sourceware.org/bugzilla/attachment.cgi?id=1004&action=view)
Patch to make the unwind function pointers volatile.
I think this patch fixes the bug. It's a little ugly because I wasn't sure of
the syntax for making the function pointers volatile.
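On the syntax question, a minimal sketch of how a function pointer object can be declared volatile in C. The attachment itself is not reproduced in this thread, so this is not necessarily what the patch does; handler, handler_alt and handler_fn are made-up names. The qualifier goes after the '*' (or on a typedef of the pointer type); written before the return type it would apply to the return type instead.

```c
#include <assert.h>
#include <stddef.h>

static int add_one(int x) { return x + 1; }

/* 'volatile' after the '*' qualifies the pointer object itself, so each
   access to handler in the source is a real load from memory. */
static int (*volatile handler)(int);

/* The same thing spelled through a typedef (hypothetical name). */
typedef int (*handler_fn)(int);
static volatile handler_fn handler_alt;

void install_handlers(void)
{
    handler = add_one;
    handler_alt = add_one;
}

int call_handler(int arg)
{
    int (*fn)(int) = handler;   /* one volatile read into a local copy */
    assert(fn != NULL);
    return fn(arg);
}

int call_handler_alt(int arg)
{
    handler_fn fn = handler_alt;
    assert(fn != NULL);
    return fn(arg);
}
```

Copying the pointer into a local before the call keeps the number of volatile reads explicit: one per call, rather than one for the null check and another for the call.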
--
* [Bug nptl/2644] Race condition during unwind code after thread cancellation
From: batneil at thebatcave dot org dot uk @ 2006-05-07 14:05 UTC (permalink / raw)
To: glibc-bugs
------- Additional Comments From batneil at thebatcave dot org dot uk 2006-05-07 14:04 -------
Created an attachment (id=1005)
--> (http://sourceware.org/bugzilla/attachment.cgi?id=1005&action=view)
Test case which shows the problem
This test shows the bug for me on an 8-way Xeon. I run it with the following
command:
for i in `seq 0 100 10000`; do echo $i; for j in `seq 1 100`; do
./pthread_exit_race; done; done
and eventually I get some segfaults:
0
100
received segv!
received segv!
200
...
I built the test with:
g++ pthread_exit_race.cc -o pthread_exit_race -lpthread -DMAXTHREADS=100
--
* [Bug nptl/2644] Race condition during unwind code after thread cancellation
From: batneil at thebatcave dot org dot uk @ 2006-05-07 14:14 UTC (permalink / raw)
To: glibc-bugs
------- Additional Comments From batneil at thebatcave dot org dot uk 2006-05-07 14:14 -------
To clarify how the problem manifests, here's a snippet of the compiled
_Unwind_ForcedUnwind code from the unpatched libc:
00941360 <_Unwind_ForcedUnwind>:
[...]
941377: 8b 93 ac 21 00 00 mov 0x21ac(%ebx),%edx
94137d: 89 7d fc mov %edi,0xfffffffc(%ebp)
941380: 85 d2 test %edx,%edx
941382: 74 23 je 9413a7
941384: 8b 75 10 mov 0x10(%ebp),%esi
941387: 8b 4d 0c mov 0xc(%ebp),%ecx
94138a: 8b 45 08 mov 0x8(%ebp),%eax
94138d: 89 74 24 08 mov %esi,0x8(%esp)
941391: 89 4c 24 04 mov %ecx,0x4(%esp)
941395: 89 04 24 mov %eax,(%esp)
941398: ff d2 call *%edx
[...]
9413a6: c3 ret
9413a7: 8b 83 b0 21 00 00 mov 0x21b0(%ebx),%eax
9413ad: 85 c0 test %eax,%eax
9413af: 75 d3 jne 941384
If the test at 941380 shows that edx (which is libgcc_s_forcedunwind) contains
0, and the test at 9413ad shows that eax (libgcc_s_getcfa) is non-zero, we jump
back to 941384 and try to call edx without having changed it from 0.
After patching and rebuilding on my system the following code results:
0000c360 <_Unwind_ForcedUnwind>:
[...]
c377: 8b 83 ac 21 00 00 mov 0x21ac(%ebx),%eax
c37d: 89 7d fc mov %edi,0xfffffffc(%ebp)
c380: 85 c0 test %eax,%eax
c382: 74 29 je c3ad <_Unwind_ForcedUnwind+0x4d>
c384: 8b 7d 10 mov 0x10(%ebp),%edi
c387: 8b 75 0c mov 0xc(%ebp),%esi
c38a: 8b 4d 08 mov 0x8(%ebp),%ecx
c38d: 89 7c 24 08 mov %edi,0x8(%esp)
c391: 8b 83 ac 21 00 00 mov 0x21ac(%ebx),%eax
c397: 89 74 24 04 mov %esi,0x4(%esp)
c39b: 89 0c 24 mov %ecx,(%esp)
c39e: ff d0 call *%eax
[...]
c3ac: c3 ret
c3ad: 8b 93 b0 21 00 00 mov 0x21b0(%ebx),%edx
c3b3: 85 d2 test %edx,%edx
c3b5: 75 cd jne c384 <_Unwind_ForcedUnwind+0x24>
In this case eax is used for libgcc_s_forcedunwind, and after returning from the
(inlined) pthread_cancel_init we now reload eax at c391 before calling it.
This bug seems to exist in current CVS and could probably affect most platforms.
I've searched for any previous reports and the closest I got was the patch
proposed in http://sourceware.org/ml/libc-hacker/2005-11/msg00010.html, which
fixes a different problem with the same code.
I can provide more details of compiler versions etc. if required.
--
* [Bug nptl/2644] Race condition during unwind code after thread cancellation
From: drepper at redhat dot com @ 2006-05-07 17:33 UTC (permalink / raw)
To: glibc-bugs
------- Additional Comments From drepper at redhat dot com 2006-05-07 17:32 -------
What compiler do you use? Mine doesn't generate this code.
Anyway, this is a pretty heavy-handed solution. It forces the compiler to load
the value twice, while in fact you only want to reload when pthread_cancel_init
was called.
Instead, try replacing each call of pthread_cancel_init with an appropriate call like
{
  pthread_cancel_init ();
  asm volatile ("" : "=m" (libgcc_s_result));
}
--
What      |Removed   |Added
----------------------------------------------------------------------------
Status    |NEW       |WAITING
* [Bug nptl/2644] Race condition during unwind code after thread cancellation
From: batneil at thebatcave dot org dot uk @ 2006-05-07 18:37 UTC (permalink / raw)
To: glibc-bugs
------- Additional Comments From batneil at thebatcave dot org dot uk 2006-05-07 18:37 -------
Thanks for the quick response. Your solution does sound better; I'm just
rebuilding to check that it works as expected.
The compiler I'm using is GCC 3.4.3; to be precise, it's version 3.4.3 20041212
(Red Hat 3.4.3-9.EL4). If it's of any use, the first piece of code I quoted
matches that from the libpthread.so.0 that comes from the glibc-2.3.4-2 package
on Red Hat EL 4.
I'm building with rpmbuild at the moment, but from what I can tell the CFLAGS
are set to '-march=i686 -DNDEBUG=1 -g -O3'. Please let me know if you need more
details on the configuration.
It seems to me that although not all compilers will necessarily emit code that
is unsafe, it is at least valid for them to do so from the current source.
--
* [Bug nptl/2644] Race condition during unwind code after thread cancellation
From: batneil at thebatcave dot org dot uk @ 2006-05-07 22:15 UTC (permalink / raw)
To: glibc-bugs
------- Additional Comments From batneil at thebatcave dot org dot uk 2006-05-07 22:15 -------
Created an attachment (id=1006)
--> (http://sourceware.org/bugzilla/attachment.cgi?id=1006&action=view)
Patch to force reload of the pointers only when required
As discussed, this patch forces the function pointers to be reloaded when
required, without needing them all to be marked as volatile. I've used the '+'
modifier in the asm; when I used '=', gcc decided to dead-code one of the stores
and everything broke.
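A minimal sketch of the barrier approach described here, using GCC extended asm. It is not the attached patch: hook, hook_ready, init_hooks and call_hook are hypothetical names standing in for the real pointers and functions. With "+m" the empty asm is declared to both read and write the pointer, so the store made during initialisation has to be kept and the later call has to reload the value. With "=m" the asm is declared write-only, which is consistent with gcc treating the preceding store as dead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the lazily initialised unwind pointers. */
static int (*hook)(int);
static int (*hook_ready)(int);

static int real_hook(int x) { return x * 2; }

/* Models the inlined initialiser with its early return. */
static inline void init_hooks(void)
{
    if (hook_ready != NULL)
        return;
    hook = real_hook;
    hook_ready = real_hook;
}

int call_hook(int arg)
{
    if (hook == NULL) {
        init_hooks();
        /* Empty asm with a "+m" operand: the compiler must assume hook is
           read and rewritten here, so it cannot drop the store above or
           reuse the null value loaded for the check. */
        __asm__ __volatile__ ("" : "+m" (hook));
    }
    assert(hook != NULL);   /* evaluated after the barrier on the slow path */
    return hook(arg);
}
```

The barrier sits only on the slow path, so the common case (pointer already set) still performs a single load.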
--
What                          |Removed   |Added
----------------------------------------------------------------------------
Attachment #1004 is obsolete  |0         |1
* [Bug nptl/2644] Race condition during unwind code after thread cancellation
From: batneil at thebatcave dot org dot uk @ 2006-05-07 22:19 UTC (permalink / raw)
To: glibc-bugs
------- Additional Comments From batneil at thebatcave dot org dot uk 2006-05-07 22:19 -------
For completeness, here's the compiler output for the new patched version:
0000c360 <_Unwind_ForcedUnwind>:
[...]
c377: 8b 93 ac 21 00 00 mov 0x21ac(%ebx),%edx
c37d: 89 7d fc mov %edi,0xfffffffc(%ebp)
c380: 85 d2 test %edx,%edx
c382: 74 23 je c3a7 <_Unwind_ForcedUnwind+0x47>
c384: 8b 75 10 mov 0x10(%ebp),%esi
c387: 8b 4d 0c mov 0xc(%ebp),%ecx
c38a: 8b 45 08 mov 0x8(%ebp),%eax
c38d: 89 74 24 08 mov %esi,0x8(%esp)
c391: 89 4c 24 04 mov %ecx,0x4(%esp)
c395: 89 04 24 mov %eax,(%esp)
c398: ff d2 call *%edx
[...]
c3a6: c3 ret
c3a7: 8b 83 b0 21 00 00 mov 0x21b0(%ebx),%eax
c3ad: 85 c0 test %eax,%eax
c3af: 74 08 je c3b9 <_Unwind_ForcedUnwind+0x59>
c3b1: 8b 93 ac 21 00 00 mov 0x21ac(%ebx),%edx
c3b7: eb cb jmp c384 <_Unwind_ForcedUnwind+0x24>
The common case is now just as it was before, but in the case where we have to
do the initialisation the value is correctly loaded at c3b1.
I haven't finished testing with this version, but it looks good so far.
--
* [Bug nptl/2644] Race condition during unwind code after thread cancellation
From: drepper at redhat dot com @ 2006-05-08 1:00 UTC (permalink / raw)
To: glibc-bugs
------- Additional Comments From drepper at redhat dot com 2006-05-08 01:00 -------
I checked in the patch.
(Next time, change the state from WAITING when you add comments.)
--
What        |Removed   |Added
----------------------------------------------------------------------------
Status      |WAITING   |RESOLVED
Resolution  |          |FIXED
* [Bug nptl/2644] Race condition during unwind code after thread cancellation
From: batneil at thebatcave dot org dot uk @ 2006-05-08 9:36 UTC (permalink / raw)
To: glibc-bugs
------- Additional Comments From batneil at thebatcave dot org dot uk 2006-05-08 09:36 -------
(In reply to comment #8)
> I checked in the patch.
Thank you!
> (Next time, change the state from WAITING when you add comments.)
Sorry, will do.
--
* [Bug nptl/2644] Race condition during unwind code after thread cancellation
From: jakub at redhat dot com @ 2006-05-08 11:28 UTC (permalink / raw)
To: glibc-bugs
------- Additional Comments From jakub at redhat dot com 2006-05-08 11:27 -------
I think instead of the patch you checked in we should just mark
pthread_cancel_init with __attribute__((noinline)). That will do everything
that's needed and is desirable anyway.
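A sketch of what this suggestion might look like, using the GCC noinline attribute. The names lazy_fn, lazy_ready, lazy_init and call_lazy are hypothetical, and whether the glibc source was later changed this way is not stated in the thread. The idea is that the initialiser stays a real call that may store to the pointer, so the caller has to load the pointer again afterwards instead of reusing the value from its null check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the lazily initialised unwind pointers. */
static int (*lazy_fn)(int);
static int (*lazy_ready)(int);

static int real_fn(int x) { return x + 1; }

/* Kept out of line: the early-return path can no longer be merged into
   the caller, where it would look like "pointer unchanged". */
static void __attribute__((noinline)) lazy_init(void)
{
    if (lazy_ready != NULL)
        return;
    lazy_fn = real_fn;
    lazy_ready = real_fn;
}

int call_lazy(int arg)
{
    if (lazy_fn == NULL)
        lazy_init();        /* a real call that may write lazy_fn */
    assert(lazy_fn != NULL);
    return lazy_fn(arg);    /* loaded again after the call */
}
```

Compared with the asm barrier, this needs no change at the call sites, only at the definition of the initialiser.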
--
* [Bug nptl/2644] Race condition during unwind code after thread cancellation
From: jakub at redhat dot com @ 2006-11-28 10:31 UTC (permalink / raw)
To: glibc-bugs
------- Additional Comments From jakub at redhat dot com 2006-11-28 10:31 -------
*** Bug 3597 has been marked as a duplicate of this bug. ***
--
What  |Removed   |Added
----------------------------------------------------------------------------
CC    |          |rsomla at mysql dot com
Thread overview: 12+ messages
2006-05-07 13:56 [Bug nptl/2644] New: Race condition during unwind code after thread cancellation batneil at thebatcave dot org dot uk
2006-05-07 14:02 ` [Bug nptl/2644] " batneil at thebatcave dot org dot uk
2006-05-07 14:05 ` batneil at thebatcave dot org dot uk
2006-05-07 14:14 ` batneil at thebatcave dot org dot uk
2006-05-07 17:33 ` drepper at redhat dot com
2006-05-07 18:37 ` batneil at thebatcave dot org dot uk
2006-05-07 22:15 ` batneil at thebatcave dot org dot uk
2006-05-07 22:19 ` batneil at thebatcave dot org dot uk
2006-05-08 1:00 ` drepper at redhat dot com
2006-05-08 9:36 ` batneil at thebatcave dot org dot uk
2006-05-08 11:28 ` jakub at redhat dot com
2006-11-28 10:31 ` jakub at redhat dot com