From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 28237 invoked by alias); 14 Jan 2014 14:51:49 -0000
Mailing-List: contact glibc-bugs-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: glibc-bugs-owner@sourceware.org
Received: (qmail 28197 invoked by uid 48); 14 Jan 2014 14:51:46 -0000
From: "carlos at redhat dot com"
To: glibc-bugs@sourceware.org
Subject: [Bug nptl/12683] Race conditions in pthread cancellation
Date: Tue, 14 Jan 2014 14:51:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: nptl
X-Bugzilla-Version: unspecified
X-Bugzilla-Severity: critical
X-Bugzilla-Who: carlos at redhat dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at sourceware dot org
X-Bugzilla-Target-Milestone: 2.19
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-01/txt/msg00236.txt.bz2

https://sourceware.org/bugzilla/show_bug.cgi?id=12683

--- Comment #17 from Carlos O'Donell ---
(In reply to Rich Felker from comment #16)
> There are several points at which the cancellation signal could arrive:
>
> 1. Before the final "testcancel" before the syscall is made.
> 2. Between the "testcancel" and the syscall.
> 3. While the syscall is blocked and no side effects have yet taken place.
> 4. While the syscall is blocked but with some side effects already having
> taken place (e.g. a partial read or write).
> 5. After the syscall has returned.
>
> You want to act on cancellation in cases 1-3 but not in case 4 or 5.
> Handling case 1 is of course trivial, since you're about to do a conditional
> branch based on whether the thread has received a cancellation request;
> nothing needs to be done in the signal handler (but it also wouldn't hurt to
> handle it from the signal handler). Case 2 can be caught by the signal
> handler determining that the saved program counter (from the ucontext_t) is
> in some address range beginning just before the "testcancel" and ending with
> the syscall instruction.
>
> The rest of the cases are the "tricky" part but it turns out they too are
> easy:
>
> Case 3: In this case, except for certain syscalls that ALWAYS fail with
> EINTR even for non-interrupting signals, the kernel will reset the program
> counter to point at the syscall instruction during signal handling, so that
> the syscall is restarted when the signal handler returns. So, from the
> signal handler's standpoint, this looks the same as case 2, and thus it's
> taken care of.
>
> Case 4: In this case, the kernel cannot restart the syscall; when it's
> interrupted by a signal, the kernel must cause the syscall to return with
> whatever partial result it obtained (e.g. partial read or write). In this
> case, the saved program counter points just after the syscall instruction,
> so the signal handler won't act on cancellation.
>
> Case 5: OK, I lied. This one is trivial too since the program counter is
> past the syscall instruction already.

Excellent. I like your idea then. It seems like a list of PCs using either
markers or DWARF2 is the way to go here.

> What about syscalls that fail with EINTR even when the signal handler is
> non-interrupting? In this case, the syscall wrapper code can just check the
> cancellation flag when the errno result is EINTR, and act on cancellation if
> it's set. Note that an exception needs to be made for close(), where EINTR
> should be treated as EINPROGRESS and thus not permit cancellation to take
> place.
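
The wrapper-side check described in the quoted paragraph above could look roughly like the sketch below. It is only an illustration of the decision, not glibc code: the `wrapped_call` enum, the function name, and the `cancel_pending` parameter are hypothetical names introduced here.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical tag for the wrapped call; only the close() distinction
 * matters for this sketch. */
enum wrapped_call { CALL_CLOSE, CALL_OTHER };

/*
 * Decide, after a wrapped syscall has returned, whether the wrapper
 * should act on a pending cancellation request.
 *   ret            - value returned by the syscall wrapper (-1 on failure)
 *   err            - errno value observed after the call
 *   cancel_pending - whether a cancellation request has been recorded
 */
bool should_cancel_after_eintr(enum wrapped_call call, long ret, int err,
                               bool cancel_pending)
{
    if (ret != -1 || err != EINTR)
        return false;   /* call completed, or failed for another reason */
    if (call == CALL_CLOSE)
        return false;   /* EINTR from close() is treated like EINPROGRESS */
    return cancel_pending;
}
```

A wrapper would call this only on the return path of syscalls that fail with EINTR even under SA_RESTART; the restartable cases are left to the check in the signal handler.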

We'll need a big disclaimer about close and a detailed comment. I know some
of the details there, specifically that although EINTR has been returned
the close will complete.

> BTW, I should justify why the signal handler should be non-interrupting
> (SA_RESTART): if it weren't, you would risk causing spurious EINTR in
> programs not written to handle it, e.g. if the user incorrectly sends signal
> 32/33 to the process or if pthread_cancel were called while cancellation is
> disabled in the target thread. The kernel folks have spent a great deal of
> effort getting rid of spurious EINTRs (which cause all sorts of ugly bugs)
> and it would be a shame to reintroduce them. Also it doesn't buy you
> anything moving the cancellation action to the EINTR check after the syscall
> returns; the same check in the signal handler that handles case 2 above also
> handles the case of restartable syscalls correctly, for free.

That makes sense.
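
For reference, a minimal sketch of the two pieces discussed in this thread: installing the handler with SA_RESTART, and the address-range test on the saved program counter. All function names here are hypothetical, the region bounds are assumed to be supplied by the caller (for example from markers around the syscall), and reading the PC out of the ucontext_t is architecture specific, so the handler body is left as a stub.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * True if the saved PC lies in the region that starts just before the
 * "testcancel" and ends at the syscall instruction (inclusive). A PC just
 * past the syscall instruction (cases 4 and 5) falls outside the region.
 */
bool pc_in_cancel_region(uintptr_t pc, uintptr_t start, uintptr_t syscall_insn)
{
    return pc >= start && pc <= syscall_insn;
}

/* Stub handler: a real one would extract the saved PC from uctx
 * (architecture specific) and call pc_in_cancel_region(). */
static void cancel_handler(int sig, siginfo_t *info, void *uctx)
{
    (void)sig;
    (void)info;
    (void)uctx;
}

/* Install the handler as non-interrupting, so that restartable syscalls
 * are restarted instead of failing with a spurious EINTR. */
int install_cancel_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = cancel_handler;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

The signal number is a parameter only so that the sketch can be tried with an ordinary signal such as SIGUSR1; which signal the implementation uses internally is outside the scope of this example.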