[RFC] Propose fix for race conditions in pthread cancellation (bz#12683)

public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed

* [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
@ 2014-09-10 21:48 Adhemerval Zanella
  2014-09-10 22:00 ` Rich Felker
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Adhemerval Zanella @ 2014-09-10 21:48 UTC (permalink / raw)
  To: GNU C. Library, Rich Felker

Hi all,

I have summarized in [1] the current issues with GLIBC pthread cancellation system,
the current GLIBC implementation and the proposed solution by Rich Felker with the
adjustment required to enabled it on GLIBC.

It is still heavily WIP and I'm still planning to add more content, so any question,
comments, advices are welcomed.

The GLIBC adjustment to proposed solution is in fact the current work I'm doing to
rewrite pthread cancellation subsystem [2]. My code still needs a *lot* of cleanup,
but initial results are promising. It is building on both powerpc64 and x86_64 
(it won't build on others platforms basically because I rewrite the way cancelable
syscalls are done).

Current NPTL testcase are all passing but:

FAIL: nptl/tst-cancel-wrappers
FAIL: nptl/tst-cancel20
FAIL: nptl/tst-cancel21-static
FAIL: nptl/tst-cancel4
FAIL: nptl/tst-cancel5
FAIL: nptl/tst-cancelx20
FAIL: nptl/tst-cancelx21
FAIL: nptl/tst-cancelx4
FAIL: nptl/tst-cancelx5
FAIL: nptl/tst-detach1

The 'nptl/tst-cancel-wrappers' is expected since I get rid of the 
enable_asynccancel/disable_asynccancel function, but the other are due the fact now
cancellation *will not* on one important case:

* syscall is blocked but with some side effects already having taken place (for
  instance partial read/write/send/etc.)

This is the cases for tst-cancel[4/5] that checks for cancelable write and send
and the way the test is code, kernel IP address from signal handler is *after*
syscall, indicating partial read/send.  Similar cases occurs for tst-cancel[20|21],
where the read returns after the syscall in pipe reading. I'm still checking
nptl/tst-detach1.

Anyway, now I would like comments about proposed solution and if the cases for
new failures should not be allowed or if testcases now should be adjusted.

I also note that this new implementation shows correct behavior on the testcases
from bug reported and replicated on bugzilla: first one does not show leaked
file descriptors and second correctly hangs.

[1] https://sourceware.org/glibc/wiki/Release/2.21/bz12683
[2] https://github.com/zatrazz/glibc/commits/master-bz12683

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-10 21:48 [RFC] Propose fix for race conditions in pthread cancellation (bz#12683) Adhemerval Zanella
@ 2014-09-10 22:00 ` Rich Felker
  2014-09-10 22:11   ` Adhemerval Zanella
  2014-09-12 14:44 ` Torvald Riegel
  2014-09-15 12:49 ` Florian Weimer
  2 siblings, 1 reply; 16+ messages in thread
From: Rich Felker @ 2014-09-10 22:00 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: GNU C. Library

On Wed, Sep 10, 2014 at 06:47:58PM -0300, Adhemerval Zanella wrote:
> Hi all,
> 
> I have summarized in [1] the current issues with GLIBC pthread cancellation system,
> the current GLIBC implementation and the proposed solution by Rich Felker with the
> adjustment required to enabled it on GLIBC.
> 
> It is still heavily WIP and I'm still planning to add more content, so any question,
> comments, advices are welcomed.
> 
> The GLIBC adjustment to proposed solution is in fact the current work I'm doing to
> rewrite pthread cancellation subsystem [2]. My code still needs a *lot* of cleanup,
> but initial results are promising. It is building on both powerpc64 and x86_64 
> (it won't build on others platforms basically because I rewrite the way cancelable
> syscalls are done).
> 
> Current NPTL testcase are all passing but:
> 
> FAIL: nptl/tst-cancel-wrappers
> FAIL: nptl/tst-cancel20
> FAIL: nptl/tst-cancel21-static
> FAIL: nptl/tst-cancel4
> FAIL: nptl/tst-cancel5
> FAIL: nptl/tst-cancelx20
> FAIL: nptl/tst-cancelx21
> FAIL: nptl/tst-cancelx4
> FAIL: nptl/tst-cancelx5
> FAIL: nptl/tst-detach1
> 
> The 'nptl/tst-cancel-wrappers' is expected since I get rid of the 
> enable_asynccancel/disable_asynccancel function, but the other are due the fact now
> cancellation *will not* on one important case:
> 
> * syscall is blocked but with some side effects already having taken place (for
>   instance partial read/write/send/etc.)

It's important that cancellation NOT be acted upon in these cases. The
side effects for them are not equivalent to EINTR (EINTR is only
allowed when no data was transferred) and thus acting on cancellation
would violate the rule that the side effects on cancellation must be
as if the call terminated with EINTR.

It is desirable that the partial read/write immediately return in this
case, rather than sitting around waiting for more data to be
transferred, and unless you go out of your way to get a different
behavior, that should come for free with most natural implementations
anyway. Then cancellation of course remains pending and will be acted
upon as soon as a cancellation point is called again. The important
thing is that the application has now had the ability to record what
side effects were completed, and which ones remain incomplete, so that
it has a consistent state when cancellation is acted upon.

> This is the cases for tst-cancel[4/5] that checks for cancelable write and send
> and the way the test is code, kernel IP address from signal handler is *after*
> syscall, indicating partial read/send.  Similar cases occurs for tst-cancel[20|21],
> where the read returns after the syscall in pipe reading. I'm still checking
> nptl/tst-detach1.

Yes, that's exactly how it's supposed to work.

Rich

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-10 22:00 ` Rich Felker
@ 2014-09-10 22:11   ` Adhemerval Zanella
  0 siblings, 0 replies; 16+ messages in thread
From: Adhemerval Zanella @ 2014-09-10 22:11 UTC (permalink / raw)
  To: libc-alpha

On 10-09-2014 19:00, Rich Felker wrote:
> On Wed, Sep 10, 2014 at 06:47:58PM -0300, Adhemerval Zanella wrote:
>> Hi all,
>>
>> I have summarized in [1] the current issues with GLIBC pthread cancellation system,
>> the current GLIBC implementation and the proposed solution by Rich Felker with the
>> adjustment required to enabled it on GLIBC.
>>
>> It is still heavily WIP and I'm still planning to add more content, so any question,
>> comments, advices are welcomed.
>>
>> The GLIBC adjustment to proposed solution is in fact the current work I'm doing to
>> rewrite pthread cancellation subsystem [2]. My code still needs a *lot* of cleanup,
>> but initial results are promising. It is building on both powerpc64 and x86_64 
>> (it won't build on others platforms basically because I rewrite the way cancelable
>> syscalls are done).
>>
>> Current NPTL testcase are all passing but:
>>
>> FAIL: nptl/tst-cancel-wrappers
>> FAIL: nptl/tst-cancel20
>> FAIL: nptl/tst-cancel21-static
>> FAIL: nptl/tst-cancel4
>> FAIL: nptl/tst-cancel5
>> FAIL: nptl/tst-cancelx20
>> FAIL: nptl/tst-cancelx21
>> FAIL: nptl/tst-cancelx4
>> FAIL: nptl/tst-cancelx5
>> FAIL: nptl/tst-detach1
>>
>> The 'nptl/tst-cancel-wrappers' is expected since I get rid of the 
>> enable_asynccancel/disable_asynccancel function, but the other are due the fact now
>> cancellation *will not* on one important case:
>>
>> * syscall is blocked but with some side effects already having taken place (for
>>   instance partial read/write/send/etc.)
> It's important that cancellation NOT be acted upon in these cases. The
> side effects for them are not equivalent to EINTR (EINTR is only
> allowed when no data was transferred) and thus acting on cancellation
> would violate the rule that the side effects on cancellation must be
> as if the call terminated with EINTR.
>
> It is desirable that the partial read/write immediately return in this
> case, rather than sitting around waiting for more data to be
> transferred, and unless you go out of your way to get a different
> behavior, that should come for free with most natural implementations
> anyway. Then cancellation of course remains pending and will be acted
> upon as soon as a cancellation point is called again. The important
> thing is that the application has now had the ability to record what
> side effects were completed, and which ones remain incomplete, so that
> it has a consistent state when cancellation is acted upon.

I do agree that cancellation should not act upon the cases described and my
idea is in fact adjust testcase to check for partial read and call 
pthread_testcancel to check for pending cancellations. 

However this change current GLIBC expected behavior (which I do think is not
correct regarding the issues described), so I would like to know if maintainer
seems reasonable to change its behavior.

>
>> This is the cases for tst-cancel[4/5] that checks for cancelable write and send
>> and the way the test is code, kernel IP address from signal handler is *after*
>> syscall, indicating partial read/send.  Similar cases occurs for tst-cancel[20|21],
>> where the read returns after the syscall in pipe reading. I'm still checking
>> nptl/tst-detach1.
> Yes, that's exactly how it's supposed to work.
>
> Rich
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-10 21:48 [RFC] Propose fix for race conditions in pthread cancellation (bz#12683) Adhemerval Zanella
  2014-09-10 22:00 ` Rich Felker
@ 2014-09-12 14:44 ` Torvald Riegel
  2014-09-12 15:32   ` Rich Felker
                     ` (2 more replies)
  2014-09-15 12:49 ` Florian Weimer
  2 siblings, 3 replies; 16+ messages in thread
From: Torvald Riegel @ 2014-09-12 14:44 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: GNU C. Library, Rich Felker

On Wed, 2014-09-10 at 18:47 -0300, Adhemerval Zanella wrote:
> I have summarized in [1] the current issues with GLIBC pthread cancellation system,
> the current GLIBC implementation and the proposed solution by Rich Felker with the
> adjustment required to enabled it on GLIBC.
> 
> It is still heavily WIP and I'm still planning to add more content, so any question,
> comments, advices are welcomed.
> 
> The GLIBC adjustment to proposed solution is in fact the current work I'm doing to
> rewrite pthread cancellation subsystem [2]. My code still needs a *lot* of cleanup,
> but initial results are promising. It is building on both powerpc64 and x86_64 
> (it won't build on others platforms basically because I rewrite the way cancelable
> syscalls are done).

The general direction seems good to me.  Thanks for working on this.  I
have some questions/comments about the steps you outline in the "GLIBC
adjustment" section on the wiki page:

"4. Adjust nptl/pthread_cancel.c to send an signal instead of acting
directly. This avoid synchronization issues about updating the
cancellation status and also focus the logic on signal handler and
cancellation syscall code."

The current code seems focused on trying to avoid sending signal if
possible, seemingly because of performance concerns.  My gut feeling is
that cancellation doesn't need to have low latency, but do we have
sufficient confidence that always sending the signal is alright?

I believe we can still avoid sending the signal with the new scheme;
what we'd have to do is let the code for cancellable syscalls mark
(using atomic ops) whether it's currently executing or not (ie,
fetch_and_set to true and fetch_and_set to previous value); then the
pthread_cancel should see whether it needs to send a signal or not.
That adds a bit more synchronization, but seems possible.

"8. Adjust 'lowlevellock.h' arch-specific implementations to provide
cancelable futex calls (used in libpthread code)."

Only FUTEX_WAIT should be cancellable, I suppose.

I've been thinking about whether it would be easier for the
implementation of cancellable pthreads functions to just call a
noncancellable FUTEX_WAIT and handle EINTR properly (by not retrying the
wait but calling pthread_testcancel instead so that any pending
cancellation is acted upon).  However, this seems to save less
complexity than just doing for FUTEX_WAIT what we do for all the other
cancellable syscalls, so it's probably not worth it.

I'd add another thing to the list of steps, though: For future-proofing,
it would be good pthread_setcancelstate and pthread_setcanceltype would
have an explicit compiler barrier in them.  If at some point we have LTO
including those functions, we use C11 atomics, and compilers start
optimizing those more aggressively, then I believe we're at risk that
the compiler reorders code across the atomics in pthread_setcancelstate;
this could lead to cancellation handlers seeing unexpected state when
they are actually run.  C11 has tighter rules for what's allowed in
signal handlers (IIRC, these are still under discussion, at least in the
C++ committee) than what POSIX requires, yet the cancellation handler is
similar to a signal handler.  A compiler barrier should hopefully
prevent any issues due to that mismatch.  Thoughts?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-12 14:44 ` Torvald Riegel
@ 2014-09-12 15:32   ` Rich Felker
  2014-09-12 16:11     ` Torvald Riegel
  2014-09-12 17:33   ` Joseph S. Myers
  2014-09-17 22:13   ` Adhemerval Zanella
  2 siblings, 1 reply; 16+ messages in thread
From: Rich Felker @ 2014-09-12 15:32 UTC (permalink / raw)
  To: Torvald Riegel; +Cc: Adhemerval Zanella, GNU C. Library

On Fri, Sep 12, 2014 at 04:44:26PM +0200, Torvald Riegel wrote:
> On Wed, 2014-09-10 at 18:47 -0300, Adhemerval Zanella wrote:
> > I have summarized in [1] the current issues with GLIBC pthread cancellation system,
> > the current GLIBC implementation and the proposed solution by Rich Felker with the
> > adjustment required to enabled it on GLIBC.
> > 
> > It is still heavily WIP and I'm still planning to add more content, so any question,
> > comments, advices are welcomed.
> > 
> > The GLIBC adjustment to proposed solution is in fact the current work I'm doing to
> > rewrite pthread cancellation subsystem [2]. My code still needs a *lot* of cleanup,
> > but initial results are promising. It is building on both powerpc64 and x86_64 
> > (it won't build on others platforms basically because I rewrite the way cancelable
> > syscalls are done).
> 
> The general direction seems good to me.  Thanks for working on this.  I
> have some questions/comments about the steps you outline in the "GLIBC
> adjustment" section on the wiki page:
> 
> "4. Adjust nptl/pthread_cancel.c to send an signal instead of acting
> directly. This avoid synchronization issues about updating the
> cancellation status and also focus the logic on signal handler and
> cancellation syscall code."
> 
> The current code seems focused on trying to avoid sending signal if
> possible, seemingly because of performance concerns.  My gut feeling is
> that cancellation doesn't need to have low latency, but do we have
> sufficient confidence that always sending the signal is alright?
> 
> I believe we can still avoid sending the signal with the new scheme;
> what we'd have to do is let the code for cancellable syscalls mark
> (using atomic ops) whether it's currently executing or not (ie,
> fetch_and_set to true and fetch_and_set to previous value); then the
> pthread_cancel should see whether it needs to send a signal or not.
> That adds a bit more synchronization, but seems possible.

I agree that avoiding sending a signal is still desirable. One reason
that may not have been considered is short reads/writes. If you send a
signal which you don't expect to be acted upon, chances are fairly
good that it's going to cause spurious short reads/writes due to
interrupting a syscall that has already transferred some data. This is
not a conformance problem and not critical, but it still seems
desirable to avoid.

> "8. Adjust 'lowlevellock.h' arch-specific implementations to provide
> cancelable futex calls (used in libpthread code)."
> 
> Only FUTEX_WAIT should be cancellable, I suppose.
> 
> I've been thinking about whether it would be easier for the
> implementation of cancellable pthreads functions to just call a
> noncancellable FUTEX_WAIT and handle EINTR properly (by not retrying the
> wait but calling pthread_testcancel instead so that any pending
> cancellation is acted upon).  However, this seems to save less
> complexity than just doing for FUTEX_WAIT what we do for all the other
> cancellable syscalls, so it's probably not worth it.

This is not viable; it has race conditions, of the exact type which
the futex syscall and the new cancellation system are designed to
avoid.

AFAIK it also only works if the cancellation signal is an interrupting
signal (no SA_RESTART) which is a big problem -- it would introduce
illegal EINTR results into programs not expecting to receive them and
not prepared to handle them.

> I'd add another thing to the list of steps, though: For future-proofing,
> it would be good pthread_setcancelstate and pthread_setcanceltype would
> have an explicit compiler barrier in them.  If at some point we have LTO
> including those functions, we use C11 atomics, and compilers start
> optimizing those more aggressively, then I believe we're at risk that
> the compiler reorders code across the atomics in pthread_setcancelstate;
> this could lead to cancellation handlers seeing unexpected state when
> they are actually run.  C11 has tighter rules for what's allowed in
> signal handlers (IIRC, these are still under discussion, at least in the
> C++ committee) than what POSIX requires, yet the cancellation handler is
> similar to a signal handler.  A compiler barrier should hopefully
> prevent any issues due to that mismatch.  Thoughts?

Since you're only concerned about signal handlers, not access from
other threads, volatile is probably sufficient here.

Rich

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-12 15:32   ` Rich Felker
@ 2014-09-12 16:11     ` Torvald Riegel
  2014-09-12 17:17       ` Rich Felker
  0 siblings, 1 reply; 16+ messages in thread
From: Torvald Riegel @ 2014-09-12 16:11 UTC (permalink / raw)
  To: Rich Felker; +Cc: Adhemerval Zanella, GNU C. Library

On Fri, 2014-09-12 at 11:32 -0400, Rich Felker wrote:
> On Fri, Sep 12, 2014 at 04:44:26PM +0200, Torvald Riegel wrote:
> > On Wed, 2014-09-10 at 18:47 -0300, Adhemerval Zanella wrote:
> > "8. Adjust 'lowlevellock.h' arch-specific implementations to provide
> > cancelable futex calls (used in libpthread code)."
> > 
> > Only FUTEX_WAIT should be cancellable, I suppose.
> > 
> > I've been thinking about whether it would be easier for the
> > implementation of cancellable pthreads functions to just call a
> > noncancellable FUTEX_WAIT and handle EINTR properly (by not retrying the
> > wait but calling pthread_testcancel instead so that any pending
> > cancellation is acted upon).  However, this seems to save less
> > complexity than just doing for FUTEX_WAIT what we do for all the other
> > cancellable syscalls, so it's probably not worth it.
> 
> This is not viable; it has race conditions, of the exact type which
> the futex syscall and the new cancellation system are designed to
> avoid.

The new scheme would be used, but would always set the cancellation as
pending.  So what I thought about would primarily change the way we act
on cancellation, not when it's considered to have happened.  There's no
real side effect in waiting, so futex_wait is a special case.

> AFAIK it also only works if the cancellation signal is an interrupting
> signal (no SA_RESTART) which is a big problem -- it would introduce
> illegal EINTR results into programs not expecting to receive them and
> not prepared to handle them.

That sounds right.

> > I'd add another thing to the list of steps, though: For future-proofing,
> > it would be good pthread_setcancelstate and pthread_setcanceltype would
> > have an explicit compiler barrier in them.  If at some point we have LTO
> > including those functions, we use C11 atomics, and compilers start
> > optimizing those more aggressively, then I believe we're at risk that
> > the compiler reorders code across the atomics in pthread_setcancelstate;
> > this could lead to cancellation handlers seeing unexpected state when
> > they are actually run.  C11 has tighter rules for what's allowed in
> > signal handlers (IIRC, these are still under discussion, at least in the
> > C++ committee) than what POSIX requires, yet the cancellation handler is
> > similar to a signal handler.  A compiler barrier should hopefully
> > prevent any issues due to that mismatch.  Thoughts?
> 
> Since you're only concerned about signal handlers, not access from
> other threads, volatile is probably sufficient here.

Yes, we don't need HW synchronization between the interrupted code and
the cancellation handler.  Requiring programmers to use volatile
whenever they need to communicate from interrupted code to cancellation
handlers (on both sides!) could work in practice, but only if the
pthread_setcancelstate etc. look to the compiler like they could call
code that uses volatiles -- which isn't the case for these functions
(assuming the compiler understands and optimizes atomics).

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-12 16:11     ` Torvald Riegel
@ 2014-09-12 17:17       ` Rich Felker
  2014-09-12 22:44         ` Torvald Riegel
  0 siblings, 1 reply; 16+ messages in thread
From: Rich Felker @ 2014-09-12 17:17 UTC (permalink / raw)
  To: Torvald Riegel; +Cc: Adhemerval Zanella, GNU C. Library

On Fri, Sep 12, 2014 at 06:11:30PM +0200, Torvald Riegel wrote:
> On Fri, 2014-09-12 at 11:32 -0400, Rich Felker wrote:
> > On Fri, Sep 12, 2014 at 04:44:26PM +0200, Torvald Riegel wrote:
> > > On Wed, 2014-09-10 at 18:47 -0300, Adhemerval Zanella wrote:
> > > "8. Adjust 'lowlevellock.h' arch-specific implementations to provide
> > > cancelable futex calls (used in libpthread code)."
> > > 
> > > Only FUTEX_WAIT should be cancellable, I suppose.
> > > 
> > > I've been thinking about whether it would be easier for the
> > > implementation of cancellable pthreads functions to just call a
> > > noncancellable FUTEX_WAIT and handle EINTR properly (by not retrying the
> > > wait but calling pthread_testcancel instead so that any pending
> > > cancellation is acted upon).  However, this seems to save less
> > > complexity than just doing for FUTEX_WAIT what we do for all the other
> > > cancellable syscalls, so it's probably not worth it.
> > 
> > This is not viable; it has race conditions, of the exact type which
> > the futex syscall and the new cancellation system are designed to
> > avoid.
> 
> The new scheme would be used, but would always set the cancellation as
> pending.  So what I thought about would primarily change the way we act
> on cancellation, not when it's considered to have happened.  There's no
> real side effect in waiting, so futex_wait is a special case.

The problem is that you end up waiting unboundedly long, possibly
forever. Consider cancellation of a pthread_cond_wait that will never
be signaled, where the cancellation request arrives in the race
window.

> > > I'd add another thing to the list of steps, though: For future-proofing,
> > > it would be good pthread_setcancelstate and pthread_setcanceltype would
> > > have an explicit compiler barrier in them.  If at some point we have LTO
> > > including those functions, we use C11 atomics, and compilers start
> > > optimizing those more aggressively, then I believe we're at risk that
> > > the compiler reorders code across the atomics in pthread_setcancelstate;
> > > this could lead to cancellation handlers seeing unexpected state when
> > > they are actually run.  C11 has tighter rules for what's allowed in
> > > signal handlers (IIRC, these are still under discussion, at least in the
> > > C++ committee) than what POSIX requires, yet the cancellation handler is
> > > similar to a signal handler.  A compiler barrier should hopefully
> > > prevent any issues due to that mismatch.  Thoughts?
> > 
> > Since you're only concerned about signal handlers, not access from
> > other threads, volatile is probably sufficient here.
> 
> Yes, we don't need HW synchronization between the interrupted code and
> the cancellation handler.  Requiring programmers to use volatile
> whenever they need to communicate from interrupted code to cancellation
> handlers (on both sides!) could work in practice, but only if the
> pthread_setcancelstate etc. look to the compiler like they could call
> code that uses volatiles -- which isn't the case for these functions
> (assuming the compiler understands and optimizes atomics).

Oh, I see -- you're concerned about reordering with respect to code
in the calling application. In that case I think you're right -- you
need a full compiler barrier of some sort.

Rich

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-12 17:17       ` Rich Felker
@ 2014-09-12 22:44         ` Torvald Riegel
  2014-09-13  1:58           ` Rich Felker
  0 siblings, 1 reply; 16+ messages in thread
From: Torvald Riegel @ 2014-09-12 22:44 UTC (permalink / raw)
  To: Rich Felker; +Cc: Adhemerval Zanella, GNU C. Library

On Fri, 2014-09-12 at 13:17 -0400, Rich Felker wrote:
> On Fri, Sep 12, 2014 at 06:11:30PM +0200, Torvald Riegel wrote:
> > On Fri, 2014-09-12 at 11:32 -0400, Rich Felker wrote:
> > > On Fri, Sep 12, 2014 at 04:44:26PM +0200, Torvald Riegel wrote:
> > > > On Wed, 2014-09-10 at 18:47 -0300, Adhemerval Zanella wrote:
> > > > "8. Adjust 'lowlevellock.h' arch-specific implementations to provide
> > > > cancelable futex calls (used in libpthread code)."
> > > > 
> > > > Only FUTEX_WAIT should be cancellable, I suppose.
> > > > 
> > > > I've been thinking about whether it would be easier for the
> > > > implementation of cancellable pthreads functions to just call a
> > > > noncancellable FUTEX_WAIT and handle EINTR properly (by not retrying the
> > > > wait but calling pthread_testcancel instead so that any pending
> > > > cancellation is acted upon).  However, this seems to save less
> > > > complexity than just doing for FUTEX_WAIT what we do for all the other
> > > > cancellable syscalls, so it's probably not worth it.
> > > 
> > > This is not viable; it has race conditions, of the exact type which
> > > the futex syscall and the new cancellation system are designed to
> > > avoid.
> > 
> > The new scheme would be used, but would always set the cancellation as
> > pending.  So what I thought about would primarily change the way we act
> > on cancellation, not when it's considered to have happened.  There's no
> > real side effect in waiting, so futex_wait is a special case.
> 
> The problem is that you end up waiting unboundedly long, possibly
> forever. Consider cancellation of a pthread_cond_wait that will never
> be signaled, where the cancellation request arrives in the race
> window.

What I had in mind would interrupt the syscall like the new scheme
proposes to do, so cancellation would happen like in the normal case.
It would not be just checking a flag before/after a syscall.  What would
be different is changing that we just return the EINTR to the calling
pthread code (e.g., the pthread_cond_wait) and letting it do the
cancelling, instead of setting up a pthread_cond_wait-specific
cancellation handler to do that.

But that probably gives too little benefit compared to having a special
case, so I doubt it's worth it.  Nor worth a long discussion I guess :)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-12 22:44         ` Torvald Riegel
@ 2014-09-13  1:58           ` Rich Felker
  2014-09-14 18:00             ` Torvald Riegel
  0 siblings, 1 reply; 16+ messages in thread
From: Rich Felker @ 2014-09-13  1:58 UTC (permalink / raw)
  To: Torvald Riegel; +Cc: Adhemerval Zanella, GNU C. Library

On Sat, Sep 13, 2014 at 12:44:32AM +0200, Torvald Riegel wrote:
> On Fri, 2014-09-12 at 13:17 -0400, Rich Felker wrote:
> > On Fri, Sep 12, 2014 at 06:11:30PM +0200, Torvald Riegel wrote:
> > > On Fri, 2014-09-12 at 11:32 -0400, Rich Felker wrote:
> > > > On Fri, Sep 12, 2014 at 04:44:26PM +0200, Torvald Riegel wrote:
> > > > > On Wed, 2014-09-10 at 18:47 -0300, Adhemerval Zanella wrote:
> > > > > "8. Adjust 'lowlevellock.h' arch-specific implementations to provide
> > > > > cancelable futex calls (used in libpthread code)."
> > > > > 
> > > > > Only FUTEX_WAIT should be cancellable, I suppose.
> > > > > 
> > > > > I've been thinking about whether it would be easier for the
> > > > > implementation of cancellable pthreads functions to just call a
> > > > > noncancellable FUTEX_WAIT and handle EINTR properly (by not retrying the
> > > > > wait but calling pthread_testcancel instead so that any pending
> > > > > cancellation is acted upon).  However, this seems to save less
> > > > > complexity than just doing for FUTEX_WAIT what we do for all the other
> > > > > cancellable syscalls, so it's probably not worth it.
> > > > 
> > > > This is not viable; it has race conditions, of the exact type which
> > > > the futex syscall and the new cancellation system are designed to
> > > > avoid.
> > > 
> > > The new scheme would be used, but would always set the cancellation as
> > > pending.  So what I thought about would primarily change the way we act
> > > on cancellation, not when it's considered to have happened.  There's no
> > > real side effect in waiting, so futex_wait is a special case.
> > 
> > The problem is that you end up waiting unboundedly long, possibly
> > forever. Consider cancellation of a pthread_cond_wait that will never
> > be signaled, where the cancellation request arrives in the race
> > window.
> 
> What I had in mind would interrupt the syscall like the new scheme
> proposes to do, so cancellation would happen like in the normal case.
> It would not be just checking a flag before/after a syscall.  What would
> be different is changing that we just return the EINTR to the calling
> pthread code (e.g., the pthread_cond_wait) and letting it do the
> cancelling, instead of setting up a pthread_cond_wait-specific
> cancellation handler to do that.
> 
> But that probably gives too little benefit compared to having a special
> case, so I doubt it's worth it.  Nor worth a long discussion I guess :)

Actually, cancellation of pthread_cond_[timed]wait is complicated.
Depending on how unblocking a waiter works, it's possible that the
thread being cancelled has already "consumed the signal", and
therefore can't act on cancellation. This is a case where the program
counter at cancellation signal time is not sufficient to determine if
cancellation can be acted upon; the decision needs to be made later
based on userspace criteria (cond var state), not based on the
completion or non-completion of the futex syscall.

So, your idea about wanting futex to return even when cancelled is
basically right. However, EINTR cannot be the mechanism for this since
the signal has to be non-interrupting (like basically everything else,
futex restarts automatically for SA_RESTART signal handlers; you never
see EINTR). Even if you could get EINTR, there would still be a race
condition where you would fail to ever wakeup.

Addressing this in my implementation in musl is pending. I have a
patch that solves it via longjmp out of the cancellation handler, but
this is ugly and costly in the common case where cancellation never
happens. A nicer design I'm working on is having an alternate
cancellation mode where, rather than acting on cancellation when it's
triggered, the handler will disable cancellation then adjust the saved
program counter so that, on return from the signal handler, the
syscall appears to have failed with ECANCELED. The caller can then
decide what to do: just reenabling cancellation leaves it pending, or
it can be call pthread_testcancel() after reenabling and restoring the
cancellation mode in order to act on it.

Rich

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-13  1:58           ` Rich Felker
@ 2014-09-14 18:00             ` Torvald Riegel
  2014-09-15  0:46               ` Rich Felker
  0 siblings, 1 reply; 16+ messages in thread
From: Torvald Riegel @ 2014-09-14 18:00 UTC (permalink / raw)
  To: Rich Felker; +Cc: Adhemerval Zanella, GNU C. Library

On Fri, 2014-09-12 at 21:58 -0400, Rich Felker wrote:
> On Sat, Sep 13, 2014 at 12:44:32AM +0200, Torvald Riegel wrote:
> > On Fri, 2014-09-12 at 13:17 -0400, Rich Felker wrote:
> > > On Fri, Sep 12, 2014 at 06:11:30PM +0200, Torvald Riegel wrote:
> > > > On Fri, 2014-09-12 at 11:32 -0400, Rich Felker wrote:
> > > > > On Fri, Sep 12, 2014 at 04:44:26PM +0200, Torvald Riegel wrote:
> > > > > > On Wed, 2014-09-10 at 18:47 -0300, Adhemerval Zanella wrote:
> > > > > > "8. Adjust 'lowlevellock.h' arch-specific implementations to provide
> > > > > > cancelable futex calls (used in libpthread code)."
> > > > > > 
> > > > > > Only FUTEX_WAIT should be cancellable, I suppose.
> > > > > > 
> > > > > > I've been thinking about whether it would be easier for the
> > > > > > implementation of cancellable pthreads functions to just call a
> > > > > > noncancellable FUTEX_WAIT and handle EINTR properly (by not retrying the
> > > > > > wait but calling pthread_testcancel instead so that any pending
> > > > > > cancellation is acted upon).  However, this seems to save less
> > > > > > complexity than just doing for FUTEX_WAIT what we do for all the other
> > > > > > cancellable syscalls, so it's probably not worth it.
> > > > > 
> > > > > This is not viable; it has race conditions, of the exact type which
> > > > > the futex syscall and the new cancellation system are designed to
> > > > > avoid.
> > > > 
> > > > The new scheme would be used, but would always set the cancellation as
> > > > pending.  So what I thought about would primarily change the way we act
> > > > on cancellation, not when it's considered to have happened.  There's no
> > > > real side effect in waiting, so futex_wait is a special case.
> > > 
> > > The problem is that you end up waiting unboundedly long, possibly
> > > forever. Consider cancellation of a pthread_cond_wait that will never
> > > be signaled, where the cancellation request arrives in the race
> > > window.
> > 
> > What I had in mind would interrupt the syscall like the new scheme
> > proposes to do, so cancellation would happen like in the normal case.
> > It would not be just checking a flag before/after a syscall.  What would
> > be different is changing that we just return the EINTR to the calling
> > pthread code (e.g., the pthread_cond_wait) and letting it do the
> > cancelling, instead of setting up a pthread_cond_wait-specific
> > cancellation handler to do that.
> > 
> > But that probably gives too little benefit compared to having a special
> > case, so I doubt it's worth it.  Nor worth a long discussion I guess :)
> 
> Actually, cancellation of pthread_cond_[timed]wait is complicated.
> Depending on how unblocking a waiter works, it's possible that the
> thread being cancelled has already "consumed the signal", and
> therefore can't act on cancellation. This is a case where the program
> counter at cancellation signal time is not sufficient to determine if
> cancellation can be acted upon; the decision needs to be made later
> based on userspace criteria (cond var state), not based on the
> completion or non-completion of the futex syscall.

If something that got cancelled has consumed a signal already, then this
isn't visible to other threads yet except that they don't wake up.  Have
you considered sending another signal (which is indistinguishable from
the one consumed by the cancelled thread) to undo the consumption?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-14 18:00             ` Torvald Riegel
@ 2014-09-15  0:46               ` Rich Felker
  0 siblings, 0 replies; 16+ messages in thread
From: Rich Felker @ 2014-09-15  0:46 UTC (permalink / raw)
  To: Torvald Riegel; +Cc: Adhemerval Zanella, GNU C. Library

On Sun, Sep 14, 2014 at 08:00:49PM +0200, Torvald Riegel wrote:
> > Actually, cancellation of pthread_cond_[timed]wait is complicated.
> > Depending on how unblocking a waiter works, it's possible that the
> > thread being cancelled has already "consumed the signal", and
> > therefore can't act on cancellation. This is a case where the program
> > counter at cancellation signal time is not sufficient to determine if
> > cancellation can be acted upon; the decision needs to be made later
> > based on userspace criteria (cond var state), not based on the
> > completion or non-completion of the futex syscall.
> 
> If something that got cancelled has consumed a signal already, then this
> isn't visible to other threads yet except that they don't wake up.  Have
> you considered sending another signal (which is indistinguishable from
> the one consumed by the cancelled thread) to undo the consumption?

It's not that simple, because it's hard to guarantee waking a waiter
from the right set. The signal that was consumed by the cancelled
thread must wake a waiter which was already a waiter at the time of
that signal, not another waiter that arrives later. There may be other
difficulties too.

Rich

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-12 14:44 ` Torvald Riegel
  2014-09-12 15:32   ` Rich Felker
@ 2014-09-12 17:33   ` Joseph S. Myers
  2014-09-14 18:04     ` Torvald Riegel
  2014-09-17 22:13   ` Adhemerval Zanella
  2 siblings, 1 reply; 16+ messages in thread
From: Joseph S. Myers @ 2014-09-12 17:33 UTC (permalink / raw)
  To: Torvald Riegel; +Cc: Adhemerval Zanella, GNU C. Library, Rich Felker

On Fri, 12 Sep 2014, Torvald Riegel wrote:

> I believe we can still avoid sending the signal with the new scheme;
> what we'd have to do is let the code for cancellable syscalls mark
> (using atomic ops) whether it's currently executing or not (ie,
> fetch_and_set to true and fetch_and_set to previous value); then the
> pthread_cancel should see whether it needs to send a signal or not.
> That adds a bit more synchronization, but seems possible.

Would this be calling C functions that do those atomic operations (in 
similar places to where there are currently calls to 
__pthread_enable_asynccancel etc.)?  That would seem better than embedding 
the atomic operation code directly in the macros that generate .S syscall 
stubs, at least for architectures where there are multiple variants for 
how atomic operations are implemented depending on things such as the 
architecture version.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-12 17:33   ` Joseph S. Myers
@ 2014-09-14 18:04     ` Torvald Riegel
  0 siblings, 0 replies; 16+ messages in thread
From: Torvald Riegel @ 2014-09-14 18:04 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Adhemerval Zanella, GNU C. Library, Rich Felker

On Fri, 2014-09-12 at 17:33 +0000, Joseph S. Myers wrote:
> On Fri, 12 Sep 2014, Torvald Riegel wrote:
> 
> > I believe we can still avoid sending the signal with the new scheme;
> > what we'd have to do is let the code for cancellable syscalls mark
> > (using atomic ops) whether it's currently executing or not (ie,
> > fetch_and_set to true and fetch_and_set to previous value); then the
> > pthread_cancel should see whether it needs to send a signal or not.
> > That adds a bit more synchronization, but seems possible.
> 
> Would this be calling C functions that do those atomic operations (in 
> similar places to where there are currently calls to 
> __pthread_enable_asynccancel etc.)?  That would seem better than embedding 
> the atomic operation code directly in the macros that generate .S syscall 
> stubs, at least for architectures where there are multiple variants for 
> how atomic operations are implemented depending on things such as the 
> architecture version.

I haven't thought it through, but this should be possible I guess.  What
we want is to have an optimization "around" the new scheme, where both
the cancellation target and the cancellation initiator agree that they
got canceled.  The agreement would be a CAS or atomic_exchange, and we
might have to use pthread_testcancel when we find out we were canceled.
That would mean we have this check twice (IIRC, the new scheme's asm
code has it at the beginning of the region), but still might be better.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-12 14:44 ` Torvald Riegel
  2014-09-12 15:32   ` Rich Felker
  2014-09-12 17:33   ` Joseph S. Myers
@ 2014-09-17 22:13   ` Adhemerval Zanella
  2 siblings, 0 replies; 16+ messages in thread
From: Adhemerval Zanella @ 2014-09-17 22:13 UTC (permalink / raw)
  To: libc-alpha

On 12-09-2014 11:44, Torvald Riegel wrote:
> On Wed, 2014-09-10 at 18:47 -0300, Adhemerval Zanella wrote:
>> I have summarized in [1] the current issues with GLIBC pthread cancellation system,
>> the current GLIBC implementation and the proposed solution by Rich Felker with the
>> adjustment required to enabled it on GLIBC.
>>
>> It is still heavily WIP and I'm still planning to add more content, so any question,
>> comments, advices are welcomed.
>>
>> The GLIBC adjustment to proposed solution is in fact the current work I'm doing to
>> rewrite pthread cancellation subsystem [2]. My code still needs a *lot* of cleanup,
>> but initial results are promising. It is building on both powerpc64 and x86_64 
>> (it won't build on others platforms basically because I rewrite the way cancelable
>> syscalls are done).
> The general direction seems good to me.  Thanks for working on this.  I
> have some questions/comments about the steps you outline in the "GLIBC
> adjustment" section on the wiki page:
>
> "4. Adjust nptl/pthread_cancel.c to send an signal instead of acting
> directly. This avoid synchronization issues about updating the
> cancellation status and also focus the logic on signal handler and
> cancellation syscall code."
>
> The current code seems focused on trying to avoid sending signal if
> possible, seemingly because of performance concerns.  My gut feeling is
> that cancellation doesn't need to have low latency, but do we have
> sufficient confidence that always sending the signal is alright?
>
> I believe we can still avoid sending the signal with the new scheme;
> what we'd have to do is let the code for cancellable syscalls mark
> (using atomic ops) whether it's currently executing or not (ie,
> fetch_and_set to true and fetch_and_set to previous value); then the
> pthread_cancel should see whether it needs to send a signal or not.
> That adds a bit more synchronization, but seems possible.

This solution seems feasible, but I would like to address it, if possible, on a next
patch iterations.  Right now, I would like to focus on implementation correctness
for such solution.  I will add on the wiki the suggestion. 

>
>
> "8. Adjust 'lowlevellock.h' arch-specific implementations to provide
> cancelable futex calls (used in libpthread code)."
>
> Only FUTEX_WAIT should be cancellable, I suppose.

Yes, I had to add cancel wrappers just for lll_futex_* that do FUTEX_WAIT.

>
> I've been thinking about whether it would be easier for the
> implementation of cancellable pthreads functions to just call a
> noncancellable FUTEX_WAIT and handle EINTR properly (by not retrying the
> wait but calling pthread_testcancel instead so that any pending
> cancellation is acted upon).  However, this seems to save less
> complexity than just doing for FUTEX_WAIT what we do for all the other
> cancellable syscalls, so it's probably not worth it.
>
>
> I'd add another thing to the list of steps, though: For future-proofing,
> it would be good pthread_setcancelstate and pthread_setcanceltype would
> have an explicit compiler barrier in them.  If at some point we have LTO
> including those functions, we use C11 atomics, and compilers start
> optimizing those more aggressively, then I believe we're at risk that
> the compiler reorders code across the atomics in pthread_setcancelstate;
> this could lead to cancellation handlers seeing unexpected state when
> they are actually run.  C11 has tighter rules for what's allowed in
> signal handlers (IIRC, these are still under discussion, at least in the
> C++ committee) than what POSIX requires, yet the cancellation handler is
> similar to a signal handler.  A compiler barrier should hopefully
> prevent any issues due to that mismatch.  Thoughts?
>
I will add this on the future iterations for this patch, but since it is not really
and issue right now, I prefer to work after this fix is corrected.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-10 21:48 [RFC] Propose fix for race conditions in pthread cancellation (bz#12683) Adhemerval Zanella
  2014-09-10 22:00 ` Rich Felker
  2014-09-12 14:44 ` Torvald Riegel
@ 2014-09-15 12:49 ` Florian Weimer
  2014-09-15 14:39   ` Torvald Riegel
  2 siblings, 1 reply; 16+ messages in thread
From: Florian Weimer @ 2014-09-15 12:49 UTC (permalink / raw)
  To: Adhemerval Zanella, GNU C. Library, Rich Felker

On 09/10/2014 11:47 PM, Adhemerval Zanella wrote:
> Anyway, now I would like comments about proposed solution and if the cases for
> new failures should not be allowed or if testcases now should be adjusted.

Will it be possible to use this mechanism to make selected lock-free 
algorithms inside glibc async-signal-safe?  I think rewinding to a 
previous address will be sufficient in many cases, but a more general 
approach which completes execution of a critical section in the signal 
handler could be desirable.

-- 
Florian Weimer / Red Hat Product Security

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC] Propose fix for race conditions in pthread cancellation (bz#12683)
  2014-09-15 12:49 ` Florian Weimer
@ 2014-09-15 14:39   ` Torvald Riegel
  0 siblings, 0 replies; 16+ messages in thread
From: Torvald Riegel @ 2014-09-15 14:39 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Adhemerval Zanella, GNU C. Library, Rich Felker

On Mon, 2014-09-15 at 14:49 +0200, Florian Weimer wrote:
> On 09/10/2014 11:47 PM, Adhemerval Zanella wrote:
> > Anyway, now I would like comments about proposed solution and if the cases for
> > new failures should not be allowed or if testcases now should be adjusted.
> 
> Will it be possible to use this mechanism to make selected lock-free 
> algorithms inside glibc async-signal-safe?

Not without additional care, I suppose.  Lock freedom allows the other
threads to make progress, but the thread itself can still do blocking
operations as long as it doesn't communicate with other threads in the
meantime; so, for example, there could be other random blocking stuff
embedded in a lock-free algorithm.

> I think rewinding to a 
> previous address will be sufficient in many cases, but a more general 
> approach which completes execution of a critical section in the signal 
> handler could be desirable.

That would require more care than just having a lock-free algorithm as
far as communication with other threads is concerned, because the
interruptible code than *also* needs to be lock-free wrt. the
continuation / fix-up code in the signal handler.  While this may not
need HW synchronization, the compiler still needs to be aware of this.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2014-09-17 22:13 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-10 21:48 [RFC] Propose fix for race conditions in pthread cancellation (bz#12683) Adhemerval Zanella
2014-09-10 22:00 ` Rich Felker
2014-09-10 22:11   ` Adhemerval Zanella
2014-09-12 14:44 ` Torvald Riegel
2014-09-12 15:32   ` Rich Felker
2014-09-12 16:11     ` Torvald Riegel
2014-09-12 17:17       ` Rich Felker
2014-09-12 22:44         ` Torvald Riegel
2014-09-13  1:58           ` Rich Felker
2014-09-14 18:00             ` Torvald Riegel
2014-09-15  0:46               ` Rich Felker
2014-09-12 17:33   ` Joseph S. Myers
2014-09-14 18:04     ` Torvald Riegel
2014-09-17 22:13   ` Adhemerval Zanella
2014-09-15 12:49 ` Florian Weimer
2014-09-15 14:39   ` Torvald Riegel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).