* Futex error handling @ 2014-09-16 15:36 Torvald Riegel 2014-09-16 16:56 ` Rich Felker ` (2 more replies) 0 siblings, 3 replies; 35+ messages in thread From: Torvald Riegel @ 2014-09-16 15:36 UTC (permalink / raw) To: GLIBC Devel; +Cc: Darren Hart We got complains from the kernel side that glibc wouldn't react properly to futex errors being returned. Thus, I'm looking at what we'd need to actually improve. I'm using this here as a documentation for futex error codes: https://lkml.org/lkml/2014/5/15/356 Generally, we have three categories of faults (ie, the cause for an error/failure): * Bug in glibc ("BL") * Bug in the client program ("BP") * Failures that are neither a bug in glibc nor the program ("F") Also, there are cases where it's not a "real" failure, but just something that is expected behavior that needs to be handled ("NF"). I'm not aware of a general policy about whether glibc should abort or assert (ie, abort only with assertion checks enabled) when the fault is in the BL or BP categories. I'd say we don't, because there's no way to handle it anyway, and other things will likely go wrong; but I don't have a strong opinion. Thoughts? For every futex op, here's a list of how I'd categorize the possible error codes (I'm ignoring ENOSYS, which is NF when feature testing (or BL)): FUTEX_WAIT: * EFAULT is either BL or BP. Nothing we can do. Should have failed earlier when we accessed the futex variable. * EINVAL (alignment and timeout normalization) is BL/BP. * EWOULDBLOCK, ETIMEDOUT are NF. FUTEX_WAKE, FUTEX_WAKE_OP: * EFAULT can be BL/BP *or* NF, so we *must not* abort or assert in this case. This is due to how futexes work when combined with certain rules for destruction of the underlying synchronization data structure; see my description of the mutex destruction issue (but this can happen with other data structures such as semaphores or cond vars too): https://sourceware.org/ml/libc-alpha/2014-04/msg00075.html * EINVAL (futex alignment) is BL/BP. * EINVAL (inconsistent state or hit a PI futex) can be either BL/BP *or* NF. The latter is caused by the mutex destruction issue, only that a pending FUTEX_WAKE after destruction doesn't hit an inaccessible memory location but one which has been reused for a PI futex. Thus, we must not abort or assert in this case. FUTEX_REQUEUE: * Like FUTEX_WAKE, except that it's not safe to use concurrently with possible destruction / reuse of the futex memory (because requeueing to a futex that's unrelated to the new futex located in reused memory is bad). FUTEX_REQUEUE_CMP: * Like FUTEX_REQUEUE. EAGAIN is NF. FUTEX_WAKE_OP: * Haven't looked at this yet. Only used in condvars, and might not be necessary for a condvar that's not based on a condvar-internal lock. FUTEX_WAIT_BITSET / FUTEX_WAKE_BITSET: * Like FUTEX_WAIT / FUTEX_WAKE. The additional EINVAL is BL. FUTEX_LOCK_PI, FUTEX_TRYLOCK_PI: * EFAULT is BL/BP. * ENOMEM is F. We need to handle this. * EINVAL, EPERM, ESRCH are BL/BP. * EAGAIN and ETIMEDOUT are NF. * EDEADLOCK is BP (or BL). * EOWNERDIED is F. FUTEX_UNLOCK_PI: (* I guess this can return EFAULT too, which is BL/BP.) * EINVAL and EPERM are BL/BP. I don't think there's a mutex destruction issue with PI locks because the kernel takes care of both resetting the value of the futex var and waking up threads; it should do so in a way that won't access reused memory. I guess we should check that though... FUTEX_WAIT_REQUEUE_PI: * EFAULT and EINVAL are BL/BP. * EWOULDBLOCK and ETIMEDOUT are NF. * EOWNERDIED is F. FUTEX_CMP_REQUEUE_PI is like FUTEX_CMP_REQUEUE except: * ENOMEM is F. * EPERM and ESRCH are BL/BP. * EDEADLOCK is BP (or BL). I think the next steps to improve this should be: 1) Getting consensus on how we want to handle BL and BP in general. 2) Applying the outcome of that to the list above and getting consensus on the result. 3) For each case of F, find the best way to report it to the caller (e.g., error code from the pthreads function, abort, ...). 4) Change each use of the futexes accordingly, one at a time. I've asked Michael Kerrisk for the state of the futex error docs, but haven't gotten a reply yet. (Last time I checked, the new input from the email I referred to above wasn't part of the futex docs yet.) Thoughts? ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-09-16 15:36 Futex error handling Torvald Riegel @ 2014-09-16 16:56 ` Rich Felker 2014-09-16 18:13 ` Torvald Riegel 2014-09-17 19:41 ` Roland McGrath 2014-10-20 16:15 ` Futex error handling Torvald Riegel 2 siblings, 1 reply; 35+ messages in thread From: Rich Felker @ 2014-09-16 16:56 UTC (permalink / raw) To: Torvald Riegel; +Cc: GLIBC Devel, Darren Hart On Tue, Sep 16, 2014 at 05:36:25PM +0200, Torvald Riegel wrote: > We got complains from the kernel side that glibc wouldn't react properly > to futex errors being returned. Are these complaints public anywhere? > Thus, I'm looking at what we'd need to > actually improve. I'm using this here as a documentation for futex > error codes: https://lkml.org/lkml/2014/5/15/356 > > Generally, we have three categories of faults (ie, the cause for an > error/failure): > * Bug in glibc ("BL") > * Bug in the client program ("BP") > * Failures that are neither a bug in glibc nor the program ("F") > > Also, there are cases where it's not a "real" failure, but just > something that is expected behavior that needs to be handled ("NF"). > > I'm not aware of a general policy about whether glibc should abort or > assert (ie, abort only with assertion checks enabled) when the fault is > in the BL or BP categories. I'd say we don't, because there's no way to > handle it anyway, and other things will likely go wrong; but I don't > have a strong opinion. Thoughts? > > For every futex op, here's a list of how I'd categorize the possible > error codes (I'm ignoring ENOSYS, which is NF when feature testing (or > BL)): > > FUTEX_WAIT: > * EFAULT is either BL or BP. Nothing we can do. Should have failed > earlier when we accessed the futex variable. > * EINVAL (alignment and timeout normalization) is BL/BP. > * EWOULDBLOCK, ETIMEDOUT are NF. I would distingish multiple versions of "BP" for EINVAL here. You seem to have mixed "program has invoked undefined behavior" (e.g. invalid synchronization object) with "program has provided an erroneous argument which the implementation is required to report" (e.g. invalid timespec contents). If you don't want to add a new class, the latter technically could just be considered NF; it's fully equivalent to NF in terms of how it has to be handled. > FUTEX_WAKE, FUTEX_WAKE_OP: > * EFAULT can be BL/BP *or* NF, so we *must not* abort or assert in this > case. This is due to how futexes work when combined with certain rules > for destruction of the underlying synchronization data structure; see my > description of the mutex destruction issue (but this can happen with > other data structures such as semaphores or cond vars too): > https://sourceware.org/ml/libc-alpha/2014-04/msg00075.html Note that it's possible to use FUTEX_WAKE_OP in such a way that EFAULT is reserved for BL/BP (and not NF). I don't see any point in having/using FUTEX_WAKE_OP except for this purpose, but maybe I'm missing something. > * EINVAL (futex alignment) is BL/BP. > * EINVAL (inconsistent state or hit a PI futex) can be either BL/BP *or* > NF. The latter is caused by the mutex destruction issue, only that a > pending FUTEX_WAKE after destruction doesn't hit an inaccessible memory > location but one which has been reused for a PI futex. Thus, we must > not abort or assert in this case. Nice catch. I wasn't aware of this one (the PI causing EINVAL). > FUTEX_REQUEUE: > * Like FUTEX_WAKE, except that it's not safe to use concurrently with > possible destruction / reuse of the futex memory (because requeueing to > a futex that's unrelated to the new futex located in reused memory is > bad). I think all cases where FUTEX_REQUEUE remotely makes sense as an operation inherently preclude destruction of the target (for example, the mutex that cond var waiters are going to try to relock can't be deleted until they've all relocked it and returned). > I think the next steps to improve this should be: > 1) Getting consensus on how we want to handle BL and BP in general. > 2) Applying the outcome of that to the list above and getting consensus > on the result. > 3) For each case of F, find the best way to report it to the caller > (e.g., error code from the pthreads function, abort, ...). > 4) Change each use of the futexes accordingly, one at a time. Sounds reasonable to me. > I've asked Michael Kerrisk for the state of the futex error docs, but > haven't gotten a reply yet. (Last time I checked, the new input from > the email I referred to above wasn't part of the futex docs yet.) Yes, documentation on futex behavior would be very nice to have... Rich ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-09-16 16:56 ` Rich Felker @ 2014-09-16 18:13 ` Torvald Riegel 2014-09-16 18:55 ` Rich Felker 0 siblings, 1 reply; 35+ messages in thread From: Torvald Riegel @ 2014-09-16 18:13 UTC (permalink / raw) To: Rich Felker; +Cc: GLIBC Devel, Darren Hart On Tue, 2014-09-16 at 12:56 -0400, Rich Felker wrote: > On Tue, Sep 16, 2014 at 05:36:25PM +0200, Torvald Riegel wrote: > > We got complains from the kernel side that glibc wouldn't react properly > > to futex errors being returned. > > Are these complaints public anywhere? I don't think so. Anyway, this point isn't really relevant to the discussion -- just my personal motivation, kind of :) > > Thus, I'm looking at what we'd need to > > actually improve. I'm using this here as a documentation for futex > > error codes: https://lkml.org/lkml/2014/5/15/356 > > > > Generally, we have three categories of faults (ie, the cause for an > > error/failure): > > * Bug in glibc ("BL") > > * Bug in the client program ("BP") > > * Failures that are neither a bug in glibc nor the program ("F") > > > > Also, there are cases where it's not a "real" failure, but just > > something that is expected behavior that needs to be handled ("NF"). > > > > I'm not aware of a general policy about whether glibc should abort or > > assert (ie, abort only with assertion checks enabled) when the fault is > > in the BL or BP categories. I'd say we don't, because there's no way to > > handle it anyway, and other things will likely go wrong; but I don't > > have a strong opinion. Thoughts? > > > > For every futex op, here's a list of how I'd categorize the possible > > error codes (I'm ignoring ENOSYS, which is NF when feature testing (or > > BL)): > > > > FUTEX_WAIT: > > * EFAULT is either BL or BP. Nothing we can do. Should have failed > > earlier when we accessed the futex variable. > > * EINVAL (alignment and timeout normalization) is BL/BP. > > * EWOULDBLOCK, ETIMEDOUT are NF. > > I would distingish multiple versions of "BP" for EINVAL here. You seem > to have mixed "program has invoked undefined behavior" (e.g. invalid > synchronization object) with "program has provided an erroneous > argument which the implementation is required to report" (e.g. invalid > timespec contents). If you don't want to add a new class, the latter > technically could just be considered NF; it's fully equivalent to NF > in terms of how it has to be handled. Good point on timespec. > > FUTEX_WAKE, FUTEX_WAKE_OP: > > * EFAULT can be BL/BP *or* NF, so we *must not* abort or assert in this > > case. This is due to how futexes work when combined with certain rules > > for destruction of the underlying synchronization data structure; see my > > description of the mutex destruction issue (but this can happen with > > other data structures such as semaphores or cond vars too): > > https://sourceware.org/ml/libc-alpha/2014-04/msg00075.html > > Note that it's possible to use FUTEX_WAKE_OP in such a way that EFAULT > is reserved for BL/BP (and not NF). I don't see any point in > having/using FUTEX_WAKE_OP except for this purpose, but maybe I'm > missing something. I agree that I was a bit sloppy in the categorization. You're right that depending on how it's used, EFAULT can be just BL/BP. This applies to both FUTEX_WAKE and FUTEX_WAKE_OP, I think; the latter has just a finite number of bits, so you can't avoid an ABA issue entirely. You can use the latter like FUTEX_UNLOCK_PI though, to try to avoid the mutex destruction issue. So, to summarize, my categories kind of assume a "typical" use of those operations in glibc. What I was trying to point out is that we can't abort in the generic futex syscall code when we see EFAULT, because that's wrong for typical uses of FUTEX_WAKE. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-09-16 18:13 ` Torvald Riegel @ 2014-09-16 18:55 ` Rich Felker 2014-09-18 12:27 ` Torvald Riegel 0 siblings, 1 reply; 35+ messages in thread From: Rich Felker @ 2014-09-16 18:55 UTC (permalink / raw) To: Torvald Riegel; +Cc: GLIBC Devel, Darren Hart On Tue, Sep 16, 2014 at 08:12:38PM +0200, Torvald Riegel wrote: > > > FUTEX_WAKE, FUTEX_WAKE_OP: > > > * EFAULT can be BL/BP *or* NF, so we *must not* abort or assert in this > > > case. This is due to how futexes work when combined with certain rules > > > for destruction of the underlying synchronization data structure; see my > > > description of the mutex destruction issue (but this can happen with > > > other data structures such as semaphores or cond vars too): > > > https://sourceware.org/ml/libc-alpha/2014-04/msg00075.html > > > > Note that it's possible to use FUTEX_WAKE_OP in such a way that EFAULT > > is reserved for BL/BP (and not NF). I don't see any point in > > having/using FUTEX_WAKE_OP except for this purpose, but maybe I'm > > missing something. > > I agree that I was a bit sloppy in the categorization. You're right > that depending on how it's used, EFAULT can be just BL/BP. This applies > to both FUTEX_WAKE and FUTEX_WAKE_OP, I think; the latter has just a > finite number of bits, so you can't avoid an ABA issue entirely. You I'm not sure what ABA issue you have in mind. The EFAULT case with FUTEX_WAKE, and which I claim FUTEX_WAKE_OP avoids, is when the atomic operation on the futex int that's associated with the wake allows another thread to synchronize and determine that it may legally destroy the object before the actual wake is sent. FUTEX_WAKE_OP can fully avoid this by performing the atomic operation after looking up and locking the futex hash bucket, so that there's no further access after the atomic and thus no opportunity for fault. > So, to summarize, my categories kind of assume a "typical" use of those > operations in glibc. What I was trying to point out is that we can't > abort in the generic futex syscall code when we see EFAULT, because > that's wrong for typical uses of FUTEX_WAKE. Yes, I agree with this. I'm not clear yet on whether it would be an advantage to use FUTEX_WAKE_OP to avoid this, but I think it's plausible that it might be. Rich ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-09-16 18:55 ` Rich Felker @ 2014-09-18 12:27 ` Torvald Riegel 2014-09-18 17:40 ` Rich Felker 0 siblings, 1 reply; 35+ messages in thread From: Torvald Riegel @ 2014-09-18 12:27 UTC (permalink / raw) To: Rich Felker; +Cc: GLIBC Devel, Darren Hart On Tue, 2014-09-16 at 14:54 -0400, Rich Felker wrote: > On Tue, Sep 16, 2014 at 08:12:38PM +0200, Torvald Riegel wrote: > > > > FUTEX_WAKE, FUTEX_WAKE_OP: > > > > * EFAULT can be BL/BP *or* NF, so we *must not* abort or assert in this > > > > case. This is due to how futexes work when combined with certain rules > > > > for destruction of the underlying synchronization data structure; see my > > > > description of the mutex destruction issue (but this can happen with > > > > other data structures such as semaphores or cond vars too): > > > > https://sourceware.org/ml/libc-alpha/2014-04/msg00075.html > > > > > > Note that it's possible to use FUTEX_WAKE_OP in such a way that EFAULT > > > is reserved for BL/BP (and not NF). I don't see any point in > > > having/using FUTEX_WAKE_OP except for this purpose, but maybe I'm > > > missing something. > > > > I agree that I was a bit sloppy in the categorization. You're right > > that depending on how it's used, EFAULT can be just BL/BP. This applies > > to both FUTEX_WAKE and FUTEX_WAKE_OP, I think; the latter has just a > > finite number of bits, so you can't avoid an ABA issue entirely. You > > I'm not sure what ABA issue you have in mind. Sorry, I must have been thinking about the _BITSET variants when I wrote this. Not sure why... :) > The EFAULT case with > FUTEX_WAKE, and which I claim FUTEX_WAKE_OP avoids, is when the atomic > operation on the futex int that's associated with the wake allows > another thread to synchronize and determine that it may legally > destroy the object before the actual wake is sent. FUTEX_WAKE_OP can > fully avoid this by performing the atomic operation after looking up > and locking the futex hash bucket, so that there's no further access > after the atomic and thus no opportunity for fault. Agreed; that like what UNLOCK_PI does. However, and that's something I've only thought about recently, it would be good to know which guarantees the kernel gives in this case; in particular, what happens (and which error code results) if there is destruction and potential unmapping etc. of the futex variable concurrently with WAKE_OP or UNLOCK_PI being in flight. > > So, to summarize, my categories kind of assume a "typical" use of those > > operations in glibc. What I was trying to point out is that we can't > > abort in the generic futex syscall code when we see EFAULT, because > > that's wrong for typical uses of FUTEX_WAKE. > > Yes, I agree with this. I'm not clear yet on whether it would be an > advantage to use FUTEX_WAKE_OP to avoid this, but I think it's > plausible that it might be. It should work to avoid the issue, but then wake-up latency will be higher (because the kernel has to do it). Not a good trade-off to make, I think. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-09-18 12:27 ` Torvald Riegel @ 2014-09-18 17:40 ` Rich Felker 2014-09-18 17:58 ` Torvald Riegel 0 siblings, 1 reply; 35+ messages in thread From: Rich Felker @ 2014-09-18 17:40 UTC (permalink / raw) To: Torvald Riegel; +Cc: GLIBC Devel, Darren Hart On Thu, Sep 18, 2014 at 02:26:35PM +0200, Torvald Riegel wrote: > > The EFAULT case with > > FUTEX_WAKE, and which I claim FUTEX_WAKE_OP avoids, is when the atomic > > operation on the futex int that's associated with the wake allows > > another thread to synchronize and determine that it may legally > > destroy the object before the actual wake is sent. FUTEX_WAKE_OP can > > fully avoid this by performing the atomic operation after looking up > > and locking the futex hash bucket, so that there's no further access > > after the atomic and thus no opportunity for fault. > > Agreed; that like what UNLOCK_PI does. However, and that's something > I've only thought about recently, it would be good to know which > guarantees the kernel gives in this case; in particular, what happens > (and which error code results) if there is destruction and potential > unmapping etc. of the futex variable concurrently with WAKE_OP or > UNLOCK_PI being in flight. I've RTFS'd and my understanding is that no such problems are possible. The futex hashing (note: there are two futex address arguments and both are hashed, even if they're equal; this should be optimized on the kernel side to make FUTEX_WAKE_OP practical) and locking of the resulting hash buckets happens before the atomic operation is performed. After the atomic operation, the bucket is walked and matching waiters are woken. In theory it's possible that, as soon as the atomic operation is performed, the backing (file/anon/whatever) is destroyed and its underlying id (e.g. inode number) is reused, so that the backing identified for the original futex address has been reused by this time. However, it's not a problem because a new waiter can't arrive while the hash bucket is still locked -- so such a new waiter can't be woken. Rich ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-09-18 17:40 ` Rich Felker @ 2014-09-18 17:58 ` Torvald Riegel 2014-09-18 20:41 ` Rich Felker 0 siblings, 1 reply; 35+ messages in thread From: Torvald Riegel @ 2014-09-18 17:58 UTC (permalink / raw) To: Rich Felker; +Cc: GLIBC Devel, Darren Hart, Michael Kerrisk On Thu, 2014-09-18 at 13:39 -0400, Rich Felker wrote: > On Thu, Sep 18, 2014 at 02:26:35PM +0200, Torvald Riegel wrote: > > > The EFAULT case with > > > FUTEX_WAKE, and which I claim FUTEX_WAKE_OP avoids, is when the atomic > > > operation on the futex int that's associated with the wake allows > > > another thread to synchronize and determine that it may legally > > > destroy the object before the actual wake is sent. FUTEX_WAKE_OP can > > > fully avoid this by performing the atomic operation after looking up > > > and locking the futex hash bucket, so that there's no further access > > > after the atomic and thus no opportunity for fault. > > > > Agreed; that like what UNLOCK_PI does. However, and that's something > > I've only thought about recently, it would be good to know which > > guarantees the kernel gives in this case; in particular, what happens > > (and which error code results) if there is destruction and potential > > unmapping etc. of the futex variable concurrently with WAKE_OP or > > UNLOCK_PI being in flight. > > I've RTFS'd and my understanding is that no such problems are > possible. The futex hashing (note: there are two futex address > arguments and both are hashed, even if they're equal; this should be > optimized on the kernel side to make FUTEX_WAKE_OP practical) and > locking of the resulting hash buckets happens before the atomic > operation is performed. After the atomic operation, the bucket is > walked and matching waiters are woken. > > In theory it's possible that, as soon as the atomic operation is > performed, the backing (file/anon/whatever) is destroyed and its > underlying id (e.g. inode number) is reused, so that the backing > identified for the original futex address has been reused by this > time. However, it's not a problem because a new waiter can't arrive > while the hash bucket is still locked -- so such a new waiter can't be > woken. I don't disagree with what you said (but I haven't looked at all the source). The point I was trying to make was that the current implementation is not a guarantee -- if we rely on this then we should request the kernel to document that this is allowed usage; that's the only way to make sure this keeps working. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-09-18 17:58 ` Torvald Riegel @ 2014-09-18 20:41 ` Rich Felker 2014-10-30 1:55 ` Darren Hart 0 siblings, 1 reply; 35+ messages in thread From: Rich Felker @ 2014-09-18 20:41 UTC (permalink / raw) To: Torvald Riegel; +Cc: GLIBC Devel, Darren Hart, Michael Kerrisk On Thu, Sep 18, 2014 at 07:57:45PM +0200, Torvald Riegel wrote: > On Thu, 2014-09-18 at 13:39 -0400, Rich Felker wrote: > > On Thu, Sep 18, 2014 at 02:26:35PM +0200, Torvald Riegel wrote: > > > > The EFAULT case with > > > > FUTEX_WAKE, and which I claim FUTEX_WAKE_OP avoids, is when the atomic > > > > operation on the futex int that's associated with the wake allows > > > > another thread to synchronize and determine that it may legally > > > > destroy the object before the actual wake is sent. FUTEX_WAKE_OP can > > > > fully avoid this by performing the atomic operation after looking up > > > > and locking the futex hash bucket, so that there's no further access > > > > after the atomic and thus no opportunity for fault. > > > > > > Agreed; that like what UNLOCK_PI does. However, and that's something > > > I've only thought about recently, it would be good to know which > > > guarantees the kernel gives in this case; in particular, what happens > > > (and which error code results) if there is destruction and potential > > > unmapping etc. of the futex variable concurrently with WAKE_OP or > > > UNLOCK_PI being in flight. > > > > I've RTFS'd and my understanding is that no such problems are > > possible. The futex hashing (note: there are two futex address > > arguments and both are hashed, even if they're equal; this should be > > optimized on the kernel side to make FUTEX_WAKE_OP practical) and > > locking of the resulting hash buckets happens before the atomic > > operation is performed. After the atomic operation, the bucket is > > walked and matching waiters are woken. > > > > In theory it's possible that, as soon as the atomic operation is > > performed, the backing (file/anon/whatever) is destroyed and its > > underlying id (e.g. inode number) is reused, so that the backing > > identified for the original futex address has been reused by this > > time. However, it's not a problem because a new waiter can't arrive > > while the hash bucket is still locked -- so such a new waiter can't be > > woken. > > I don't disagree with what you said (but I haven't looked at all the > source). The point I was trying to make was that the current > implementation is not a guarantee -- if we rely on this then we should > request the kernel to document that this is allowed usage; that's the > only way to make sure this keeps working. Oh, I agree entirely on this. The lack of documentation of the futex interfaces makes it rather unsafe to rely on _anything_ about it right now, except for the vague principle that the kernel isn't supposed to "break userspace" with incompatible changes. I don't think it will be a problem ggettingthis behavior for FUTEX_WAKE_OP documented though. It seems like it's part of the intended semantics. Rich ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-09-18 20:41 ` Rich Felker @ 2014-10-30 1:55 ` Darren Hart 2014-10-30 6:51 ` Carlos O'Donell 0 siblings, 1 reply; 35+ messages in thread From: Darren Hart @ 2014-10-30 1:55 UTC (permalink / raw) To: Rich Felker; +Cc: Torvald Riegel, GLIBC Devel, Michael Kerrisk On Thu, Sep 18, 2014 at 04:40:55PM -0400, Rich Felker wrote: > On Thu, Sep 18, 2014 at 07:57:45PM +0200, Torvald Riegel wrote: > > On Thu, 2014-09-18 at 13:39 -0400, Rich Felker wrote: > > > On Thu, Sep 18, 2014 at 02:26:35PM +0200, Torvald Riegel wrote: > > > > > The EFAULT case with > > > > > FUTEX_WAKE, and which I claim FUTEX_WAKE_OP avoids, is when the atomic > > > > > operation on the futex int that's associated with the wake allows > > > > > another thread to synchronize and determine that it may legally > > > > > destroy the object before the actual wake is sent. FUTEX_WAKE_OP can > > > > > fully avoid this by performing the atomic operation after looking up > > > > > and locking the futex hash bucket, so that there's no further access > > > > > after the atomic and thus no opportunity for fault. > > > > > > > > Agreed; that like what UNLOCK_PI does. However, and that's something > > > > I've only thought about recently, it would be good to know which > > > > guarantees the kernel gives in this case; in particular, what happens > > > > (and which error code results) if there is destruction and potential > > > > unmapping etc. of the futex variable concurrently with WAKE_OP or > > > > UNLOCK_PI being in flight. > > > > > > I've RTFS'd and my understanding is that no such problems are > > > possible. The futex hashing (note: there are two futex address > > > arguments and both are hashed, even if they're equal; this should be > > > optimized on the kernel side to make FUTEX_WAKE_OP practical) and > > > locking of the resulting hash buckets happens before the atomic > > > operation is performed. After the atomic operation, the bucket is > > > walked and matching waiters are woken. > > > > > > In theory it's possible that, as soon as the atomic operation is > > > performed, the backing (file/anon/whatever) is destroyed and its > > > underlying id (e.g. inode number) is reused, so that the backing > > > identified for the original futex address has been reused by this > > > time. However, it's not a problem because a new waiter can't arrive > > > while the hash bucket is still locked -- so such a new waiter can't be > > > woken. > > > > I don't disagree with what you said (but I haven't looked at all the > > source). The point I was trying to make was that the current > > implementation is not a guarantee -- if we rely on this then we should > > request the kernel to document that this is allowed usage; that's the > > only way to make sure this keeps working. > > Oh, I agree entirely on this. The lack of documentation of the futex > interfaces makes it rather unsafe to rely on _anything_ about it right Where do you want to see this documented? Thomas recently added a lot of state documentation to futex.c, and he and I both responded to Michael Kerrisk's request for documented error codes. Are we refering to the man page? If so, I believe that is in the works already. -- Darren Hart Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-10-30 1:55 ` Darren Hart @ 2014-10-30 6:51 ` Carlos O'Donell 0 siblings, 0 replies; 35+ messages in thread From: Carlos O'Donell @ 2014-10-30 6:51 UTC (permalink / raw) To: Darren Hart, Rich Felker; +Cc: Torvald Riegel, GLIBC Devel, Michael Kerrisk On 10/29/2014 09:55 PM, Darren Hart wrote: > On Thu, Sep 18, 2014 at 04:40:55PM -0400, Rich Felker wrote: >> Oh, I agree entirely on this. The lack of documentation of the futex >> interfaces makes it rather unsafe to rely on _anything_ about it right > > Where do you want to see this documented? Thomas recently added a lot of state > documentation to futex.c, and he and I both responded to Michael Kerrisk's > request for documented error codes. Are we refering to the man page? If so, I > believe that is in the works already. Yes, it is in the works, but it isn't done yet. Torvald Riegel was following up on some of this. The linux kernel man pages maintained by Michael Kerrisk is the canonical place we look to for kernel API documentation for linux-specific syscalls. If we can get everything well documented there, then we are off to the races. We want it documented there also so we can agree upon what error codes mean what. Cheers, Carlos. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-09-16 15:36 Futex error handling Torvald Riegel 2014-09-16 16:56 ` Rich Felker @ 2014-09-17 19:41 ` Roland McGrath 2014-09-17 19:46 ` Torvald Riegel 2014-10-20 16:15 ` Futex error handling Torvald Riegel 2 siblings, 1 reply; 35+ messages in thread From: Roland McGrath @ 2014-09-17 19:41 UTC (permalink / raw) To: Torvald Riegel; +Cc: GLIBC Devel, Darren Hart, Michael Kerrisk > We got complains from the kernel side that glibc wouldn't react properly > to futex errors being returned. Thus, I'm looking at what we'd need to > actually improve. I'm using this here as a documentation for futex > error codes: https://lkml.org/lkml/2014/5/15/356 Can somebody work with Michael et al to ensure that the futex(2) man page becomes reliable documentation for this? ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-09-17 19:41 ` Roland McGrath @ 2014-09-17 19:46 ` Torvald Riegel 2014-09-17 19:59 ` Roland McGrath 0 siblings, 1 reply; 35+ messages in thread From: Torvald Riegel @ 2014-09-17 19:46 UTC (permalink / raw) To: Roland McGrath; +Cc: GLIBC Devel, Darren Hart, Michael Kerrisk On Wed, 2014-09-17 at 12:41 -0700, Roland McGrath wrote: > > We got complains from the kernel side that glibc wouldn't react properly > > to futex errors being returned. Thus, I'm looking at what we'd need to > > actually improve. I'm using this here as a documentation for futex > > error codes: https://lkml.org/lkml/2014/5/15/356 > > Can somebody work with Michael et al to ensure that the futex(2) man page > becomes reliable documentation for this? I've asked Michael about the current status and offered my help in case we also want to document more high-level properties of futex, like how to use it properly. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-09-17 19:46 ` Torvald Riegel @ 2014-09-17 19:59 ` Roland McGrath 2014-09-17 23:17 ` Rich Felker 0 siblings, 1 reply; 35+ messages in thread From: Roland McGrath @ 2014-09-17 19:59 UTC (permalink / raw) To: Torvald Riegel; +Cc: GLIBC Devel, Darren Hart, Michael Kerrisk > I've asked Michael about the current status and offered my help in case > we also want to document more high-level properties of futex, like how > to use it properly. Thanks for taking this on. The futex(7) man page seems like the right place for that, and it's rather thin now. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-09-17 19:59 ` Roland McGrath @ 2014-09-17 23:17 ` Rich Felker 2014-10-23 19:16 ` Add futex wrapper to glibc? Carlos O'Donell 0 siblings, 1 reply; 35+ messages in thread From: Rich Felker @ 2014-09-17 23:17 UTC (permalink / raw) To: Roland McGrath; +Cc: Torvald Riegel, GLIBC Devel, Darren Hart, Michael Kerrisk On Wed, Sep 17, 2014 at 12:59:18PM -0700, Roland McGrath wrote: > > I've asked Michael about the current status and offered my help in case > > we also want to document more high-level properties of futex, like how > > to use it properly. > > Thanks for taking this on. The futex(7) man page seems like the right > place for that, and it's rather thin now. I would really like to expose futex(2) as a public interface and see it documented there.. :-) I know that's something of a separate topic, but it seems like a good time to discuss since the documentation is going to be worked on, and since C11 atomics make having futex() available to applications A LOT more desirable. Rich ^ permalink raw reply [flat|nested] 35+ messages in thread
* Add futex wrapper to glibc? 2014-09-17 23:17 ` Rich Felker @ 2014-10-23 19:16 ` Carlos O'Donell 2014-10-23 21:07 ` Joseph S. Myers ` (2 more replies) 0 siblings, 3 replies; 35+ messages in thread From: Carlos O'Donell @ 2014-10-23 19:16 UTC (permalink / raw) To: Rich Felker, Roland McGrath Cc: Torvald Riegel, GLIBC Devel, Darren Hart, Michael Kerrisk On 09/17/2014 07:17 PM, Rich Felker wrote: > On Wed, Sep 17, 2014 at 12:59:18PM -0700, Roland McGrath wrote: >>> I've asked Michael about the current status and offered my help in case >>> we also want to document more high-level properties of futex, like how >>> to use it properly. >> >> Thanks for taking this on. The futex(7) man page seems like the right >> place for that, and it's rather thin now. > > I would really like to expose futex(2) as a public interface and see > it documented there.. :-) I know that's something of a separate topic, > but it seems like a good time to discuss since the documentation is > going to be worked on, and since C11 atomics make having futex() > available to applications A LOT more desirable. We've had this discussion before, but I can't find the reference in the libc-alpha archives. The largest problem right now is lack of documentation and that becomes lack of commitment from the kernel to maintain the API/ABI, and that translates into reluctance for glibc to create a wrapper. We are IMO at the stage where futex is stable, few things are changing, and with documentation in place, I would consider adding a futex wrapper. However, per other discussions on errors, I'd actually add a non-trivial C wrapper to enforce that the kernel does not return error codes that are unexpected. Roland expressed some opinions in the past. I think, because I can't find the reference, they were along the lines that: the value of all of these things being in glibc is suspect. I'm OK with distinct symbol sets between supported OSs and I understand that for GNU/Linux that glibc *is* part C library and part Linux library. Cheers, Carlos. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Add futex wrapper to glibc? 2014-10-23 19:16 ` Add futex wrapper to glibc? Carlos O'Donell @ 2014-10-23 21:07 ` Joseph S. Myers 2014-10-24 2:00 ` Carlos O'Donell ` (2 more replies) 2014-10-23 21:12 ` Rich Felker 2014-10-30 1:59 ` Darren Hart 2 siblings, 3 replies; 35+ messages in thread From: Joseph S. Myers @ 2014-10-23 21:07 UTC (permalink / raw) To: Carlos O'Donell Cc: Rich Felker, Roland McGrath, Torvald Riegel, GLIBC Devel, Darren Hart, Michael Kerrisk On Thu, 23 Oct 2014, Carlos O'Donell wrote: > > I would really like to expose futex(2) as a public interface and see > > it documented there.. :-) I know that's something of a separate topic, > > but it seems like a good time to discuss since the documentation is > > going to be worked on, and since C11 atomics make having futex() > > available to applications A LOT more desirable. > > We've had this discussion before, but I can't find the reference in > the libc-alpha archives. See bug 9712 (and comment 4 therein where I pointed to past discussions of the general question of which syscall wrappers to add). > The largest problem right now is lack of documentation and that becomes > lack of commitment from the kernel to maintain the API/ABI, and that > translates into reluctance for glibc to create a wrapper. That's not a plausible reason. The presumption is that if any interface provided by the kernel could plausibly be used by userspace, it must stay stable. We need to reach consensus on when to add syscall wrappers to glibc (and I suppose on how to determine whether a given syscall wrapper should be a cancellation point). Possible considerations include: * If a syscall is obsoleted by another syscall (or otherwise considered obsolete), there is no need to add a wrapper to glibc. * If a syscall cannot meaningfully be used behind glibc's back, or is not useful in the glibc context except for in the ways in which it is used by glibc, but can only be used directly by glibc, there is no need to add a wrapper to glibc (this probably applies to set_thread_area, for example). * If there's a glibc function that's not quite a direct wrapper of the syscall but provides all the functionality of it that can usefully be used in a program using glibc, there is no need to add a wrapper to glibc. * Otherwise, my inclination would be to default to adding wrappers (both for syscalls not used in glibc, and for cases such as futex where the syscall is used in glibc but can usefully be used directly as well) unless there is a clear reason not to. This includes for architecture-specific syscalls. * If a wrapper is provided, there should also be a header with corresponding declarations (that either provides any relevant constants / structures or includes appropriate kernel neaders providing them). * There may be cases where a function is meaningful not just with Linux kernels supporting the syscall, and should be provided with some kind of fallback for older kernels, and possibly made part of the glibc API for systems with other kernels as well. * If we do get consensus on principles, we still need someone to go through current syscall lists and work out a more detailed proposal for which syscalls to add to glibc, then actually add them once there is consensus on those sets. > We are IMO at the stage where futex is stable, few things are changing, > and with documentation in place, I would consider adding a futex wrapper. > However, per other discussions on errors, I'd actually add a non-trivial > C wrapper to enforce that the kernel does not return error codes that > are unexpected. I don't think that makes sense for futex. I think such checks only make sense where glibc is making use of futex within its own code - there it may sometimes be useful to sanity-check results match glibc's expectations. I also think it's desirable for any such syscall wrapper to be usable with futex operations that may be added to the kernel in future (i.e. someone should only need new kernel headers to build code using new operations, not a new glibc binary that knows how to check things for those new operations). -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Add futex wrapper to glibc? 2014-10-23 21:07 ` Joseph S. Myers @ 2014-10-24 2:00 ` Carlos O'Donell 2014-10-26 7:19 ` Mike Frysinger 2014-10-30 2:03 ` Darren Hart 2 siblings, 0 replies; 35+ messages in thread From: Carlos O'Donell @ 2014-10-24 2:00 UTC (permalink / raw) To: Joseph S. Myers Cc: Rich Felker, Roland McGrath, Torvald Riegel, GLIBC Devel, Darren Hart, Michael Kerrisk On 10/23/2014 05:06 PM, Joseph S. Myers wrote: >> We are IMO at the stage where futex is stable, few things are changing, >> and with documentation in place, I would consider adding a futex wrapper. >> However, per other discussions on errors, I'd actually add a non-trivial >> C wrapper to enforce that the kernel does not return error codes that >> are unexpected. > > I don't think that makes sense for futex. I think such checks only make > sense where glibc is making use of futex within its own code - there it > may sometimes be useful to sanity-check results match glibc's > expectations. I also think it's desirable for any such syscall wrapper to > be usable with futex operations that may be added to the kernel in future > (i.e. someone should only need new kernel headers to build code using new > operations, not a new glibc binary that knows how to check things for > those new operations). I concede. My worry is the mess with sched_[sg]etaffinity where the kernel syscall and userspace API are different. Your point " * If there's a glibc function that's not quite a direct wrapper of the syscall but provides all the functionality of it that can usefully be used in a program using glibc, there is no need to add a wrapper to glibc." covers that though. I've added your notes to the wiki: https://sourceware.org/glibc/wiki/Consensus#WIP:_Kernel_syscalls_wrappers Cheers, Carlos. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Add futex wrapper to glibc? 2014-10-23 21:07 ` Joseph S. Myers 2014-10-24 2:00 ` Carlos O'Donell @ 2014-10-26 7:19 ` Mike Frysinger 2014-10-27 17:52 ` Joseph S. Myers ` (2 more replies) 2014-10-30 2:03 ` Darren Hart 2 siblings, 3 replies; 35+ messages in thread From: Mike Frysinger @ 2014-10-26 7:19 UTC (permalink / raw) To: Joseph S. Myers Cc: Carlos O'Donell, Rich Felker, Roland McGrath, Torvald Riegel, GLIBC Devel, Darren Hart, Michael Kerrisk [-- Attachment #1: Type: text/plain, Size: 2662 bytes --] On 23 Oct 2014 21:06, Joseph S. Myers wrote: > On Thu, 23 Oct 2014, Carlos O'Donell wrote: > > The largest problem right now is lack of documentation and that becomes > > lack of commitment from the kernel to maintain the API/ABI, and that > > translates into reluctance for glibc to create a wrapper. > > That's not a plausible reason. The presumption is that if any interface > provided by the kernel could plausibly be used by userspace, it must stay > stable. sure, but Carlos is speaking in this specific case wrt futex, and it def suffers from lack of documentation everywhere. which provides quite a good amount of wiggle room for the kernel peeps :). > * Otherwise, my inclination would be to default to adding wrappers (both > for syscalls not used in glibc, and for cases such as futex where the > syscall is used in glibc but can usefully be used directly as well) unless > there is a clear reason not to. This includes for architecture-specific > syscalls. i disagree here. if the func is not going to be generally useful, i don't think glibc should pick it up. especially for arch-specific syscalls. i think the kernel itself is much better suited for providing a thin wrapper layer for userspace as they're directly in control of the ABI (especially cross-arch) and the API (via the UAPI headers), they can provide a full API for all syscalls (unlike glibc per the caveats listed above), and it means there's a central source point for everyone to use, not just glibc. if the syscall is generally useful to people writing userland code, then i don't see a problem with incorporating it in some fashion into the official GNU API. > * If a wrapper is provided, there should also be a header with > corresponding declarations (that either provides any relevant constants / > structures or includes appropriate kernel neaders providing them). absolutely > * There may be cases where a function is meaningful not just with Linux > kernels supporting the syscall, and should be provided with some kind of > fallback for older kernels, and possibly made part of the glibc API for > systems with other kernels as well. i think any syscall we think about adopting, this should be given serious consideration. ideally glibc's API should be stable across platforms. if the Linux syscall is so great as to warrant us adding a wrapper, then it stands to reason that equivalent functionality should be desirable on other platforms. it doesn't mean we need to wait for them to implement the kernel side, but we shouldn't be baking in pure Linuxisms from the start. -mike [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Add futex wrapper to glibc? 2014-10-26 7:19 ` Mike Frysinger @ 2014-10-27 17:52 ` Joseph S. Myers 2014-10-27 20:28 ` Roland McGrath 2014-10-30 2:07 ` Darren Hart 2 siblings, 0 replies; 35+ messages in thread From: Joseph S. Myers @ 2014-10-27 17:52 UTC (permalink / raw) To: Mike Frysinger Cc: Carlos O'Donell, Rich Felker, Roland McGrath, Torvald Riegel, GLIBC Devel, Darren Hart, Michael Kerrisk On Sun, 26 Oct 2014, Mike Frysinger wrote: > > * Otherwise, my inclination would be to default to adding wrappers (both > > for syscalls not used in glibc, and for cases such as futex where the > > syscall is used in glibc but can usefully be used directly as well) unless > > there is a clear reason not to. This includes for architecture-specific > > syscalls. > > i disagree here. if the func is not going to be generally useful, i > don't think glibc should pick it up. especially for arch-specific > syscalls. i think the kernel itself is much better suited for providing > a thin wrapper layer for userspace as they're directly in control of the > ABI (especially cross-arch) and the API (via the UAPI headers), they can > provide a full API for all syscalls (unlike glibc per the caveats listed > above), and it means there's a central source point for everyone to use, > not just glibc. I don't think the kernel is well-suited for this; various issues around building objects for all the different ABIs, including ABI differences that are only relevant at the userspace level and not the kernel level, don't arise for kernel code (and only to a limited extent for vDSOs) - and then you get into things such as providing unwind / debug info for syscall stubs. If the syscall is to be a cancellation point (and it was suggested the futex wrapper should be, for certain arguments) then the syscall implementation closely ties into libc internals (and if it sets errno, that means the object code for the implementation depends on libc, even if it would be possible to write C source code for it that does not). > if the syscall is generally useful to people writing userland code, then > i don't see a problem with incorporating it in some fashion into the > official GNU API. Sometimes "access this Linux kernel interface" is generally useful (e.g. gettid, futex - and plenty of existing syscall wrappers). > i think any syscall we think about adopting, this should be given > serious consideration. ideally glibc's API should be stable across > platforms. if the Linux syscall is so great as to warrant us adding a > wrapper, then it stands to reason that equivalent functionality should > be desirable on other platforms. it doesn't mean we need to wait for > them to implement the kernel side, but we shouldn't be baking in pure > Linuxisms from the start. -mike I don't think we actually need to decide at the time of adding a syscall wrapper whether it would be useful on other platforms - if we decide that later, we can always add the ENOSYS stub for Hurd use, or fallbacks for older Linux kernel versions, at that point. (If we wish to use a function internally in glibc in OS-independent code, that would be a good indication that it should be made an OS-independent interface.) Nor do I think we can reliably predict whether an interface is suitable to be OS-independent from whether it is generally useful with the Linux kernel - it may be necessary to see how several different OSes approach an issue before a good conclusion can be reached on what a portable OS-independent interface should look like. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Add futex wrapper to glibc? 2014-10-26 7:19 ` Mike Frysinger 2014-10-27 17:52 ` Joseph S. Myers @ 2014-10-27 20:28 ` Roland McGrath 2014-10-30 2:07 ` Darren Hart 2 siblings, 0 replies; 35+ messages in thread From: Roland McGrath @ 2014-10-27 20:28 UTC (permalink / raw) To: Mike Frysinger Cc: Joseph S. Myers, Carlos O'Donell, Rich Felker, Torvald Riegel, GLIBC Devel, Darren Hart, Michael Kerrisk I generally concur with Mike's points. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Add futex wrapper to glibc? 2014-10-26 7:19 ` Mike Frysinger 2014-10-27 17:52 ` Joseph S. Myers 2014-10-27 20:28 ` Roland McGrath @ 2014-10-30 2:07 ` Darren Hart 2014-10-30 9:32 ` Torvald Riegel 2 siblings, 1 reply; 35+ messages in thread From: Darren Hart @ 2014-10-30 2:07 UTC (permalink / raw) To: Joseph S. Myers, Carlos O'Donell, Rich Felker, Roland McGrath, Torvald Riegel, GLIBC Devel, Michael Kerrisk On Sun, Oct 26, 2014 at 03:19:22AM -0400, Mike Frysinger wrote: > On 23 Oct 2014 21:06, Joseph S. Myers wrote: > > On Thu, 23 Oct 2014, Carlos O'Donell wrote: > > > The largest problem right now is lack of documentation and that becomes > > > lack of commitment from the kernel to maintain the API/ABI, and that > > > translates into reluctance for glibc to create a wrapper. > > > > That's not a plausible reason. The presumption is that if any interface > > provided by the kernel could plausibly be used by userspace, it must stay > > stable. > > sure, but Carlos is speaking in this specific case wrt futex, and it def suffers > from lack of documentation everywhere. which provides quite a good amount of > wiggle room for the kernel peeps :). This kernel peep, and I think I can speak for tglx as well, is not looking for wiggle room here. We've provide Michael Kerrisk with the return values and error conditions. I for one treat this as binding. > > > * Otherwise, my inclination would be to default to adding wrappers (both > > for syscalls not used in glibc, and for cases such as futex where the > > syscall is used in glibc but can usefully be used directly as well) unless > > there is a clear reason not to. This includes for architecture-specific > > syscalls. > > i disagree here. if the func is not going to be generally useful, i don't think > glibc should pick it up. especially for arch-specific syscalls. i think the > kernel itself is much better suited for providing a thin wrapper layer for > userspace as they're directly in control of the ABI (especially cross-arch) and > the API (via the UAPI headers), they can provide a full API for all syscalls > (unlike glibc per the caveats listed above), and it means there's a central > source point for everyone to use, not just glibc. > > if the syscall is generally useful to people writing userland code, then i don't > see a problem with incorporating it in some fashion into the official GNU API. > Agreed, and we have at least a couple examples of futex usage outside of glibc. (futextest - OK, that one's self-serving, librcu, and I've heard of others) -- Darren Hart Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Add futex wrapper to glibc? 2014-10-30 2:07 ` Darren Hart @ 2014-10-30 9:32 ` Torvald Riegel 0 siblings, 0 replies; 35+ messages in thread From: Torvald Riegel @ 2014-10-30 9:32 UTC (permalink / raw) To: Darren Hart Cc: Joseph S. Myers, Carlos O'Donell, Rich Felker, Roland McGrath, GLIBC Devel, Michael Kerrisk On Wed, 2014-10-29 at 19:07 -0700, Darren Hart wrote: > On Sun, Oct 26, 2014 at 03:19:22AM -0400, Mike Frysinger wrote: > > On 23 Oct 2014 21:06, Joseph S. Myers wrote: > > > On Thu, 23 Oct 2014, Carlos O'Donell wrote: > > > * Otherwise, my inclination would be to default to adding wrappers (both > > > for syscalls not used in glibc, and for cases such as futex where the > > > syscall is used in glibc but can usefully be used directly as well) unless > > > there is a clear reason not to. This includes for architecture-specific > > > syscalls. > > > > i disagree here. if the func is not going to be generally useful, i don't think > > glibc should pick it up. especially for arch-specific syscalls. i think the > > kernel itself is much better suited for providing a thin wrapper layer for > > userspace as they're directly in control of the ABI (especially cross-arch) and > > the API (via the UAPI headers), they can provide a full API for all syscalls > > (unlike glibc per the caveats listed above), and it means there's a central > > source point for everyone to use, not just glibc. > > > > if the syscall is generally useful to people writing userland code, then i don't > > see a problem with incorporating it in some fashion into the official GNU API. > > > > Agreed, and we have at least a couple examples of futex usage outside of glibc. > > (futextest - OK, that one's self-serving, librcu, and I've heard of others) Yes, there are and will be others, for example GCC's libitm, or C++'s std::synchronic<T> if it gets accepted. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Add futex wrapper to glibc? 2014-10-23 21:07 ` Joseph S. Myers 2014-10-24 2:00 ` Carlos O'Donell 2014-10-26 7:19 ` Mike Frysinger @ 2014-10-30 2:03 ` Darren Hart 2 siblings, 0 replies; 35+ messages in thread From: Darren Hart @ 2014-10-30 2:03 UTC (permalink / raw) To: Joseph S. Myers Cc: Carlos O'Donell, Rich Felker, Roland McGrath, Torvald Riegel, GLIBC Devel, Michael Kerrisk On Thu, Oct 23, 2014 at 09:06:17PM +0000, Joseph S. Myers wrote: > On Thu, 23 Oct 2014, Carlos O'Donell wrote: > > > We are IMO at the stage where futex is stable, few things are changing, > > and with documentation in place, I would consider adding a futex wrapper. > > However, per other discussions on errors, I'd actually add a non-trivial > > C wrapper to enforce that the kernel does not return error codes that > > are unexpected. > > I don't think that makes sense for futex. I think such checks only make > sense where glibc is making use of futex within its own code - there it > may sometimes be useful to sanity-check results match glibc's > expectations. I also think it's desirable for any such syscall wrapper to > be usable with futex operations that may be added to the kernel in future > (i.e. someone should only need new kernel headers to build code using new > operations, not a new glibc binary that knows how to check things for > those new operations). Agreed, let the caller deal with the return values. When used directly, glibc should not be acting on any of the return codes as they are needed by the implementation (which is outside of glibc). -- Darren Hart Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Add futex wrapper to glibc? 2014-10-23 19:16 ` Add futex wrapper to glibc? Carlos O'Donell 2014-10-23 21:07 ` Joseph S. Myers @ 2014-10-23 21:12 ` Rich Felker 2014-10-30 1:59 ` Darren Hart 2 siblings, 0 replies; 35+ messages in thread From: Rich Felker @ 2014-10-23 21:12 UTC (permalink / raw) To: Carlos O'Donell Cc: Roland McGrath, Torvald Riegel, GLIBC Devel, Darren Hart, Michael Kerrisk On Thu, Oct 23, 2014 at 03:16:07PM -0400, Carlos O'Donell wrote: > On 09/17/2014 07:17 PM, Rich Felker wrote: > > On Wed, Sep 17, 2014 at 12:59:18PM -0700, Roland McGrath wrote: > >>> I've asked Michael about the current status and offered my help in case > >>> we also want to document more high-level properties of futex, like how > >>> to use it properly. > >> > >> Thanks for taking this on. The futex(7) man page seems like the right > >> place for that, and it's rather thin now. > > > > I would really like to expose futex(2) as a public interface and see > > it documented there.. :-) I know that's something of a separate topic, > > but it seems like a good time to discuss since the documentation is > > going to be worked on, and since C11 atomics make having futex() > > available to applications A LOT more desirable. > > We've had this discussion before, but I can't find the reference in > the libc-alpha archives. Perhaps on the bug tracker. IIRC there's at least one duplicate bug report for it too, and the discussions are probably buried in one of them. > The largest problem right now is lack of documentation and that becomes > lack of commitment from the kernel to maintain the API/ABI, and that > translates into reluctance for glibc to create a wrapper. The kernel's commitment to maintain API/ABI is simply a consequence of all the pthread primitives depending on the behavior. This is at least as strong as documentation, but I agree documentation would be nice too, especially for less-known aspects that glibc is not depending on right now. > We are IMO at the stage where futex is stable, few things are changing, > and with documentation in place, I would consider adding a futex wrapper. > However, per other discussions on errors, I'd actually add a non-trivial > C wrapper to enforce that the kernel does not return error codes that > are unexpected. Is there a reason for this? I don't see it being done on any other Linux-specific functions glibc provides syscall wrappers for. > Roland expressed some opinions in the past. I think, because I can't find > the reference, they were along the lines that: the value of all of these > things being in glibc is suspect. I don't think it's any more suspect than epoll, eventfd, fanotify, sendfile, etc. syscall wrappers being in glibc. These are all Linux APIs that are useful to applications. Recommending these to be handled in third-party libraries is not a good idea. Not only does it perpetuate the use of syscall() (which has ugly variadic calling issues and issues with argument type on x32 and possibly other future 32-on-64 type ABIs), but it precludes applications from being able to use them with cancellation. One of the bug reports requesting futex() is actually about having it as a cancellation point. IMO it would make sense for just the wait forms to be cancellation points, much like how fcntl is a cancellation point only for blocking lock operations. > I'm OK with distinct symbol sets between supported OSs and I understand > that for GNU/Linux that glibc *is* part C library and part Linux library. Yes. BTW I think it's fundamentally possible to do non-process-shared futex in userspace if you do want to support it on non-Linux targets. You just do the equivalent hash buckets/waiter lists wrapped with a mutex and cond var. Not to say this is a good idea, but it's possible. Rich ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Add futex wrapper to glibc? 2014-10-23 19:16 ` Add futex wrapper to glibc? Carlos O'Donell 2014-10-23 21:07 ` Joseph S. Myers 2014-10-23 21:12 ` Rich Felker @ 2014-10-30 1:59 ` Darren Hart 2014-10-30 2:44 ` Rich Felker 2014-10-30 2:55 ` Carlos O'Donell 2 siblings, 2 replies; 35+ messages in thread From: Darren Hart @ 2014-10-30 1:59 UTC (permalink / raw) To: Carlos O'Donell Cc: Rich Felker, Roland McGrath, Torvald Riegel, GLIBC Devel, Michael Kerrisk On Thu, Oct 23, 2014 at 03:16:07PM -0400, Carlos O'Donell wrote: > On 09/17/2014 07:17 PM, Rich Felker wrote: > > On Wed, Sep 17, 2014 at 12:59:18PM -0700, Roland McGrath wrote: > >>> I've asked Michael about the current status and offered my help in case > >>> we also want to document more high-level properties of futex, like how > >>> to use it properly. > >> > >> Thanks for taking this on. The futex(7) man page seems like the right > >> place for that, and it's rather thin now. > > > > I would really like to expose futex(2) as a public interface and see > > it documented there.. :-) I know that's something of a separate topic, > > but it seems like a good time to discuss since the documentation is > > going to be worked on, and since C11 atomics make having futex() > > available to applications A LOT more desirable. > > We've had this discussion before, but I can't find the reference in > the libc-alpha archives. > > The largest problem right now is lack of documentation and that becomes > lack of commitment from the kernel to maintain the API/ABI, and that > translates into reluctance for glibc to create a wrapper. I would like to see this as well so I (and others, like Mathieu Desnoyers) don't have to do our own syscall wrapper in futextests and librcu, for example. > > We are IMO at the stage where futex is stable, few things are changing, > and with documentation in place, I would consider adding a futex wrapper. Yes, at least for the defined OP codes. New OPs may be added of course, but that isn't a concern for supporting what exists today, and doesn't break compatibility. I wonder though... can we not wrap FUTEX_REQUEUE? It's fundamentally broken. FUTEX_CMP_REQUEUE should *always* be used instead. The glibc wrapper is one way to encourage developers to do the right thing (don't expose the bad op in the header). -- Darren Hart Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Add futex wrapper to glibc? 2014-10-30 1:59 ` Darren Hart @ 2014-10-30 2:44 ` Rich Felker 2014-10-30 2:55 ` Carlos O'Donell 1 sibling, 0 replies; 35+ messages in thread From: Rich Felker @ 2014-10-30 2:44 UTC (permalink / raw) To: Darren Hart Cc: Carlos O'Donell, Roland McGrath, Torvald Riegel, GLIBC Devel, Michael Kerrisk On Wed, Oct 29, 2014 at 06:59:15PM -0700, Darren Hart wrote: > > We are IMO at the stage where futex is stable, few things are changing, > > and with documentation in place, I would consider adding a futex wrapper. > > Yes, at least for the defined OP codes. New OPs may be added of course, but that > isn't a concern for supporting what exists today, and doesn't break > compatibility. > > I wonder though... can we not wrap FUTEX_REQUEUE? It's fundamentally broken. > FUTEX_CMP_REQUEUE should *always* be used instead. The glibc wrapper is one way > to encourage developers to do the right thing (don't expose the bad op in the > header). You're mistaken here. There are plenty of valid ways to use FUTEX_REQUEUE - for example if the calling thread is requeuing the target(s) to a lock that the calling thread owns. Just because it doesn't meet the needs of the way glibc was using it internally doesn't mean it's useless for other applications. In any case, I don't think there's a proposal to intercept/modify the commands to futex, just to pass them through (and possibly do a cancellable syscall for some of them). Rich ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Add futex wrapper to glibc? 2014-10-30 1:59 ` Darren Hart 2014-10-30 2:44 ` Rich Felker @ 2014-10-30 2:55 ` Carlos O'Donell 2014-10-30 3:26 ` Rich Felker 2014-10-30 17:09 ` Joseph S. Myers 1 sibling, 2 replies; 35+ messages in thread From: Carlos O'Donell @ 2014-10-30 2:55 UTC (permalink / raw) To: Darren Hart Cc: Rich Felker, Roland McGrath, Torvald Riegel, GLIBC Devel, Michael Kerrisk On 10/29/2014 09:59 PM, Darren Hart wrote: > I wonder though... can we not wrap FUTEX_REQUEUE? It's fundamentally broken. > FUTEX_CMP_REQUEUE should *always* be used instead. The glibc wrapper is one way > to encourage developers to do the right thing (don't expose the bad op in the > header). We can do whatever you want, but you should document it as broken in the linux kernel man pages if it is not already. Then when we add the wrapper we'll make sure it's not there. The WIP consensus is that deprecated kernel syscalls or features will not be part of wrappers. Cheers, Carlos. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Add futex wrapper to glibc? 2014-10-30 2:55 ` Carlos O'Donell @ 2014-10-30 3:26 ` Rich Felker 2014-10-30 17:09 ` Joseph S. Myers 1 sibling, 0 replies; 35+ messages in thread From: Rich Felker @ 2014-10-30 3:26 UTC (permalink / raw) To: Carlos O'Donell Cc: Darren Hart, Roland McGrath, Torvald Riegel, GLIBC Devel, Michael Kerrisk On Wed, Oct 29, 2014 at 10:54:19PM -0400, Carlos O'Donell wrote: > On 10/29/2014 09:59 PM, Darren Hart wrote: > > I wonder though... can we not wrap FUTEX_REQUEUE? It's fundamentally broken. > > FUTEX_CMP_REQUEUE should *always* be used instead. The glibc wrapper is one way > > to encourage developers to do the right thing (don't expose the bad op in the > > header). > > We can do whatever you want, but you should document it as broken in the > linux kernel man pages if it is not already. Then when we add the wrapper > we'll make sure it's not there. > > The WIP consensus is that deprecated kernel syscalls or features will > not be part of wrappers. As I mentioned, it's not broken; it's fine to use as long as there's no possible race on the condition for whether the target thread(s) should wake up or wait on the futex, and there are actually situations where it's preferable (yields better performance by avoiding spurious EAGAIN) than the CMP version. I can give examples if anyone's interested. Rich ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Add futex wrapper to glibc? 2014-10-30 2:55 ` Carlos O'Donell 2014-10-30 3:26 ` Rich Felker @ 2014-10-30 17:09 ` Joseph S. Myers 1 sibling, 0 replies; 35+ messages in thread From: Joseph S. Myers @ 2014-10-30 17:09 UTC (permalink / raw) To: Carlos O'Donell Cc: Darren Hart, Rich Felker, Roland McGrath, Torvald Riegel, GLIBC Devel, Michael Kerrisk On Wed, 29 Oct 2014, Carlos O'Donell wrote: > The WIP consensus is that deprecated kernel syscalls or features will > not be part of wrappers. Deprecated *syscalls* will not have wrappers. If a particular syscall is to have a wrapper, I don't think there's any consensus for glibc to try to filter out deprecated uses and only pass through non-deprecated ones (and wrappers should definitely allow through unknown flags, futex operations etc., rather than requiring glibc to have a full understanding of valid arguments to the syscall). -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-09-16 15:36 Futex error handling Torvald Riegel 2014-09-16 16:56 ` Rich Felker 2014-09-17 19:41 ` Roland McGrath @ 2014-10-20 16:15 ` Torvald Riegel 2014-10-21 22:34 ` Roland McGrath 2014-10-23 19:35 ` Carlos O'Donell 2 siblings, 2 replies; 35+ messages in thread From: Torvald Riegel @ 2014-10-20 16:15 UTC (permalink / raw) To: GLIBC Devel; +Cc: Darren Hart On Tue, 2014-09-16 at 17:36 +0200, Torvald Riegel wrote: > We got complains from the kernel side that glibc wouldn't react properly > to futex errors being returned. Thus, I'm looking at what we'd need to > actually improve. I'm using this here as a documentation for futex > error codes: https://lkml.org/lkml/2014/5/15/356 > > Generally, we have three categories of faults (ie, the cause for an > error/failure): > * Bug in glibc ("BL") > * Bug in the client program ("BP") > * Failures that are neither a bug in glibc nor the program ("F") > > Also, there are cases where it's not a "real" failure, but just > something that is expected behavior that needs to be handled ("NF"). > > I'm not aware of a general policy about whether glibc should abort or > assert (ie, abort only with assertion checks enabled) when the fault is > in the BL or BP categories. I'd say we don't, because there's no way to > handle it anyway, and other things will likely go wrong; but I don't > have a strong opinion. Thoughts? > > For every futex op, here's a list of how I'd categorize the possible > error codes (I'm ignoring ENOSYS, which is NF when feature testing (or > BL)): > > FUTEX_WAIT: > * EFAULT is either BL or BP. Nothing we can do. Should have failed > earlier when we accessed the futex variable. > * EINVAL (alignment and timeout normalization) is BL/BP. > * EWOULDBLOCK, ETIMEDOUT are NF. > > FUTEX_WAKE, FUTEX_WAKE_OP: > * EFAULT can be BL/BP *or* NF, so we *must not* abort or assert in this > case. This is due to how futexes work when combined with certain rules > for destruction of the underlying synchronization data structure; see my > description of the mutex destruction issue (but this can happen with > other data structures such as semaphores or cond vars too): > https://sourceware.org/ml/libc-alpha/2014-04/msg00075.html > * EINVAL (futex alignment) is BL/BP. > * EINVAL (inconsistent state or hit a PI futex) can be either BL/BP *or* > NF. The latter is caused by the mutex destruction issue, only that a > pending FUTEX_WAKE after destruction doesn't hit an inaccessible memory > location but one which has been reused for a PI futex. Thus, we must > not abort or assert in this case. > > FUTEX_REQUEUE: > * Like FUTEX_WAKE, except that it's not safe to use concurrently with > possible destruction / reuse of the futex memory (because requeueing to > a futex that's unrelated to the new futex located in reused memory is > bad). > > FUTEX_REQUEUE_CMP: > * Like FUTEX_REQUEUE. EAGAIN is NF. > > FUTEX_WAKE_OP: > * Haven't looked at this yet. Only used in condvars, and might not be > necessary for a condvar that's not based on a condvar-internal lock. > > FUTEX_WAIT_BITSET / FUTEX_WAKE_BITSET: > * Like FUTEX_WAIT / FUTEX_WAKE. The additional EINVAL is BL. > > FUTEX_LOCK_PI, FUTEX_TRYLOCK_PI: > * EFAULT is BL/BP. > * ENOMEM is F. We need to handle this. > * EINVAL, EPERM, ESRCH are BL/BP. > * EAGAIN and ETIMEDOUT are NF. > * EDEADLOCK is BP (or BL). > * EOWNERDIED is F. > > FUTEX_UNLOCK_PI: > (* I guess this can return EFAULT too, which is BL/BP.) > * EINVAL and EPERM are BL/BP. I don't think there's a mutex destruction > issue with PI locks because the kernel takes care of both resetting the > value of the futex var and waking up threads; it should do so in a way > that won't access reused memory. I guess we should check that though... > > FUTEX_WAIT_REQUEUE_PI: > * EFAULT and EINVAL are BL/BP. > * EWOULDBLOCK and ETIMEDOUT are NF. > * EOWNERDIED is F. > > FUTEX_CMP_REQUEUE_PI is like FUTEX_CMP_REQUEUE except: > * ENOMEM is F. > * EPERM and ESRCH are BL/BP. > * EDEADLOCK is BP (or BL). > > > I think the next steps to improve this should be: > 1) Getting consensus on how we want to handle BL and BP in general. Ping. I think that's a fairly generic issue (in the sense of likely being a question for other things than just mutexes too), so I'd like some input on the direction. Carlos, Joseph, Roland: Do you have any comments? > 2) Applying the outcome of that to the list above and getting consensus > on the result. > 3) For each case of F, find the best way to report it to the caller > (e.g., error code from the pthreads function, abort, ...). > 4) Change each use of the futexes accordingly, one at a time. > > I've asked Michael Kerrisk for the state of the futex error docs, but > haven't gotten a reply yet. (Last time I checked, the new input from > the email I referred to above wasn't part of the futex docs yet.) Michael has replied, and it's on his list of things to do. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-10-20 16:15 ` Futex error handling Torvald Riegel @ 2014-10-21 22:34 ` Roland McGrath 2014-10-23 2:37 ` Carlos O'Donell 2014-10-23 19:35 ` Carlos O'Donell 1 sibling, 1 reply; 35+ messages in thread From: Roland McGrath @ 2014-10-21 22:34 UTC (permalink / raw) To: Torvald Riegel; +Cc: GLIBC Devel, Darren Hart The high-level comment is that we have always favored having actual bugs cause quick and complete failure. If it's a bug in libc, then it should fail early and catastrophically so that we find out about the bug as soon as possible. That trades off against any runtime cost of detecting the case. If it's cheap to detect, then detect it. If it's not so cheap, then don't pay the cost because we don't expect that we'll have the bug. Using assert is a middle ground for things that have enough cost that we don't just leave them in all the time, but little enough that there's still any question about it. I don't know what the distribution of NDEBUG use is like across distributions. If every distribution builds production libc with NDEBUG then in practice assert will not catch any real-world problems and it shouldn't really count as runtime detection, because only people developing libc will ever see it. If it's user code invoking undefined behavior, then it should fail early and catastrophically so that developers don't get the false impression that their code is OK when it happens not to break the use cases they test adequately. (Said another way, so that we avoid giving developers an excuse to complain when a future implementation change "breaks" their programs that were always broken, but theretofore ignorably so.) That too trades off against any runtime cost of detecting the case. I'd say the allowance for cost of detection is marginally higher than in the first case, because we expect user bugs to be more common that libc bugs. But it's still not much, since correct programs performing better is more important to us than buggy programs being easier to debug. Those are generic principles. There's another kind of case I don't think you mentioned, that is especially apropros for the futex operations. That is unexpected results from the kernel. That could of course just be a libc bug that causes its expectations to be wrong. But it could also be a kernel bug, or a new compatibility problem (e.g. some system call starts returning new error codes in a new kernel version that weren't possible when the libc code was written, built, and tested). For those I'm not sure there is any general rule that will really help. It might just require careful consideration case by case for what is the wisest form of future-proofing. Sometimes, propagating whatever error the kernel gave back to the user is clearly the best thing to do. But there might also be situations where an unexpected result means that libc has become confused about what state the kernel left things in, and crashing would be better. And finally, there might well be instances of kernel bugs that we could adequately recognize and work around. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-10-21 22:34 ` Roland McGrath @ 2014-10-23 2:37 ` Carlos O'Donell 2014-10-23 17:48 ` Roland McGrath 0 siblings, 1 reply; 35+ messages in thread From: Carlos O'Donell @ 2014-10-23 2:37 UTC (permalink / raw) To: Roland McGrath, Torvald Riegel; +Cc: GLIBC Devel, Darren Hart On 10/21/2014 06:34 PM, Roland McGrath wrote: > The high-level comment is that we have always favored having actual bugs > cause quick and complete failure. Agreed. > If it's a bug in libc, then it should fail early and catastrophically so > that we find out about the bug as soon as possible. That trades off > against any runtime cost of detecting the case. If it's cheap to detect, > then detect it. If it's not so cheap, then don't pay the cost because we > don't expect that we'll have the bug. Using assert is a middle ground for > things that have enough cost that we don't just leave them in all the time, > but little enough that there's still any question about it. I don't know > what the distribution of NDEBUG use is like across distributions. If every > distribution builds production libc with NDEBUG then in practice assert > will not catch any real-world problems and it shouldn't really count as > runtime detection, because only people developing libc will ever see it. RHEL builds with -DNDEBUG. I expect every distribution does the same. > If it's user code invoking undefined behavior, then it should fail early > and catastrophically so that developers don't get the false impression that > their code is OK when it happens not to break the use cases they test > adequately. (Said another way, so that we avoid giving developers an > excuse to complain when a future implementation change "breaks" their > programs that were always broken, but theretofore ignorably so.) That too > trades off against any runtime cost of detecting the case. I'd say the > allowance for cost of detection is marginally higher than in the first > case, because we expect user bugs to be more common that libc bugs. But > it's still not much, since correct programs performing better is more > important to us than buggy programs being easier to debug. Agreed. Apparently the kernel detection of user bugs is very low cost, the kernel has a broader view of all the locks than glibc does. Thus error returns from glibc should IMO become immediate catastrophic failures if the error indicates undefined behaviour. If the situation is in any way recoverable, we must return the error code to the user. > Those are generic principles. There's another kind of case I don't think > you mentioned, that is especially apropros for the futex operations. That > is unexpected results from the kernel. That could of course just be a libc > bug that causes its expectations to be wrong. But it could also be a > kernel bug, or a new compatibility problem (e.g. some system call starts > returning new error codes in a new kernel version that weren't possible > when the libc code was written, built, and tested). For those I'm not sure > there is any general rule that will really help. It might just require > careful consideration case by case for what is the wisest form of > future-proofing. Sometimes, propagating whatever error the kernel gave > back to the user is clearly the best thing to do. But there might also be > situations where an unexpected result means that libc has become confused > about what state the kernel left things in, and crashing would be better. > And finally, there might well be instances of kernel bugs that we could > adequately recognize and work around. I would prefer that unexpected error codes cause glibc to crash. This immediately alerts kernel and glibc developers of a mismatch in their expectations before this ever gets out of experimental distributions. That's just my preference. Cheers, Carlos. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-10-23 2:37 ` Carlos O'Donell @ 2014-10-23 17:48 ` Roland McGrath 0 siblings, 0 replies; 35+ messages in thread From: Roland McGrath @ 2014-10-23 17:48 UTC (permalink / raw) To: Carlos O'Donell; +Cc: Torvald Riegel, GLIBC Devel, Darren Hart > I would prefer that unexpected error codes cause glibc to crash. This > immediately alerts kernel and glibc developers of a mismatch in their > expectations before this ever gets out of experimental distributions. That's my inclination as well. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-10-20 16:15 ` Futex error handling Torvald Riegel 2014-10-21 22:34 ` Roland McGrath @ 2014-10-23 19:35 ` Carlos O'Donell 2014-10-23 20:47 ` Rich Felker 1 sibling, 1 reply; 35+ messages in thread From: Carlos O'Donell @ 2014-10-23 19:35 UTC (permalink / raw) To: Torvald Riegel, GLIBC Devel; +Cc: Darren Hart On 10/20/2014 12:14 PM, Torvald Riegel wrote: > On Tue, 2014-09-16 at 17:36 +0200, Torvald Riegel wrote: >> We got complains from the kernel side that glibc wouldn't react properly >> to futex errors being returned. Thus, I'm looking at what we'd need to >> actually improve. I'm using this here as a documentation for futex >> error codes: https://lkml.org/lkml/2014/5/15/356 >> >> Generally, we have three categories of faults (ie, the cause for an >> error/failure): >> * Bug in glibc ("BL") >> * Bug in the client program ("BP") >> * Failures that are neither a bug in glibc nor the program ("F") >> >> Also, there are cases where it's not a "real" failure, but just >> something that is expected behavior that needs to be handled ("NF"). >> >> I'm not aware of a general policy about whether glibc should abort or >> assert (ie, abort only with assertion checks enabled) when the fault is >> in the BL or BP categories. I'd say we don't, because there's no way to >> handle it anyway, and other things will likely go wrong; but I don't >> have a strong opinion. Thoughts? >> >> For every futex op, here's a list of how I'd categorize the possible >> error codes (I'm ignoring ENOSYS, which is NF when feature testing (or >> BL)): >> >> FUTEX_WAIT: >> * EFAULT is either BL or BP. Nothing we can do. Should have failed >> earlier when we accessed the futex variable. Abort. >> * EINVAL (alignment and timeout normalization) is BL/BP. Abort. >> * EWOULDBLOCK, ETIMEDOUT are NF. Don't know. WARNING: Aborting on EWOULDBLOCK is dangerous since we might just wish to deadlock the single thread, so a developer can attach with the debugger to inspect thread state while other threads keep running. >> FUTEX_WAKE, FUTEX_WAKE_OP: >> * EFAULT can be BL/BP *or* NF, so we *must not* abort or assert in this >> case. This is due to how futexes work when combined with certain rules >> for destruction of the underlying synchronization data structure; see my >> description of the mutex destruction issue (but this can happen with >> other data structures such as semaphores or cond vars too): >> https://sourceware.org/ml/libc-alpha/2014-04/msg00075.html Don't know. >> * EINVAL (futex alignment) is BL/BP. Abort. >> * EINVAL (inconsistent state or hit a PI futex) can be either BL/BP *or* >> NF. The latter is caused by the mutex destruction issue, only that a >> pending FUTEX_WAKE after destruction doesn't hit an inaccessible memory >> location but one which has been reused for a PI futex. Thus, we must >> not abort or assert in this case. Don't know. >> >> FUTEX_REQUEUE: >> * Like FUTEX_WAKE, except that it's not safe to use concurrently with >> possible destruction / reuse of the futex memory (because requeueing to >> a futex that's unrelated to the new futex located in reused memory is >> bad). Are there any error codes for FUTEX_REQUEUE or are you saying it's identical to FUTEX_WAKE? >> FUTEX_REQUEUE_CMP: >> * Like FUTEX_REQUEUE. EAGAIN is NF. Is EGAIN the only one to consider? Yes, EAGAIN needs some kind of special handling, but I don't know what. >> FUTEX_WAKE_OP: >> * Haven't looked at this yet. Only used in condvars, and might not be >> necessary for a condvar that's not based on a condvar-internal lock. >> >> FUTEX_WAIT_BITSET / FUTEX_WAKE_BITSET: >> * Like FUTEX_WAIT / FUTEX_WAKE. The additional EINVAL is BL. Abort. >> FUTEX_LOCK_PI, FUTEX_TRYLOCK_PI: >> * EFAULT is BL/BP. Abort. >> * ENOMEM is F. We need to handle this. Yes, pass it along. >> * EINVAL, EPERM, ESRCH are BL/BP. Abort. >> * EAGAIN and ETIMEDOUT are NF. Don't know. >> * EDEADLOCK is BP (or BL). See WARNING above. >> * EOWNERDIED is F. Don't know. >> FUTEX_UNLOCK_PI: >> (* I guess this can return EFAULT too, which is BL/BP.) >> * EINVAL and EPERM are BL/BP. I don't think there's a mutex destruction >> issue with PI locks because the kernel takes care of both resetting the >> value of the futex var and waking up threads; it should do so in a way >> that won't access reused memory. I guess we should check that though... Abort. >> FUTEX_WAIT_REQUEUE_PI: >> * EFAULT and EINVAL are BL/BP. Abort. >> * EWOULDBLOCK and ETIMEDOUT are NF. Don't know. >> * EOWNERDIED is F. Don't know. How did we get here? >> FUTEX_CMP_REQUEUE_PI is like FUTEX_CMP_REQUEUE except: >> * ENOMEM is F. Pass it along. >> * EPERM and ESRCH are BL/BP. Abort. >> * EDEADLOCK is BP (or BL). See WARNING above. >> >> >> I think the next steps to improve this should be: >> 1) Getting consensus on how we want to handle BL and BP in general. > > Ping. I think that's a fairly generic issue (in the sense of likely > being a question for other things than just mutexes too), so I'd like > some input on the direction. > > Carlos, Joseph, Roland: Do you have any comments? Roland already responded, but I'll reiterate with wiki updates: (1) Bugs in the GNU C library should fail early and visibly, balanced by runtime detection cost. (2) Bugs in the user program should fail early and visibly, balanced by runtime detection cost in the GNU C library. (3) Unknown error codes from the kernel should cause immediate failure. https://sourceware.org/glibc/wiki/Style_and_Conventions#Error_Handling >> 2) Applying the outcome of that to the list above and getting consensus >> on the result. >> 3) For each case of F, find the best way to report it to the caller >> (e.g., error code from the pthreads function, abort, ...). >> 4) Change each use of the futexes accordingly, one at a time. >> >> I've asked Michael Kerrisk for the state of the futex error docs, but >> haven't gotten a reply yet. (Last time I checked, the new input from >> the email I referred to above wasn't part of the futex docs yet.) > > Michael has replied, and it's on his list of things to do. Hopefully that helps, maybe it doesn't, I answered the easy ones, and perhaps the first step is to fix the easy ones first :-) Cheers, Carlos. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Futex error handling 2014-10-23 19:35 ` Carlos O'Donell @ 2014-10-23 20:47 ` Rich Felker 0 siblings, 0 replies; 35+ messages in thread From: Rich Felker @ 2014-10-23 20:47 UTC (permalink / raw) To: Carlos O'Donell; +Cc: Torvald Riegel, GLIBC Devel, Darren Hart On Thu, Oct 23, 2014 at 03:34:46PM -0400, Carlos O'Donell wrote: > On 10/20/2014 12:14 PM, Torvald Riegel wrote: > > On Tue, 2014-09-16 at 17:36 +0200, Torvald Riegel wrote: > >> We got complains from the kernel side that glibc wouldn't react properly > >> to futex errors being returned. Thus, I'm looking at what we'd need to > >> actually improve. I'm using this here as a documentation for futex > >> error codes: https://lkml.org/lkml/2014/5/15/356 > >> > >> Generally, we have three categories of faults (ie, the cause for an > >> error/failure): > >> * Bug in glibc ("BL") > >> * Bug in the client program ("BP") > >> * Failures that are neither a bug in glibc nor the program ("F") > >> > >> Also, there are cases where it's not a "real" failure, but just > >> something that is expected behavior that needs to be handled ("NF"). > >> > >> I'm not aware of a general policy about whether glibc should abort or > >> assert (ie, abort only with assertion checks enabled) when the fault is > >> in the BL or BP categories. I'd say we don't, because there's no way to > >> handle it anyway, and other things will likely go wrong; but I don't > >> have a strong opinion. Thoughts? > >> > >> For every futex op, here's a list of how I'd categorize the possible > >> error codes (I'm ignoring ENOSYS, which is NF when feature testing (or > >> BL)): > >> > >> FUTEX_WAIT: > >> * EFAULT is either BL or BP. Nothing we can do. Should have failed > >> earlier when we accessed the futex variable. > > Abort. Yes. > >> * EINVAL (alignment and timeout normalization) is BL/BP. > > Abort. For alignment, yes. For timeout normalization, no, this is a specified error that must be passed back to the application. Of course if glibc has already checked it in userspace, then further EINVAL from the kernel should not be possible and aborting is in principle valid. > >> * EWOULDBLOCK, ETIMEDOUT are NF. > > Don't know. > > WARNING: Aborting on EWOULDBLOCK is dangerous since we might just > wish to deadlock the single thread, so a developer can attach with > the debugger to inspect thread state while other threads keep running. NF is never a case you would abort on. It means a condition that occurs regularly in a valid program. > >> FUTEX_WAKE, FUTEX_WAKE_OP: > >> * EFAULT can be BL/BP *or* NF, so we *must not* abort or assert in this > >> case. This is due to how futexes work when combined with certain rules > >> for destruction of the underlying synchronization data structure; see my > >> description of the mutex destruction issue (but this can happen with > >> other data structures such as semaphores or cond vars too): > >> https://sourceware.org/ml/libc-alpha/2014-04/msg00075.html > > Don't know. This is complex, but unless glibc is going to take the effort (and likely, cost) to prevent this mostly-harmless race, this case for FUTEX_WAKE must be considered as NF. For FUTEX_WAKE_OP, EFAULT is BL/BP if the fault is on the addr OP is applied to. > >> * EINVAL (futex alignment) is BL/BP. > > Abort. Yes. > >> * EINVAL (inconsistent state or hit a PI futex) can be either BL/BP *or* > >> NF. The latter is caused by the mutex destruction issue, only that a > >> pending FUTEX_WAKE after destruction doesn't hit an inaccessible memory > >> location but one which has been reused for a PI futex. Thus, we must > >> not abort or assert in this case. > > Don't know. I don't know much of anything about PI implementation details, so same here -- don't know. > >> FUTEX_REQUEUE: > >> * Like FUTEX_WAKE, except that it's not safe to use concurrently with > >> possible destruction / reuse of the futex memory (because requeueing to > >> a futex that's unrelated to the new futex located in reused memory is > >> bad). > > Are there any error codes for FUTEX_REQUEUE or are you saying it's > identical to FUTEX_WAKE? I think some of the NF cases with FUTEX_WAKE are BL/BP for FUTEX_REQUEUE. > >> FUTEX_REQUEUE_CMP: > >> * Like FUTEX_REQUEUE. EAGAIN is NF. > > Is EGAIN the only one to consider? > > Yes, EAGAIN needs some kind of special handling, but I don't know what. It's the only additional case beyond FUTEX_REQUEUE. > >> FUTEX_WAKE_OP: > >> * Haven't looked at this yet. Only used in condvars, and might not be > >> necessary for a condvar that's not based on a condvar-internal lock. > >> > >> FUTEX_WAIT_BITSET / FUTEX_WAKE_BITSET: > >> * Like FUTEX_WAIT / FUTEX_WAKE. The additional EINVAL is BL. > > Abort. > > >> FUTEX_LOCK_PI, FUTEX_TRYLOCK_PI: > >> * EFAULT is BL/BP. > > Abort. I think these are right. > >> * ENOMEM is F. We need to handle this. > > Yes, pass it along. Passing it along is probably not desirable; applications are not going to be prepared to deal with ENOMEM from pthread_mutex_lock. But I'm not sure what would be preferable. > >> * EINVAL, EPERM, ESRCH are BL/BP. > > Abort. Are you sure ESRCH can't happen as a race with robust mutex owner death? > >> * EAGAIN and ETIMEDOUT are NF. > > Don't know. NF is NF. > >> * EDEADLOCK is BP (or BL). > > See WARNING above. I don't see how it's related. > >> * EOWNERDIED is F. > > Don't know. I would think it's NF..? > >> FUTEX_UNLOCK_PI: > >> (* I guess this can return EFAULT too, which is BL/BP.) > >> * EINVAL and EPERM are BL/BP. I don't think there's a mutex destruction > >> issue with PI locks because the kernel takes care of both resetting the > >> value of the futex var and waking up threads; it should do so in a way > >> that won't access reused memory. I guess we should check that though... > > Abort. I think this is correct but it should be checked. > >> FUTEX_WAIT_REQUEUE_PI: > >> * EFAULT and EINVAL are BL/BP. > > Abort. > > >> * EWOULDBLOCK and ETIMEDOUT are NF. > > Don't know. I think these look right. > >> * EOWNERDIED is F. > > Don't know. How did we get here? Normal robust mutex usage. > >> FUTEX_CMP_REQUEUE_PI is like FUTEX_CMP_REQUEUE except: > >> * ENOMEM is F. > > Pass it along. How would this be passed along? Requeue is used internally in cond vars. If requeue fails, it needs to be replaced with a simple wake where the target thread will be responsible for attempting to wait on the mutex (and possibly failing there, in which case it takes responsibility for what to do). > >> * EPERM and ESRCH are BL/BP. > > Abort. Not sure. Probably right. > >> * EDEADLOCK is BP (or BL). > > See WARNING above. Again, unclear on this. > >> I think the next steps to improve this should be: > >> 1) Getting consensus on how we want to handle BL and BP in general. > > > > Ping. I think that's a fairly generic issue (in the sense of likely > > being a question for other things than just mutexes too), so I'd like > > some input on the direction. > > > > Carlos, Joseph, Roland: Do you have any comments? > > Roland already responded, but I'll reiterate with wiki updates: > > (1) Bugs in the GNU C library should fail early and visibly, balanced by > runtime detection cost. > > (2) Bugs in the user program should fail early and visibly, balanced by > runtime detection cost in the GNU C library. > > (3) Unknown error codes from the kernel should cause immediate failure. > > https://sourceware.org/glibc/wiki/Style_and_Conventions#Error_Handling I agree with these principles. > >> 2) Applying the outcome of that to the list above and getting consensus > >> on the result. > >> 3) For each case of F, find the best way to report it to the caller > >> (e.g., error code from the pthreads function, abort, ...). > >> 4) Change each use of the futexes accordingly, one at a time. > >> > >> I've asked Michael Kerrisk for the state of the futex error docs, but > >> haven't gotten a reply yet. (Last time I checked, the new input from > >> the email I referred to above wasn't part of the futex docs yet.) > > > > Michael has replied, and it's on his list of things to do. > > Hopefully that helps, maybe it doesn't, I answered the easy ones, and > perhaps the first step is to fix the easy ones first :-) :) Rich ^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2014-10-30 17:09 UTC | newest] Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-09-16 15:36 Futex error handling Torvald Riegel 2014-09-16 16:56 ` Rich Felker 2014-09-16 18:13 ` Torvald Riegel 2014-09-16 18:55 ` Rich Felker 2014-09-18 12:27 ` Torvald Riegel 2014-09-18 17:40 ` Rich Felker 2014-09-18 17:58 ` Torvald Riegel 2014-09-18 20:41 ` Rich Felker 2014-10-30 1:55 ` Darren Hart 2014-10-30 6:51 ` Carlos O'Donell 2014-09-17 19:41 ` Roland McGrath 2014-09-17 19:46 ` Torvald Riegel 2014-09-17 19:59 ` Roland McGrath 2014-09-17 23:17 ` Rich Felker 2014-10-23 19:16 ` Add futex wrapper to glibc? Carlos O'Donell 2014-10-23 21:07 ` Joseph S. Myers 2014-10-24 2:00 ` Carlos O'Donell 2014-10-26 7:19 ` Mike Frysinger 2014-10-27 17:52 ` Joseph S. Myers 2014-10-27 20:28 ` Roland McGrath 2014-10-30 2:07 ` Darren Hart 2014-10-30 9:32 ` Torvald Riegel 2014-10-30 2:03 ` Darren Hart 2014-10-23 21:12 ` Rich Felker 2014-10-30 1:59 ` Darren Hart 2014-10-30 2:44 ` Rich Felker 2014-10-30 2:55 ` Carlos O'Donell 2014-10-30 3:26 ` Rich Felker 2014-10-30 17:09 ` Joseph S. Myers 2014-10-20 16:15 ` Futex error handling Torvald Riegel 2014-10-21 22:34 ` Roland McGrath 2014-10-23 2:37 ` Carlos O'Donell 2014-10-23 17:48 ` Roland McGrath 2014-10-23 19:35 ` Carlos O'Donell 2014-10-23 20:47 ` Rich Felker
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).