* [PATCH] Unify pthread_once (bug 15215) @ 2013-05-08 14:44 Torvald Riegel 2013-05-08 17:51 ` Rich Felker 2013-05-23 4:15 ` Carlos O'Donell 0 siblings, 2 replies; 21+ messages in thread From: Torvald Riegel @ 2013-05-08 14:44 UTC (permalink / raw) To: GLIBC Devel, libc-ports [-- Attachment #1: Type: text/plain, Size: 2308 bytes --] See http://sourceware.org/bugzilla/show_bug.cgi?id=15215 for background. I1 and I2 follow essentially the same algorithm, and we can replace it with a unified variant, as the bug suggests. See the attached patch for a modified version of the sparc instance. The differences between both are either cosmetic, or are unnecessary changes (ie, how the init-finished state is set (atomic_inc vs. store), or how the fork generations are compared). Both I1 and I2 were missing a release memory order (MO) when marking once_control as finished initialization. If the particular arch doesn't need a HW barrier for release, we at least need a compiler barrier; if it's needed, the original I1 and I2 are not guaranteed to work. Both I1 and I2 were missing acquire MO on the very first load of once_control. This needs to synchronize with the release MO on setting the state to init-finished, so without it it's not guaranteed to work either. Note that this will make a call to pthread_once that doesn't need to actually run the init routine slightly slower due to the additional acquire barrier. If you're really concerned about this overhead, speak up. There are ways to avoid it, but it comes with additional complexity and bookkeeping. I'm currently also using the existing atomic_{read/write}_barrier functions instead of not-yet-existing load_acq or store_rel functions. I'm not sure whether the latter can have somewhat more efficient implementations on Power and ARM; if so, and if you're concerned about the overhead, we can add load_acq and store_rel to atomic.h and start using it. This would be in line with C11, where we should eventually be heading to anyways, IMO. Both I1 and I2 have an ABA issue on __fork_generation, as explained in the comments that the patch adds. How do you all feel about this? I can't present a simple fix right now, but I believe it could be fixed with additional bookkeeping. If there's no objection to the essence of this patch, I'll post another patch that actually replaces I1 and I2 with the modified variant in the attached patch. Cleaning up the magic numbers, perhaps fixing the ABA issue, and comparing to the custom asm versions would be next. I had a brief look at the latter, and at least x86 doesn't seem to do anything logically different. Torvald [-- Attachment #2: patch --] [-- Type: text/x-patch, Size: 4328 bytes --] diff --git a/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c b/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c index 5879f44..f9b0953 100644 --- a/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c +++ b/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c @@ -28,11 +28,31 @@ clear_once_control (void *arg) { pthread_once_t *once_control = (pthread_once_t *) arg; + /* Reset to the uninitialized state here (see __pthread_once). Also, we + don't need a stronger memory order because we do not need to make any + other of our writes visible to other threads that see this value. */ *once_control = 0; lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); } +/* This is similar to a lock implementation, but we distinguish between three + states: not yet initialized (0), initialization finished (2), and + initialization in progress (__fork_generation | 1). If in the first state, + threads will try to run the initialization by moving to the second state; + the first thread to do so via a CAS on once_control runs init_routine, + other threads block. + When forking the process, some threads can be interrupted during the second + state; they won't be present in the forked child, so we need to restart + initialization in the child. To distinguish an in-progress initialization + from an interrupted initialization (in which case we need to reclaim the + lock), we look at the fork generation that's part of the second state: We + can reclaim iff it differs from the current fork generation. + XXX: This algorithm has an ABA issue on the fork generation: If an + initialization is interrupted, we then fork 2^30 times (30b of once_control + are used for the fork generation), and try to initialize again, we can + deadlock because we can't distinguish the in-progress and interrupted cases + anymore. */ int __pthread_once (once_control, init_routine) pthread_once_t *once_control; @@ -42,15 +62,26 @@ __pthread_once (once_control, init_routine) { int oldval, val, newval; + /* We need acquire memory order for this load because if the value + signals that initialization has finished, we need to be see any + data modifications done during initialization. */ val = *once_control; + atomic_read_barrier(); do { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) + /* Check if the initialization has already been done. */ + if (__builtin_expect ((val & 2) != 0, 1)) return 0; oldval = val; - newval = (oldval & 3) | __fork_generation | 1; + /* We try to set the state to in-progress and having the current + fork generation. We don't need atomic accesses for the fork + generation because it's immutable in a particular process, and + forked child processes start with a single thread that modified + the generation. */ + newval = __fork_generation | 1; + /* We need acquire memory order here for the same reason as for the + load from once_control above. */ val = atomic_compare_and_exchange_val_acq (once_control, newval, oldval); } @@ -59,9 +90,10 @@ __pthread_once (once_control, init_routine) /* Check if another thread already runs the initializer. */ if ((oldval & 1) != 0) { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) + /* Check whether the initializer execution was interrupted by a + fork. (We know that for both values, bit 0 is set and bit 1 is + not.) */ + if (oldval == newval) { /* Same generation, some other thread was faster. Wait. */ lll_futex_wait (once_control, newval, LLL_PRIVATE); @@ -79,8 +111,11 @@ __pthread_once (once_control, init_routine) pthread_cleanup_pop (0); - /* Add one to *once_control. */ - atomic_increment (once_control); + /* Mark *once_control as having finished the initialization. We need + release memory order here because we need to synchronize with other + threads that want to use the initialized data. */ + atomic_write_barrier(); + *once_control = 2; /* Wake up all other threads. */ lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-05-08 14:44 [PATCH] Unify pthread_once (bug 15215) Torvald Riegel @ 2013-05-08 17:51 ` Rich Felker 2013-05-08 20:47 ` Torvald Riegel 2013-05-23 4:15 ` Carlos O'Donell 1 sibling, 1 reply; 21+ messages in thread From: Rich Felker @ 2013-05-08 17:51 UTC (permalink / raw) To: Torvald Riegel; +Cc: GLIBC Devel, libc-ports On Wed, May 08, 2013 at 04:43:57PM +0200, Torvald Riegel wrote: > Note that this will make a call to pthread_once that doesn't need to > actually run the init routine slightly slower due to the additional > acquire barrier. If you're really concerned about this overhead, speak > up. There are ways to avoid it, but it comes with additional complexity > and bookkeeping. On the one hand, I think it should be avoided if at all possible. pthread_once is the correct, canonical way to do initialization (as opposed to hacks like library init functions or global ctors), and the main doubt lots of people have about doing it the correct way is that they're going to kill performance if they call pthread_once from every point where initialization needs to have been completed. If every call imposes memory synchronization, performance might become a real issue discouraging people from following best practices for library initialization. On the other hand, I don't think it's conforming to elide the barrier. POSIX states (XSH 4.11 Memory Synchronization): "The pthread_once() function shall synchronize memory for the first call in each thread for a given pthread_once_t object." Since it's impossible to track whether a call is the first call in a given thread, this means every call to pthread_once() is required to be a full memory barrier. I suspect this is unintended, and we should perhaps file a bug report with the Austin Group and see if the requirement can be relaxed. Rich ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-05-08 17:51 ` Rich Felker @ 2013-05-08 20:47 ` Torvald Riegel 2013-05-08 21:25 ` Rich Felker 0 siblings, 1 reply; 21+ messages in thread From: Torvald Riegel @ 2013-05-08 20:47 UTC (permalink / raw) To: Rich Felker; +Cc: GLIBC Devel, libc-ports On Wed, 2013-05-08 at 13:51 -0400, Rich Felker wrote: > On Wed, May 08, 2013 at 04:43:57PM +0200, Torvald Riegel wrote: > > Note that this will make a call to pthread_once that doesn't need to > > actually run the init routine slightly slower due to the additional > > acquire barrier. If you're really concerned about this overhead, speak > > up. There are ways to avoid it, but it comes with additional complexity > > and bookkeeping. > > On the one hand, I think it should be avoided if at all possible. > pthread_once is the correct, canonical way to do initialization (as > opposed to hacks like library init functions or global ctors), and the > main doubt lots of people have about doing it the correct way is that > they're going to kill performance if they call pthread_once from every > point where initialization needs to have been completed. If every call > imposes memory synchronization, performance might become a real issue > discouraging people from following best practices for library > initialization. Well, what we precisely need is that the initialization happens-before (ie, the relation from the, say, C11 memory model) every call that does not in fact initialize. If initialization happened on another thread, you need to synchronize. But from there on, you are essentially free to establish this in any way you want. And there are ways, because happens-before is more-or-less transitive. > On the other hand, I don't think it's conforming to elide the barrier. > POSIX states (XSH 4.11 Memory Synchronization): > > "The pthread_once() function shall synchronize memory for the first > call in each thread for a given pthread_once_t object." No, it's not. You could see just parts of the effects of the initialization; potentially reading garbage can't be the intended semantics :) > Since it's impossible to track whether a call is the first call in a > given thread Are you sure about this? :) > this means every call to pthread_once() is required to > be a full memory barrier. Note that we do not need a full memory barrier, just an acquire memory barrier. So this only matters on architectures with memory models that give weaker per-default ordering guarantees. For example, this doesn't add any hardware barrier instructions on x86 or Sparc TSO. But for Power and ARM it does. > I suspect this is unintended, and we should > perhaps file a bug report with the Austin Group and see if the > requirement can be relaxed. I don't think that other semantics are intended. If you return from pthread_once(), initialization should have happened before that. If it doesn't, you don't really know whether initialization happened once, so programs would be forced to do their own synchronization. Torvald ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-05-08 20:47 ` Torvald Riegel @ 2013-05-08 21:25 ` Rich Felker 2013-05-09 8:39 ` Torvald Riegel 0 siblings, 1 reply; 21+ messages in thread From: Rich Felker @ 2013-05-08 21:25 UTC (permalink / raw) To: Torvald Riegel; +Cc: GLIBC Devel, libc-ports On Wed, May 08, 2013 at 10:47:26PM +0200, Torvald Riegel wrote: > On Wed, 2013-05-08 at 13:51 -0400, Rich Felker wrote: > > On Wed, May 08, 2013 at 04:43:57PM +0200, Torvald Riegel wrote: > > > Note that this will make a call to pthread_once that doesn't need to > > > actually run the init routine slightly slower due to the additional > > > acquire barrier. If you're really concerned about this overhead, speak > > > up. There are ways to avoid it, but it comes with additional complexity > > > and bookkeeping. > > > > On the one hand, I think it should be avoided if at all possible. > > pthread_once is the correct, canonical way to do initialization (as > > opposed to hacks like library init functions or global ctors), and the > > main doubt lots of people have about doing it the correct way is that > > they're going to kill performance if they call pthread_once from every > > point where initialization needs to have been completed. If every call > > imposes memory synchronization, performance might become a real issue > > discouraging people from following best practices for library > > initialization. > > Well, what we precisely need is that the initialization happens-before > (ie, the relation from the, say, C11 memory model) every call that does > not in fact initialize. If initialization happened on another thread, > you need to synchronize. But from there on, you are essentially free to > establish this in any way you want. And there are ways, because > happens-before is more-or-less transitive. > > > On the other hand, I don't think it's conforming to elide the barrier. > > POSIX states (XSH 4.11 Memory Synchronization): > > > > "The pthread_once() function shall synchronize memory for the first > > call in each thread for a given pthread_once_t object." > > No, it's not. You could see just parts of the effects of the > initialization; potentially reading garbage can't be the intended > semantics :) The work of synchronizing memory should take place at the end of the pthread_once call that actually does the initialization, rather than in the other threads which synchronize. This is the way the x86 memory model naturally works, but perhaps it's prohibitive to achieve on other architectures. However, the idea is that pthread_once only runs init routines a small finite number of times, so even if you had to so some horrible hack that makes the synchronization on return 1000x slower (e.g. a syscall), it would still be better than incurring the cost of a full acquire barrier in each subsequent call, which ideally should have the same cost as a call to an empty function. > > Since it's impossible to track whether a call is the first call in a > > given thread > > Are you sure about this? :) It's impossible with bounded memory requirements, and thus impossible in general (allocating memory for the tracking might fail). > > this means every call to pthread_once() is required to > > be a full memory barrier. > > Note that we do not need a full memory barrier, just an acquire memory > barrier. So this only matters on architectures with memory models that > give weaker per-default ordering guarantees. For example, this doesn't > add any hardware barrier instructions on x86 or Sparc TSO. But for > Power and ARM it does. Yes, I see that. > > I suspect this is unintended, and we should > > perhaps file a bug report with the Austin Group and see if the > > requirement can be relaxed. > > I don't think that other semantics are intended. If you return from > pthread_once(), initialization should have happened before that. If it > doesn't, you don't really know whether initialization happened once, so > programs would be forced to do their own synchronization. I think my confusion is merely that POSIX does not define the phrase "synchronize memory", and in the absence of a definition, "full memory barrier" (both release and acquire semantics) is the only reasonable interpretation I can find. In other words, it seems like a pathological conforming program could attempt to use the language in the specification to use pthread_once as a release barrier. I'm not sure if there are ways this could be meaningfully arranged (i.e. with well-defined ordering; off-hand, I would think tricks with cancelling an in-progress invocation of pthread_once might make it possible. By the way, cancellation probably makes the above POSIX text incorrect anyway; a thread could call pthread_once on the same pthread_once_t object more than once, with the second call not being a no-op, if the initialization routine for the first call is cancelled and the second call takes place from a cancellation cleanup handler. Rich ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-05-08 21:25 ` Rich Felker @ 2013-05-09 8:39 ` Torvald Riegel 2013-05-09 14:02 ` Rich Felker 0 siblings, 1 reply; 21+ messages in thread From: Torvald Riegel @ 2013-05-09 8:39 UTC (permalink / raw) To: Rich Felker; +Cc: GLIBC Devel, libc-ports On Wed, 2013-05-08 at 17:25 -0400, Rich Felker wrote: > On Wed, May 08, 2013 at 10:47:26PM +0200, Torvald Riegel wrote: > > On Wed, 2013-05-08 at 13:51 -0400, Rich Felker wrote: > > > On Wed, May 08, 2013 at 04:43:57PM +0200, Torvald Riegel wrote: > > > > Note that this will make a call to pthread_once that doesn't need to > > > > actually run the init routine slightly slower due to the additional > > > > acquire barrier. If you're really concerned about this overhead, speak > > > > up. There are ways to avoid it, but it comes with additional complexity > > > > and bookkeeping. > > > > > > On the one hand, I think it should be avoided if at all possible. > > > pthread_once is the correct, canonical way to do initialization (as > > > opposed to hacks like library init functions or global ctors), and the > > > main doubt lots of people have about doing it the correct way is that > > > they're going to kill performance if they call pthread_once from every > > > point where initialization needs to have been completed. If every call > > > imposes memory synchronization, performance might become a real issue > > > discouraging people from following best practices for library > > > initialization. > > > > Well, what we precisely need is that the initialization happens-before > > (ie, the relation from the, say, C11 memory model) every call that does > > not in fact initialize. If initialization happened on another thread, > > you need to synchronize. But from there on, you are essentially free to > > establish this in any way you want. And there are ways, because > > happens-before is more-or-less transitive. > > > > > On the other hand, I don't think it's conforming to elide the barrier. > > > POSIX states (XSH 4.11 Memory Synchronization): > > > > > > "The pthread_once() function shall synchronize memory for the first > > > call in each thread for a given pthread_once_t object." > > > > No, it's not. You could see just parts of the effects of the > > initialization; potentially reading garbage can't be the intended > > semantics :) > > The work of synchronizing memory should take place at the end of the > pthread_once call that actually does the initialization, rather than > in the other threads which synchronize. This isn't how the (hardware) memory models work. And it makes sense; if one CPU could prevent reordering in other CPUs (which would be required for what you have in mind), this would be an unconditional big hammer. Instead, CPUs can opt in by issuing barriers when needed, which then prevent reordering wrt. what happens globally to memory. > This is the way the x86 memory > model naturally works, but perhaps it's prohibitive to achieve on > other architectures. The x86 memory model is just stronger than others, so certain reorderings don't appear or aren't visible to programs. IOW, you don't need to do certain things explicitly for the hardware. You still do need the appropriate compiler barriers though; for example, if the compiler reorders the once_control release store to before the initialization stores, you still have an incorrectly synchronized program, even on x86. More background on this can be found in the C11 and C++11 memory models, in the Batty et al. paper formalizing C++11's. This list of mappings from these language-level models to HW could also be interesting (note that it doesn't cover the compiler side of this explicitly): http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html > However, the idea is that pthread_once only runs > init routines a small finite number of times, so even if you had to so > some horrible hack that makes the synchronization on return 1000x > slower (e.g. a syscall), it would still be better than incurring the > cost of a full acquire barrier in each subsequent call, which ideally > should have the same cost as a call to an empty function. That would be true if non-first calls appear 1000*(syscall_overhead/acquire_mbar_overhead) times. But do they? I think the way forward here is to: 1) Fix the implementation (ie, add the mbars). 2) Let the arch maintainers of the affected archs with weak memory moels (or people interested in this) look at this and come up with some measurements for how much overhead the mbars actually present in real code. 3) Decide whether this overhead justifies adding optimizations. This patch is step 1. I don't think we need to merge this step 3. > > > Since it's impossible to track whether a call is the first call in a > > > given thread > > > > Are you sure about this? :) > > It's impossible with bounded memory requirements, and thus impossible > in general (allocating memory for the tracking might fail). I believe you think about needing to track more than you actually need to know. All you need is knowing whether a thread established a happens-before with whoever initialized the once_control in the past. So you do need per-thread state, and per-once_control state, but not necessarily more. If in doubt, you can still do the acquire barrier. > > > this means every call to pthread_once() is required to > > > be a full memory barrier. > > > > Note that we do not need a full memory barrier, just an acquire memory > > barrier. So this only matters on architectures with memory models that > > give weaker per-default ordering guarantees. For example, this doesn't > > add any hardware barrier instructions on x86 or Sparc TSO. But for > > Power and ARM it does. > > Yes, I see that. > > > > I suspect this is unintended, and we should > > > perhaps file a bug report with the Austin Group and see if the > > > requirement can be relaxed. > > > > I don't think that other semantics are intended. If you return from > > pthread_once(), initialization should have happened before that. If it > > doesn't, you don't really know whether initialization happened once, so > > programs would be forced to do their own synchronization. > > I think my confusion is merely that POSIX does not define the phrase > "synchronize memory", and in the absence of a definition, "full memory > barrier" (both release and acquire semantics) is the only reasonable > interpretation I can find. In other words, it seems like a > pathological conforming program could attempt to use the language in > the specification to use pthread_once as a release barrier. I'm not > sure if there are ways this could be meaningfully arranged (i.e. with > well-defined ordering; off-hand, I would think tricks with cancelling > an in-progress invocation of pthread_once might make it possible. I agree that the absence of a proper memory model makes reasoning about some of this hard. I guess it would be best if POSIX would just endorse C11's memory model, and specify the intended semantics in relation to this model where needed. For example, the C11 variant of pthread_once has the following requirement: "Completion of an effective call to the call_once function synchronizes with all subsequent calls to the call_once function with the same value of flag." This makes intuitive sense, and is what's enforced by the patch I sent. ("synchronizes with" is a well-defined relationship in the model, and contributes to happens-before.) Torvald ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-05-09 8:39 ` Torvald Riegel @ 2013-05-09 14:02 ` Rich Felker 2013-05-09 15:14 ` Torvald Riegel 0 siblings, 1 reply; 21+ messages in thread From: Rich Felker @ 2013-05-09 14:02 UTC (permalink / raw) To: Torvald Riegel; +Cc: GLIBC Devel, libc-ports On Thu, May 09, 2013 at 10:39:25AM +0200, Torvald Riegel wrote: > > However, the idea is that pthread_once only runs > > init routines a small finite number of times, so even if you had to so > > some horrible hack that makes the synchronization on return 1000x > > slower (e.g. a syscall), it would still be better than incurring the > > cost of a full acquire barrier in each subsequent call, which ideally > > should have the same cost as a call to an empty function. > > That would be true if non-first calls appear > 1000*(syscall_overhead/acquire_mbar_overhead) times. But do they? In theory they might. Imagine a math function that might be called millions or billions of times, but which depends on a precomputed table. Personally, my view of best-practices is that you should use 'static const' for such tables, even if they're huge, rather than runtime generation, but unfortunately I think my view is still a minority one... Also, keep in mind that even large overhead on the first call to pthread_once is likely to be small in comparison to the time spent in the initialization function, while even small overhead is huge in comparison to a call to pthread_once that doesn't call the initialization function. > I think the way forward here is to: > 1) Fix the implementation (ie, add the mbars). > 2) Let the arch maintainers of the affected archs with weak memory moels > (or people interested in this) look at this and come up with some > measurements for how much overhead the mbars actually present in real > code. > 3) Decide whether this overhead justifies adding optimizations. > > This patch is step 1. I don't think we need to merge this step 3. I think this is a reasonable approach. > > > > Since it's impossible to track whether a call is the first call in a > > > > given thread > > > > > > Are you sure about this? :) > > > > It's impossible with bounded memory requirements, and thus impossible > > in general (allocating memory for the tracking might fail). > > I believe you think about needing to track more than you actually need > to know. All you need is knowing whether a thread established a > happens-before with whoever initialized the once_control in the past. > So you do need per-thread state, and per-once_control state, but not > necessarily more. If in doubt, you can still do the acquire barrier. The number of threads and the number of once controls are both unbounded. You might could solve the problem with serial numbers if there were room to store a sufficiently large one in the once control, but the once control is 32-bit and the serial numbers could (in a pathological but valid application) easily overflow 32 bits. > > I think my confusion is merely that POSIX does not define the phrase > > "synchronize memory", and in the absence of a definition, "full memory > > barrier" (both release and acquire semantics) is the only reasonable > > interpretation I can find. In other words, it seems like a > > pathological conforming program could attempt to use the language in > > the specification to use pthread_once as a release barrier. I'm not > > sure if there are ways this could be meaningfully arranged (i.e. with > > well-defined ordering; off-hand, I would think tricks with cancelling > > an in-progress invocation of pthread_once might make it possible. > > I agree that the absence of a proper memory model makes reasoning about > some of this hard. I guess it would be best if POSIX would just endorse > C11's memory model, and specify the intended semantics in relation to > this model where needed. Agreed, and I suspect this is what they'll do. I can raise the issue, but perhaps you'd be better at expressing it. Let me know if you'd rather I do it. Rich ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-05-09 14:02 ` Rich Felker @ 2013-05-09 15:14 ` Torvald Riegel 2013-05-09 15:56 ` Rich Felker 0 siblings, 1 reply; 21+ messages in thread From: Torvald Riegel @ 2013-05-09 15:14 UTC (permalink / raw) To: Rich Felker; +Cc: GLIBC Devel, libc-ports On Thu, 2013-05-09 at 10:02 -0400, Rich Felker wrote: > On Thu, May 09, 2013 at 10:39:25AM +0200, Torvald Riegel wrote: > > > However, the idea is that pthread_once only runs > > > init routines a small finite number of times, so even if you had to so > > > some horrible hack that makes the synchronization on return 1000x > > > slower (e.g. a syscall), it would still be better than incurring the > > > cost of a full acquire barrier in each subsequent call, which ideally > > > should have the same cost as a call to an empty function. > > > > That would be true if non-first calls appear > > 1000*(syscall_overhead/acquire_mbar_overhead) times. But do they? > > In theory they might. Imagine a math function that might be called > millions or billions of times, but which depends on a precomputed > table. Personally, my view of best-practices is that you should use > 'static const' for such tables, even if they're huge, rather than > runtime generation, but unfortunately I think my view is still a > minority one... > > Also, keep in mind that even large overhead on the first call to > pthread_once is likely to be small in comparison to the time spent in > the initialization function, while even small overhead is huge in > comparison to a call to pthread_once that doesn't call the > initialization function. > > > I think the way forward here is to: > > 1) Fix the implementation (ie, add the mbars). > > 2) Let the arch maintainers of the affected archs with weak memory moels > > (or people interested in this) look at this and come up with some > > measurements for how much overhead the mbars actually present in real > > code. > > 3) Decide whether this overhead justifies adding optimizations. > > > > This patch is step 1. I don't think we need to merge this step 3. > > I think this is a reasonable approach. > > > > > > Since it's impossible to track whether a call is the first call in a > > > > > given thread > > > > > > > > Are you sure about this? :) > > > > > > It's impossible with bounded memory requirements, and thus impossible > > > in general (allocating memory for the tracking might fail). > > > > I believe you think about needing to track more than you actually need > > to know. All you need is knowing whether a thread established a > > happens-before with whoever initialized the once_control in the past. > > So you do need per-thread state, and per-once_control state, but not > > necessarily more. If in doubt, you can still do the acquire barrier. > > The number of threads and the number of once controls are both > unbounded. They are bounded by the available memory :) So if you can do with a fixed amount of data in both thread state and once_control state, you should be fine. > You might could solve the problem with serial numbers if > there were room to store a sufficiently large one in the once control, > but the once control is 32-bit and the serial numbers could (in a > pathological but valid application) easily overflow 32 bits. The overflow can be an issue, but in that case I guess you can still try to detect an overflow globally using global state, and just do the acquire barrier in this case. Informally, one can try to trade off a comparison of state in once_control with a TLS variable; if that is significantly faster than an acquire barrier, it can be useful; if it's about the same, it doesn't make sense. > > > I think my confusion is merely that POSIX does not define the phrase > > > "synchronize memory", and in the absence of a definition, "full memory > > > barrier" (both release and acquire semantics) is the only reasonable > > > interpretation I can find. In other words, it seems like a > > > pathological conforming program could attempt to use the language in > > > the specification to use pthread_once as a release barrier. I'm not > > > sure if there are ways this could be meaningfully arranged (i.e. with > > > well-defined ordering; off-hand, I would think tricks with cancelling > > > an in-progress invocation of pthread_once might make it possible. > > > > I agree that the absence of a proper memory model makes reasoning about > > some of this hard. I guess it would be best if POSIX would just endorse > > C11's memory model, and specify the intended semantics in relation to > > this model where needed. > > Agreed, and I suspect this is what they'll do. I can raise the issue, > but perhaps you'd be better at expressing it. Let me know if you'd > rather I do it. I have no idea how the POSIX folks would feel about this. After all, it would create quite a dependency for POSIX. With that in mind, trying to resolve this isn't very high on my todo list. If people would think that this would be beneficial for how we can deal with POSIX requirements, or for our users to understand the POSIX requirements better, I can definitely try to follow up on this. If you want to go ahead and start discussing with them, please do so (please CC me on the tracker bug). Torvald ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-05-09 15:14 ` Torvald Riegel @ 2013-05-09 15:56 ` Rich Felker 2013-05-10 8:31 ` Torvald Riegel 0 siblings, 1 reply; 21+ messages in thread From: Rich Felker @ 2013-05-09 15:56 UTC (permalink / raw) To: Torvald Riegel; +Cc: GLIBC Devel, libc-ports On Thu, May 09, 2013 at 05:14:28PM +0200, Torvald Riegel wrote: > > > I agree that the absence of a proper memory model makes reasoning about > > > some of this hard. I guess it would be best if POSIX would just endorse > > > C11's memory model, and specify the intended semantics in relation to > > > this model where needed. > > > > Agreed, and I suspect this is what they'll do. I can raise the issue, > > but perhaps you'd be better at expressing it. Let me know if you'd > > rather I do it. > > I have no idea how the POSIX folks would feel about this. After all, it > would create quite a dependency for POSIX. With that in mind, trying to > resolve this isn't very high on my todo list. If people would think > that this would be beneficial for how we can deal with POSIX > requirements, or for our users to understand the POSIX requirements > better, I can definitely try to follow up on this. If you want to go > ahead and start discussing with them, please do so (please CC me on the > tracker bug). POSIX is aligned with ISO C, and since the current version of ISO C is now the 2011 version, Issue 8 should be aligned to the 2011 version of the C standard. I don't think the issue is whether it happens, but making sure that the relevant text gets updated so that there's no ambiguity as to whether it's compatible with the new C standard and not placing unwanted additional implementation constraints like it may be doing now. Rich ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-05-09 15:56 ` Rich Felker @ 2013-05-10 8:31 ` Torvald Riegel 2013-05-10 13:22 ` Rich Felker 0 siblings, 1 reply; 21+ messages in thread From: Torvald Riegel @ 2013-05-10 8:31 UTC (permalink / raw) To: Rich Felker; +Cc: GLIBC Devel, libc-ports On Thu, 2013-05-09 at 11:56 -0400, Rich Felker wrote: > On Thu, May 09, 2013 at 05:14:28PM +0200, Torvald Riegel wrote: > > > > I agree that the absence of a proper memory model makes reasoning about > > > > some of this hard. I guess it would be best if POSIX would just endorse > > > > C11's memory model, and specify the intended semantics in relation to > > > > this model where needed. > > > > > > Agreed, and I suspect this is what they'll do. I can raise the issue, > > > but perhaps you'd be better at expressing it. Let me know if you'd > > > rather I do it. > > > > I have no idea how the POSIX folks would feel about this. After all, it > > would create quite a dependency for POSIX. With that in mind, trying to > > resolve this isn't very high on my todo list. If people would think > > that this would be beneficial for how we can deal with POSIX > > requirements, or for our users to understand the POSIX requirements > > better, I can definitely try to follow up on this. If you want to go > > ahead and start discussing with them, please do so (please CC me on the > > tracker bug). > > POSIX is aligned with ISO C, and since the current version of ISO C is > now the 2011 version, Issue 8 should be aligned to the 2011 version of > the C standard. I don't think the issue is whether it happens, but > making sure that the relevant text gets updated so that there's no > ambiguity as to whether it's compatible with the new C standard and > not placing unwanted additional implementation constraints like it may > be doing now. So, if it is aligned, would POSIX be willing to base their definitions on the C11 memory model? Or would they want to keep their sometimes rather vague requirements and just make sure that there are no obvious inconsistencies or gaps? Torvald ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-05-10 8:31 ` Torvald Riegel @ 2013-05-10 13:22 ` Rich Felker 0 siblings, 0 replies; 21+ messages in thread From: Rich Felker @ 2013-05-10 13:22 UTC (permalink / raw) To: Torvald Riegel; +Cc: GLIBC Devel, libc-ports On Fri, May 10, 2013 at 10:30:57AM +0200, Torvald Riegel wrote: > On Thu, 2013-05-09 at 11:56 -0400, Rich Felker wrote: > > On Thu, May 09, 2013 at 05:14:28PM +0200, Torvald Riegel wrote: > > > > > I agree that the absence of a proper memory model makes reasoning about > > > > > some of this hard. I guess it would be best if POSIX would just endorse > > > > > C11's memory model, and specify the intended semantics in relation to > > > > > this model where needed. > > > > > > > > Agreed, and I suspect this is what they'll do. I can raise the issue, > > > > but perhaps you'd be better at expressing it. Let me know if you'd > > > > rather I do it. > > > > > > I have no idea how the POSIX folks would feel about this. After all, it > > > would create quite a dependency for POSIX. With that in mind, trying to > > > resolve this isn't very high on my todo list. If people would think > > > that this would be beneficial for how we can deal with POSIX > > > requirements, or for our users to understand the POSIX requirements > > > better, I can definitely try to follow up on this. If you want to go > > > ahead and start discussing with them, please do so (please CC me on the > > > tracker bug). > > > > POSIX is aligned with ISO C, and since the current version of ISO C is > > now the 2011 version, Issue 8 should be aligned to the 2011 version of > > the C standard. I don't think the issue is whether it happens, but > > making sure that the relevant text gets updated so that there's no > > ambiguity as to whether it's compatible with the new C standard and > > not placing unwanted additional implementation constraints like it may > > be doing now. > > So, if it is aligned, would POSIX be willing to base their definitions > on the C11 memory model? Or would they want to keep their sometimes > rather vague requirements and just make sure that there are no obvious > inconsistencies or gaps? My guess is that they would adopt the C11 model. Rich ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-05-08 14:44 [PATCH] Unify pthread_once (bug 15215) Torvald Riegel 2013-05-08 17:51 ` Rich Felker @ 2013-05-23 4:15 ` Carlos O'Donell 2013-08-26 12:50 ` Ondřej Bílka 2013-10-06 0:20 ` Torvald Riegel 1 sibling, 2 replies; 21+ messages in thread From: Carlos O'Donell @ 2013-05-23 4:15 UTC (permalink / raw) To: Torvald Riegel; +Cc: GLIBC Devel, libc-ports On 05/08/2013 10:43 AM, Torvald Riegel wrote: > See http://sourceware.org/bugzilla/show_bug.cgi?id=15215 for background. You've already hashed out the details of these changes with Rich and he has no objection with this first phase of the patch which is to unify the implementations. > I1 and I2 follow essentially the same algorithm, and we can replace it > with a unified variant, as the bug suggests. See the attached patch for > a modified version of the sparc instance. The differences between both > are either cosmetic, or are unnecessary changes (ie, how the > init-finished state is set (atomic_inc vs. store), or how the fork > generations are compared). > > Both I1 and I2 were missing a release memory order (MO) when marking > once_control as finished initialization. If the particular arch doesn't > need a HW barrier for release, we at least need a compiler barrier; if > it's needed, the original I1 and I2 are not guaranteed to work. > > Both I1 and I2 were missing acquire MO on the very first load of > once_control. This needs to synchronize with the release MO on setting > the state to init-finished, so without it it's not guaranteed to work > either. > Note that this will make a call to pthread_once that doesn't need to > actually run the init routine slightly slower due to the additional > acquire barrier. If you're really concerned about this overhead, speak > up. There are ways to avoid it, but it comes with additional complexity > and bookkeeping. We want correctness. This is a place where correctness is infinitely more important than speed. We should be correct first and then we should argue about how to make it fast. > I'm currently also using the existing atomic_{read/write}_barrier > functions instead of not-yet-existing load_acq or store_rel functions. > I'm not sure whether the latter can have somewhat more efficient > implementations on Power and ARM; if so, and if you're concerned about > the overhead, we can add load_acq and store_rel to atomic.h and start > using it. This would be in line with C11, where we should eventually be > heading to anyways, IMO. Agreed. > Both I1 and I2 have an ABA issue on __fork_generation, as explained in > the comments that the patch adds. How do you all feel about this? > I can't present a simple fix right now, but I believe it could be fixed > with additional bookkeeping. > > If there's no objection to the essence of this patch, I'll post another > patch that actually replaces I1 and I2 with the modified variant in the > attached patch. Please repost. > Cleaning up the magic numbers, perhaps fixing the ABA issue, and > comparing to the custom asm versions would be next. I had a brief look > at the latter, and at least x86 doesn't seem to do anything logically > different. Right, that can be another step. > diff --git a/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c b/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c > index 5879f44..f9b0953 100644 > --- a/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c > +++ b/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c > @@ -28,11 +28,31 @@ clear_once_control (void *arg) > { > pthread_once_t *once_control = (pthread_once_t *) arg; > > + /* Reset to the uninitialized state here (see __pthread_once). Also, we > + don't need a stronger memory order because we do not need to make any > + other of our writes visible to other threads that see this value. */ > *once_control = 0; > lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); > } > > > +/* This is similar to a lock implementation, but we distinguish between three > + states: not yet initialized (0), initialization finished (2), and > + initialization in progress (__fork_generation | 1). If in the first state, > + threads will try to run the initialization by moving to the second state; > + the first thread to do so via a CAS on once_control runs init_routine, > + other threads block. > + When forking the process, some threads can be interrupted during the second > + state; they won't be present in the forked child, so we need to restart > + initialization in the child. To distinguish an in-progress initialization > + from an interrupted initialization (in which case we need to reclaim the > + lock), we look at the fork generation that's part of the second state: We > + can reclaim iff it differs from the current fork generation. > + XXX: This algorithm has an ABA issue on the fork generation: If an > + initialization is interrupted, we then fork 2^30 times (30b of once_control > + are used for the fork generation), and try to initialize again, we can > + deadlock because we can't distinguish the in-progress and interrupted cases > + anymore. */ Good comment. Good note on the ABA issue, even if somewhat impractical today. > int > __pthread_once (once_control, init_routine) > pthread_once_t *once_control; > @@ -42,15 +62,26 @@ __pthread_once (once_control, init_routine) > { > int oldval, val, newval; > > + /* We need acquire memory order for this load because if the value > + signals that initialization has finished, we need to be see any > + data modifications done during initialization. */ > val = *once_control; > + atomic_read_barrier(); > do > { > - /* Check if the initialized has already been done. */ > - if ((val & 2) != 0) > + /* Check if the initialization has already been done. */ > + if (__builtin_expect ((val & 2) != 0, 1)) > return 0; > > oldval = val; > - newval = (oldval & 3) | __fork_generation | 1; > + /* We try to set the state to in-progress and having the current > + fork generation. We don't need atomic accesses for the fork > + generation because it's immutable in a particular process, and > + forked child processes start with a single thread that modified > + the generation. */ > + newval = __fork_generation | 1; > + /* We need acquire memory order here for the same reason as for the > + load from once_control above. */ > val = atomic_compare_and_exchange_val_acq (once_control, newval, > oldval); > } > @@ -59,9 +90,10 @@ __pthread_once (once_control, init_routine) > /* Check if another thread already runs the initializer. */ > if ((oldval & 1) != 0) > { > - /* Check whether the initializer execution was interrupted > - by a fork. */ > - if (((oldval ^ newval) & -4) == 0) > + /* Check whether the initializer execution was interrupted by a > + fork. (We know that for both values, bit 0 is set and bit 1 is > + not.) */ > + if (oldval == newval) > { > /* Same generation, some other thread was faster. Wait. */ > lll_futex_wait (once_control, newval, LLL_PRIVATE); > @@ -79,8 +111,11 @@ __pthread_once (once_control, init_routine) > pthread_cleanup_pop (0); > > > - /* Add one to *once_control. */ > - atomic_increment (once_control); > + /* Mark *once_control as having finished the initialization. We need > + release memory order here because we need to synchronize with other > + threads that want to use the initialized data. */ > + atomic_write_barrier(); > + *once_control = 2; > > /* Wake up all other threads. */ > lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); Cheers, Carlos. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-05-23 4:15 ` Carlos O'Donell @ 2013-08-26 12:50 ` Ondřej Bílka 2013-08-26 16:45 ` Rich Felker 2013-10-06 0:20 ` Torvald Riegel 1 sibling, 1 reply; 21+ messages in thread From: Ondřej Bílka @ 2013-08-26 12:50 UTC (permalink / raw) To: Carlos O'Donell; +Cc: Torvald Riegel, GLIBC Devel, libc-ports On Thu, May 23, 2013 at 12:15:32AM -0400, Carlos O'Donell wrote: > On 05/08/2013 10:43 AM, Torvald Riegel wrote: > > Note that this will make a call to pthread_once that doesn't need to > > actually run the init routine slightly slower due to the additional > > acquire barrier. If you're really concerned about this overhead, speak > > up. There are ways to avoid it, but it comes with additional complexity > > and bookkeeping. > > We want correctness. This is a place where correctness is infinitely > more important than speed. We should be correct first and then we > should argue about how to make it fast. > As pthread_once calls tend to be called once per thread performance is not an issue. > > If there's no objection to the essence of this patch, I'll post another > > patch that actually replaces I1 and I2 with the modified variant in the > > attached patch. > > Please repost. > We wait for new version. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-08-26 12:50 ` Ondřej Bílka @ 2013-08-26 16:45 ` Rich Felker 2013-08-26 18:41 ` Ondřej Bílka 0 siblings, 1 reply; 21+ messages in thread From: Rich Felker @ 2013-08-26 16:45 UTC (permalink / raw) To: Ondřej Bílka Cc: Carlos O'Donell, Torvald Riegel, GLIBC Devel, libc-ports On Mon, Aug 26, 2013 at 02:49:55PM +0200, OndÅej BÃlka wrote: > On Thu, May 23, 2013 at 12:15:32AM -0400, Carlos O'Donell wrote: > > On 05/08/2013 10:43 AM, Torvald Riegel wrote: > > > Note that this will make a call to pthread_once that doesn't need to > > > actually run the init routine slightly slower due to the additional > > > acquire barrier. If you're really concerned about this overhead, speak > > > up. There are ways to avoid it, but it comes with additional complexity > > > and bookkeeping. > > > > We want correctness. This is a place where correctness is infinitely > > more important than speed. We should be correct first and then we > > should argue about how to make it fast. > > > As pthread_once calls tend to be called once per thread performance is > not an issue. No, pthread_once _calls_ tend to be once per access to an interface that requires static data to have been initialized, so possibly very often. On the other hand, pthread_once only invokes the init function once per program instance. I don't see anything that would typically happen once per thread, although I suppose you could optimize out calls to pthread_once with tls: static __thread int once_done = 0; static pthread_once_t once; if (!once_done) { pthread_once(&once, init); once_done = 1; } This requires work at the application level, though, and whether it's a net advantage depends a lot on whether multiple threads are likely to be hammering pthread_once on the same once object, and whether the arch has expensive acquire barriers and inexpensive TLS access. Rich ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-08-26 16:45 ` Rich Felker @ 2013-08-26 18:41 ` Ondřej Bílka 2013-08-27 2:29 ` Rich Felker 0 siblings, 1 reply; 21+ messages in thread From: Ondřej Bílka @ 2013-08-26 18:41 UTC (permalink / raw) To: Rich Felker; +Cc: Carlos O'Donell, Torvald Riegel, GLIBC Devel, libc-ports On Mon, Aug 26, 2013 at 12:45:07PM -0400, Rich Felker wrote: > On Mon, Aug 26, 2013 at 02:49:55PM +0200, OndÅej BÃlka wrote: > > On Thu, May 23, 2013 at 12:15:32AM -0400, Carlos O'Donell wrote: > > > On 05/08/2013 10:43 AM, Torvald Riegel wrote: > > > > Note that this will make a call to pthread_once that doesn't need to > > > > actually run the init routine slightly slower due to the additional > > > > acquire barrier. If you're really concerned about this overhead, speak > > > > up. There are ways to avoid it, but it comes with additional complexity > > > > and bookkeeping. > > > > > > We want correctness. This is a place where correctness is infinitely > > > more important than speed. We should be correct first and then we > > > should argue about how to make it fast. > > > > > As pthread_once calls tend to be called once per thread performance is > > not an issue. > > No, pthread_once _calls_ tend to be once per access to an interface > that requires static data to have been initialized, so possibly very > often. On the other hand, pthread_once only invokes the init function > once per program instance. I don't see anything that would typically > happen once per thread, although I suppose you could optimize out > calls to pthread_once with tls: > Could happen often but dees it? Given need of doing locking you need to avoid it in performance critical code. With once per thread I meant an patterns: computation(){ pthread_once(baz,init); // Do common initialization. pthread_create(foo,bar,routine); } or pthread_create(foo,bar,routine); with routine() { pthread_once(baz,init); // Do common initialization. ... } > static __thread int once_done = 0; > static pthread_once_t once; > if (!once_done) { > pthread_once(&once, init); > once_done = 1; > } > > This requires work at the application level, though, and whether it's > a net advantage depends a lot on whether multiple threads are likely > to be hammering pthread_once on the same once object, and whether the > arch has expensive acquire barriers and inexpensive TLS access. > Actually you can use following if you are concerned about that use cases: #define pthread_once2(x,y) ({ \ static __thread int once = 0; \ if (!once) \ pthread_once(x,y); \ once=1; \ }) > Rich ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-08-26 18:41 ` Ondřej Bílka @ 2013-08-27 2:29 ` Rich Felker 0 siblings, 0 replies; 21+ messages in thread From: Rich Felker @ 2013-08-27 2:29 UTC (permalink / raw) To: Ondřej Bílka Cc: Carlos O'Donell, Torvald Riegel, GLIBC Devel, libc-ports On Mon, Aug 26, 2013 at 08:41:50PM +0200, OndÅej BÃlka wrote: > > No, pthread_once _calls_ tend to be once per access to an interface > > that requires static data to have been initialized, so possibly very > > often. On the other hand, pthread_once only invokes the init function > > once per program instance. I don't see anything that would typically > > happen once per thread, although I suppose you could optimize out > > calls to pthread_once with tls: > > > Could happen often but dees it? Given need of doing locking you need to > avoid it in performance critical code. With once per thread I meant an > patterns: > > computation(){ > pthread_once(baz,init); // Do common initialization. > pthread_create(foo,bar,routine); > } > > or > > pthread_create(foo,bar,routine); > > with > > routine() > { > pthread_once(baz,init); // Do common initialization. > ... > } These patterns arise is the library is making threads and using pthread_once to initialize its static data before making the thread. I'm thinking instead of the case where your library is being _called_ by multi-threaded code, and using pthread_once to ensure that its data is safely initialized even if there are multiple threads which might be racing to be the first caller. > > static __thread int once_done = 0; > > static pthread_once_t once; > > if (!once_done) { > > pthread_once(&once, init); > > once_done = 1; > > } > > > > This requires work at the application level, though, and whether it's > > a net advantage depends a lot on whether multiple threads are likely > > to be hammering pthread_once on the same once object, and whether the > > arch has expensive acquire barriers and inexpensive TLS access. > > > Actually you can use following if you are concerned about that use cases: > > #define pthread_once2(x,y) ({ \ > static __thread int once = 0; \ > if (!once) \ > pthread_once(x,y); \ > once=1; \ > }) Indeed; actually, this could even be done in pthread.h, with some slight variations, perhaps: #define pthread_once(x,y) ({ \ pthread_once_t *__x = (x); \ static __thread pthread_once_t *__once; \ if (__once != __x) { \ pthread_once(__x,y); \ __once = __x; \ } \ }) Rich ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-05-23 4:15 ` Carlos O'Donell 2013-08-26 12:50 ` Ondřej Bílka @ 2013-10-06 0:20 ` Torvald Riegel 2013-10-06 21:41 ` Torvald Riegel 2013-10-07 16:04 ` Joseph S. Myers 1 sibling, 2 replies; 21+ messages in thread From: Torvald Riegel @ 2013-10-06 0:20 UTC (permalink / raw) To: Carlos O'Donell; +Cc: GLIBC Devel, libc-ports [-- Attachment #1: Type: text/plain, Size: 2971 bytes --] On Thu, 2013-05-23 at 00:15 -0400, Carlos O'Donell wrote: > On 05/08/2013 10:43 AM, Torvald Riegel wrote: > > See http://sourceware.org/bugzilla/show_bug.cgi?id=15215 for background. > > You've already hashed out the details of these changes with Rich > and he has no objection with this first phase of the patch which > is to unify the implementations. > > > I1 and I2 follow essentially the same algorithm, and we can replace it > > with a unified variant, as the bug suggests. See the attached patch for > > a modified version of the sparc instance. The differences between both > > are either cosmetic, or are unnecessary changes (ie, how the > > init-finished state is set (atomic_inc vs. store), or how the fork > > generations are compared). > > > > Both I1 and I2 were missing a release memory order (MO) when marking > > once_control as finished initialization. If the particular arch doesn't > > need a HW barrier for release, we at least need a compiler barrier; if > > it's needed, the original I1 and I2 are not guaranteed to work. > > > > Both I1 and I2 were missing acquire MO on the very first load of > > once_control. This needs to synchronize with the release MO on setting > > the state to init-finished, so without it it's not guaranteed to work > > either. > > Note that this will make a call to pthread_once that doesn't need to > > actually run the init routine slightly slower due to the additional > > acquire barrier. If you're really concerned about this overhead, speak > > up. There are ways to avoid it, but it comes with additional complexity > > and bookkeeping. > > We want correctness. This is a place where correctness is infinitely > more important than speed. We should be correct first and then we > should argue about how to make it fast. > > > I'm currently also using the existing atomic_{read/write}_barrier > > functions instead of not-yet-existing load_acq or store_rel functions. > > I'm not sure whether the latter can have somewhat more efficient > > implementations on Power and ARM; if so, and if you're concerned about > > the overhead, we can add load_acq and store_rel to atomic.h and start > > using it. This would be in line with C11, where we should eventually be > > heading to anyways, IMO. > > Agreed. > > > Both I1 and I2 have an ABA issue on __fork_generation, as explained in > > the comments that the patch adds. How do you all feel about this? > > I can't present a simple fix right now, but I believe it could be fixed > > with additional bookkeeping. > > > > If there's no objection to the essence of this patch, I'll post another > > patch that actually replaces I1 and I2 with the modified variant in the > > attached patch. > > Please repost. See attached patch. This has been tested on ppc64 but not on the other archs that are affected. Nonetheless, ppc has a weak memory model, so, for example, having an acquire barrier on a load or not having it does make a difference. OK? [-- Attachment #2: patch --] [-- Type: text/plain, Size: 31090 bytes --] commit 287d2b33853dec608439521df049b6d818b0b8c1 Author: Torvald Riegel <triegel@redhat.com> Date: Wed May 8 16:35:10 2013 +0200 Fixed and unified pthread_once. PR nptl/15215 nptl/ * sysdeps/unix/sysv/linux/pthread_once.c: New file. * sysdeps/unix/sysv/linux/sparc/pthread_once.c: Remove file. ports/ * sysdeps/unix/sysv/linux/aarch64/nptl/pthread_once.c: Remove file. * sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c: Remove file. * sysdeps/unix/sysv/linux/hppa/nptl/pthread_once.c: Remove file. * sysdeps/unix/sysv/linux/ia64/nptl/pthread_once.c: Remove file. * sysdeps/unix/sysv/linux/m68k/nptl/pthread_once.c: Remove file. * sysdeps/unix/sysv/linux/mips/nptl/pthread_once.c: Remove file. * sysdeps/unix/sysv/linux/tile/nptl/pthread_once.c: Remove file. diff --git a/nptl/ChangeLog b/nptl/ChangeLog index a089153..7e87334 100644 --- a/nptl/ChangeLog +++ b/nptl/ChangeLog @@ -1,3 +1,8 @@ +2013-10-06 Torvald Riegel <triegel@redhat.com> + + * sysdeps/unix/sysv/linux/pthread_once.c: New file. + * sysdeps/unix/sysv/linux/sparc/pthread_once.c: Remove file. + 2013-10-04 Maciej W. Rozycki <macro@codesourcery.com> * tst-mutex8.c (check_type) [ENABLE_PI]: Handle ENOTSUP failure diff --git a/nptl/sysdeps/unix/sysv/linux/pthread_once.c b/nptl/sysdeps/unix/sysv/linux/pthread_once.c new file mode 100644 index 0000000..f9b0953 --- /dev/null +++ b/nptl/sysdeps/unix/sysv/linux/pthread_once.c @@ -0,0 +1,128 @@ +/* Copyright (C) 2003-2013 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include "pthreadP.h" +#include <lowlevellock.h> + + +unsigned long int __fork_generation attribute_hidden; + + +static void +clear_once_control (void *arg) +{ + pthread_once_t *once_control = (pthread_once_t *) arg; + + /* Reset to the uninitialized state here (see __pthread_once). Also, we + don't need a stronger memory order because we do not need to make any + other of our writes visible to other threads that see this value. */ + *once_control = 0; + lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); +} + + +/* This is similar to a lock implementation, but we distinguish between three + states: not yet initialized (0), initialization finished (2), and + initialization in progress (__fork_generation | 1). If in the first state, + threads will try to run the initialization by moving to the second state; + the first thread to do so via a CAS on once_control runs init_routine, + other threads block. + When forking the process, some threads can be interrupted during the second + state; they won't be present in the forked child, so we need to restart + initialization in the child. To distinguish an in-progress initialization + from an interrupted initialization (in which case we need to reclaim the + lock), we look at the fork generation that's part of the second state: We + can reclaim iff it differs from the current fork generation. + XXX: This algorithm has an ABA issue on the fork generation: If an + initialization is interrupted, we then fork 2^30 times (30b of once_control + are used for the fork generation), and try to initialize again, we can + deadlock because we can't distinguish the in-progress and interrupted cases + anymore. */ +int +__pthread_once (once_control, init_routine) + pthread_once_t *once_control; + void (*init_routine) (void); +{ + while (1) + { + int oldval, val, newval; + + /* We need acquire memory order for this load because if the value + signals that initialization has finished, we need to be see any + data modifications done during initialization. */ + val = *once_control; + atomic_read_barrier(); + do + { + /* Check if the initialization has already been done. */ + if (__builtin_expect ((val & 2) != 0, 1)) + return 0; + + oldval = val; + /* We try to set the state to in-progress and having the current + fork generation. We don't need atomic accesses for the fork + generation because it's immutable in a particular process, and + forked child processes start with a single thread that modified + the generation. */ + newval = __fork_generation | 1; + /* We need acquire memory order here for the same reason as for the + load from once_control above. */ + val = atomic_compare_and_exchange_val_acq (once_control, newval, + oldval); + } + while (__builtin_expect (val != oldval, 0)); + + /* Check if another thread already runs the initializer. */ + if ((oldval & 1) != 0) + { + /* Check whether the initializer execution was interrupted by a + fork. (We know that for both values, bit 0 is set and bit 1 is + not.) */ + if (oldval == newval) + { + /* Same generation, some other thread was faster. Wait. */ + lll_futex_wait (once_control, newval, LLL_PRIVATE); + continue; + } + } + + /* This thread is the first here. Do the initialization. + Register a cleanup handler so that in case the thread gets + interrupted the initialization can be restarted. */ + pthread_cleanup_push (clear_once_control, once_control); + + init_routine (); + + pthread_cleanup_pop (0); + + + /* Mark *once_control as having finished the initialization. We need + release memory order here because we need to synchronize with other + threads that want to use the initialized data. */ + atomic_write_barrier(); + *once_control = 2; + + /* Wake up all other threads. */ + lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); + break; + } + + return 0; +} +weak_alias (__pthread_once, pthread_once) +hidden_def (__pthread_once) diff --git a/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c b/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c deleted file mode 100644 index 5879f44..0000000 --- a/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c +++ /dev/null @@ -1,93 +0,0 @@ -/* Copyright (C) 2003-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, newval, LLL_PRIVATE); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/ChangeLog b/ports/ChangeLog index a1dd8c3..13cefd2 100644 --- a/ports/ChangeLog +++ b/ports/ChangeLog @@ -1,3 +1,13 @@ +2013-10-06 Torvald Riegel <triegel@redhat.com> + + * sysdeps/unix/sysv/linux/aarch64/nptl/pthread_once.c: Remove file. + * sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c: Remove file. + * sysdeps/unix/sysv/linux/hppa/nptl/pthread_once.c: Remove file. + * sysdeps/unix/sysv/linux/ia64/nptl/pthread_once.c: Remove file. + * sysdeps/unix/sysv/linux/m68k/nptl/pthread_once.c: Remove file. + * sysdeps/unix/sysv/linux/mips/nptl/pthread_once.c: Remove file. + * sysdeps/unix/sysv/linux/tile/nptl/pthread_once.c: Remove file. + 2013-10-04 Alan Modra <amodra@gmail.com> * sysdeps/ia64/fpu/printf_fphex.c: Adjust for fpnum change. diff --git a/ports/sysdeps/unix/sysv/linux/aarch64/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/aarch64/nptl/pthread_once.c deleted file mode 100644 index 2fb9b85..0000000 --- a/ports/sysdeps/unix/sysv/linux/aarch64/nptl/pthread_once.c +++ /dev/null @@ -1,90 +0,0 @@ -/* Copyright (C) 2004-2013 Free Software Foundation, Inc. - - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public License as - published by the Free Software Foundation; either version 2.1 of the - License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - -unsigned long int __fork_generation attribute_hidden; - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - -int -__pthread_once (pthread_once_t *once_control, void (*init_routine) (void)) -{ - for (;;) - { - int oldval; - int newval; - - /* Pseudo code: - newval = __fork_generation | 1; - oldval = *once_control; - if ((oldval & 2) == 0) - *once_control = newval; - Do this atomically. - */ - do - { - newval = __fork_generation | 1; - oldval = *once_control; - if (oldval & 2) - break; - } while (atomic_compare_and_exchange_val_acq (once_control, newval, oldval) != oldval); - - /* Check if the initializer has already been done. */ - if ((oldval & 2) != 0) - return 0; - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) == 0) - break; - - /* Check whether the initializer execution was interrupted by a fork. */ - if (oldval != newval) - break; - - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, oldval, LLL_PRIVATE); - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - /* Say that the initialisation is done. */ - *once_control = __fork_generation | 2; - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c deleted file mode 100644 index 0c897ab..0000000 --- a/ports/sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c +++ /dev/null @@ -1,89 +0,0 @@ -/* Copyright (C) 2004-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - -unsigned long int __fork_generation attribute_hidden; - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - -int -__pthread_once (pthread_once_t *once_control, void (*init_routine) (void)) -{ - for (;;) - { - int oldval; - int newval; - - /* Pseudo code: - newval = __fork_generation | 1; - oldval = *once_control; - if ((oldval & 2) == 0) - *once_control = newval; - Do this atomically. - */ - do - { - newval = __fork_generation | 1; - oldval = *once_control; - if (oldval & 2) - break; - } while (atomic_compare_and_exchange_val_acq (once_control, newval, oldval) != oldval); - - /* Check if the initializer has already been done. */ - if ((oldval & 2) != 0) - return 0; - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) == 0) - break; - - /* Check whether the initializer execution was interrupted by a fork. */ - if (oldval != newval) - break; - - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, oldval, LLL_PRIVATE); - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - /* Say that the initialisation is done. */ - *once_control = __fork_generation | 2; - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/hppa/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/hppa/nptl/pthread_once.c deleted file mode 100644 index b032f29..0000000 --- a/ports/sysdeps/unix/sysv/linux/hppa/nptl/pthread_once.c +++ /dev/null @@ -1,93 +0,0 @@ -/* Copyright (C) 2003-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_private_futex_wake (once_control, INT_MAX); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_private_futex_wait (once_control, newval); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_private_futex_wake (once_control, INT_MAX); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/ia64/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/ia64/nptl/pthread_once.c deleted file mode 100644 index 5879f44..0000000 --- a/ports/sysdeps/unix/sysv/linux/ia64/nptl/pthread_once.c +++ /dev/null @@ -1,93 +0,0 @@ -/* Copyright (C) 2003-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, newval, LLL_PRIVATE); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/m68k/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/m68k/nptl/pthread_once.c deleted file mode 100644 index 8a1f307..0000000 --- a/ports/sysdeps/unix/sysv/linux/m68k/nptl/pthread_once.c +++ /dev/null @@ -1,90 +0,0 @@ -/* Copyright (C) 2010-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Maxim Kuvyrkov <maxim@codesourcery.com>, 2010. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - -unsigned long int __fork_generation attribute_hidden; - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - -int -__pthread_once (pthread_once_t *once_control, void (*init_routine) (void)) -{ - for (;;) - { - int oldval; - int newval; - - /* Pseudo code: - newval = __fork_generation | 1; - oldval = *once_control; - if ((oldval & 2) == 0) - *once_control = newval; - Do this atomically. - */ - do - { - newval = __fork_generation | 1; - oldval = *once_control; - if (oldval & 2) - break; - } while (atomic_compare_and_exchange_val_acq (once_control, newval, oldval) != oldval); - - /* Check if the initializer has already been done. */ - if ((oldval & 2) != 0) - return 0; - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) == 0) - break; - - /* Check whether the initializer execution was interrupted by a fork. */ - if (oldval != newval) - break; - - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, oldval, LLL_PRIVATE); - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - /* Say that the initialisation is done. */ - *once_control = __fork_generation | 2; - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/mips/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/mips/nptl/pthread_once.c deleted file mode 100644 index 97f1ddf..0000000 --- a/ports/sysdeps/unix/sysv/linux/mips/nptl/pthread_once.c +++ /dev/null @@ -1,93 +0,0 @@ -/* Copyright (C) 2003-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, newval, LLL_PRIVATE); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/tile/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/tile/nptl/pthread_once.c deleted file mode 100644 index 68456f0..0000000 --- a/ports/sysdeps/unix/sysv/linux/tile/nptl/pthread_once.c +++ /dev/null @@ -1,94 +0,0 @@ -/* Copyright (C) 2011-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Chris Metcalf <cmetcalf@tilera.com>, 2011. - Based on work contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include <nptl/pthreadP.h> -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, newval, LLL_PRIVATE); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-10-06 0:20 ` Torvald Riegel @ 2013-10-06 21:41 ` Torvald Riegel 2013-10-07 16:04 ` Joseph S. Myers 1 sibling, 0 replies; 21+ messages in thread From: Torvald Riegel @ 2013-10-06 21:41 UTC (permalink / raw) To: Carlos O'Donell; +Cc: GLIBC Devel, libc-ports [-- Attachment #1: Type: text/plain, Size: 3301 bytes --] On Sun, 2013-10-06 at 02:20 +0200, Torvald Riegel wrote: > On Thu, 2013-05-23 at 00:15 -0400, Carlos O'Donell wrote: > > On 05/08/2013 10:43 AM, Torvald Riegel wrote: > > > See http://sourceware.org/bugzilla/show_bug.cgi?id=15215 for background. > > > > You've already hashed out the details of these changes with Rich > > and he has no objection with this first phase of the patch which > > is to unify the implementations. > > > > > I1 and I2 follow essentially the same algorithm, and we can replace it > > > with a unified variant, as the bug suggests. See the attached patch for > > > a modified version of the sparc instance. The differences between both > > > are either cosmetic, or are unnecessary changes (ie, how the > > > init-finished state is set (atomic_inc vs. store), or how the fork > > > generations are compared). > > > > > > Both I1 and I2 were missing a release memory order (MO) when marking > > > once_control as finished initialization. If the particular arch doesn't > > > need a HW barrier for release, we at least need a compiler barrier; if > > > it's needed, the original I1 and I2 are not guaranteed to work. > > > > > > Both I1 and I2 were missing acquire MO on the very first load of > > > once_control. This needs to synchronize with the release MO on setting > > > the state to init-finished, so without it it's not guaranteed to work > > > either. > > > Note that this will make a call to pthread_once that doesn't need to > > > actually run the init routine slightly slower due to the additional > > > acquire barrier. If you're really concerned about this overhead, speak > > > up. There are ways to avoid it, but it comes with additional complexity > > > and bookkeeping. > > > > We want correctness. This is a place where correctness is infinitely > > more important than speed. We should be correct first and then we > > should argue about how to make it fast. > > > > > I'm currently also using the existing atomic_{read/write}_barrier > > > functions instead of not-yet-existing load_acq or store_rel functions. > > > I'm not sure whether the latter can have somewhat more efficient > > > implementations on Power and ARM; if so, and if you're concerned about > > > the overhead, we can add load_acq and store_rel to atomic.h and start > > > using it. This would be in line with C11, where we should eventually be > > > heading to anyways, IMO. > > > > Agreed. > > > > > Both I1 and I2 have an ABA issue on __fork_generation, as explained in > > > the comments that the patch adds. How do you all feel about this? > > > I can't present a simple fix right now, but I believe it could be fixed > > > with additional bookkeeping. > > > > > > If there's no objection to the essence of this patch, I'll post another > > > patch that actually replaces I1 and I2 with the modified variant in the > > > attached patch. > > > > Please repost. > > See attached patch. This has been tested on ppc64 but not on the other > archs that are affected. Nonetheless, ppc has a weak memory model, so, > for example, having an acquire barrier on a load or not having it does > make a difference. > > OK? Attached is a slightly updated version; the only difference is that the changelog chunks now incorporate earlier feedback that I had forgotten about. [-- Attachment #2: patch --] [-- Type: text/x-patch, Size: 30331 bytes --] diff --git a/nptl/ChangeLog b/nptl/ChangeLog index a089153..e4af42a 100644 --- a/nptl/ChangeLog +++ b/nptl/ChangeLog @@ -1,3 +1,10 @@ +2013-10-06 Torvald Riegel <triegel@redhat.com> + + [BZ #15215] + * sysdeps/unix/sysv/linux/sparc/pthread_once.c: Moved to ... + * sysdeps/unix/sysv/linux/pthread_once.c: ... here. Add missing + memory barriers. Add comments. + 2013-10-04 Maciej W. Rozycki <macro@codesourcery.com> * tst-mutex8.c (check_type) [ENABLE_PI]: Handle ENOTSUP failure diff --git a/nptl/sysdeps/unix/sysv/linux/pthread_once.c b/nptl/sysdeps/unix/sysv/linux/pthread_once.c new file mode 100644 index 0000000..f9b0953 --- /dev/null +++ b/nptl/sysdeps/unix/sysv/linux/pthread_once.c @@ -0,0 +1,128 @@ +/* Copyright (C) 2003-2013 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include "pthreadP.h" +#include <lowlevellock.h> + + +unsigned long int __fork_generation attribute_hidden; + + +static void +clear_once_control (void *arg) +{ + pthread_once_t *once_control = (pthread_once_t *) arg; + + /* Reset to the uninitialized state here (see __pthread_once). Also, we + don't need a stronger memory order because we do not need to make any + other of our writes visible to other threads that see this value. */ + *once_control = 0; + lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); +} + + +/* This is similar to a lock implementation, but we distinguish between three + states: not yet initialized (0), initialization finished (2), and + initialization in progress (__fork_generation | 1). If in the first state, + threads will try to run the initialization by moving to the second state; + the first thread to do so via a CAS on once_control runs init_routine, + other threads block. + When forking the process, some threads can be interrupted during the second + state; they won't be present in the forked child, so we need to restart + initialization in the child. To distinguish an in-progress initialization + from an interrupted initialization (in which case we need to reclaim the + lock), we look at the fork generation that's part of the second state: We + can reclaim iff it differs from the current fork generation. + XXX: This algorithm has an ABA issue on the fork generation: If an + initialization is interrupted, we then fork 2^30 times (30b of once_control + are used for the fork generation), and try to initialize again, we can + deadlock because we can't distinguish the in-progress and interrupted cases + anymore. */ +int +__pthread_once (once_control, init_routine) + pthread_once_t *once_control; + void (*init_routine) (void); +{ + while (1) + { + int oldval, val, newval; + + /* We need acquire memory order for this load because if the value + signals that initialization has finished, we need to be see any + data modifications done during initialization. */ + val = *once_control; + atomic_read_barrier(); + do + { + /* Check if the initialization has already been done. */ + if (__builtin_expect ((val & 2) != 0, 1)) + return 0; + + oldval = val; + /* We try to set the state to in-progress and having the current + fork generation. We don't need atomic accesses for the fork + generation because it's immutable in a particular process, and + forked child processes start with a single thread that modified + the generation. */ + newval = __fork_generation | 1; + /* We need acquire memory order here for the same reason as for the + load from once_control above. */ + val = atomic_compare_and_exchange_val_acq (once_control, newval, + oldval); + } + while (__builtin_expect (val != oldval, 0)); + + /* Check if another thread already runs the initializer. */ + if ((oldval & 1) != 0) + { + /* Check whether the initializer execution was interrupted by a + fork. (We know that for both values, bit 0 is set and bit 1 is + not.) */ + if (oldval == newval) + { + /* Same generation, some other thread was faster. Wait. */ + lll_futex_wait (once_control, newval, LLL_PRIVATE); + continue; + } + } + + /* This thread is the first here. Do the initialization. + Register a cleanup handler so that in case the thread gets + interrupted the initialization can be restarted. */ + pthread_cleanup_push (clear_once_control, once_control); + + init_routine (); + + pthread_cleanup_pop (0); + + + /* Mark *once_control as having finished the initialization. We need + release memory order here because we need to synchronize with other + threads that want to use the initialized data. */ + atomic_write_barrier(); + *once_control = 2; + + /* Wake up all other threads. */ + lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); + break; + } + + return 0; +} +weak_alias (__pthread_once, pthread_once) +hidden_def (__pthread_once) diff --git a/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c b/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c deleted file mode 100644 index 5879f44..0000000 --- a/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c +++ /dev/null @@ -1,93 +0,0 @@ -/* Copyright (C) 2003-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, newval, LLL_PRIVATE); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/ChangeLog b/ports/ChangeLog index a1dd8c3..06a12c4 100644 --- a/ports/ChangeLog +++ b/ports/ChangeLog @@ -1,3 +1,14 @@ +2013-10-06 Torvald Riegel <triegel@redhat.com> + + [BZ #15215] + * sysdeps/unix/sysv/linux/aarch64/nptl/pthread_once.c: Remove file. + * sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c: Remove file. + * sysdeps/unix/sysv/linux/hppa/nptl/pthread_once.c: Remove file. + * sysdeps/unix/sysv/linux/ia64/nptl/pthread_once.c: Remove file. + * sysdeps/unix/sysv/linux/m68k/nptl/pthread_once.c: Remove file. + * sysdeps/unix/sysv/linux/mips/nptl/pthread_once.c: Remove file. + * sysdeps/unix/sysv/linux/tile/nptl/pthread_once.c: Remove file. + 2013-10-04 Alan Modra <amodra@gmail.com> * sysdeps/ia64/fpu/printf_fphex.c: Adjust for fpnum change. diff --git a/ports/sysdeps/unix/sysv/linux/aarch64/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/aarch64/nptl/pthread_once.c deleted file mode 100644 index 2fb9b85..0000000 --- a/ports/sysdeps/unix/sysv/linux/aarch64/nptl/pthread_once.c +++ /dev/null @@ -1,90 +0,0 @@ -/* Copyright (C) 2004-2013 Free Software Foundation, Inc. - - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public License as - published by the Free Software Foundation; either version 2.1 of the - License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - -unsigned long int __fork_generation attribute_hidden; - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - -int -__pthread_once (pthread_once_t *once_control, void (*init_routine) (void)) -{ - for (;;) - { - int oldval; - int newval; - - /* Pseudo code: - newval = __fork_generation | 1; - oldval = *once_control; - if ((oldval & 2) == 0) - *once_control = newval; - Do this atomically. - */ - do - { - newval = __fork_generation | 1; - oldval = *once_control; - if (oldval & 2) - break; - } while (atomic_compare_and_exchange_val_acq (once_control, newval, oldval) != oldval); - - /* Check if the initializer has already been done. */ - if ((oldval & 2) != 0) - return 0; - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) == 0) - break; - - /* Check whether the initializer execution was interrupted by a fork. */ - if (oldval != newval) - break; - - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, oldval, LLL_PRIVATE); - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - /* Say that the initialisation is done. */ - *once_control = __fork_generation | 2; - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c deleted file mode 100644 index 0c897ab..0000000 --- a/ports/sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c +++ /dev/null @@ -1,89 +0,0 @@ -/* Copyright (C) 2004-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - -unsigned long int __fork_generation attribute_hidden; - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - -int -__pthread_once (pthread_once_t *once_control, void (*init_routine) (void)) -{ - for (;;) - { - int oldval; - int newval; - - /* Pseudo code: - newval = __fork_generation | 1; - oldval = *once_control; - if ((oldval & 2) == 0) - *once_control = newval; - Do this atomically. - */ - do - { - newval = __fork_generation | 1; - oldval = *once_control; - if (oldval & 2) - break; - } while (atomic_compare_and_exchange_val_acq (once_control, newval, oldval) != oldval); - - /* Check if the initializer has already been done. */ - if ((oldval & 2) != 0) - return 0; - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) == 0) - break; - - /* Check whether the initializer execution was interrupted by a fork. */ - if (oldval != newval) - break; - - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, oldval, LLL_PRIVATE); - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - /* Say that the initialisation is done. */ - *once_control = __fork_generation | 2; - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/hppa/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/hppa/nptl/pthread_once.c deleted file mode 100644 index b032f29..0000000 --- a/ports/sysdeps/unix/sysv/linux/hppa/nptl/pthread_once.c +++ /dev/null @@ -1,93 +0,0 @@ -/* Copyright (C) 2003-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_private_futex_wake (once_control, INT_MAX); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_private_futex_wait (once_control, newval); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_private_futex_wake (once_control, INT_MAX); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/ia64/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/ia64/nptl/pthread_once.c deleted file mode 100644 index 5879f44..0000000 --- a/ports/sysdeps/unix/sysv/linux/ia64/nptl/pthread_once.c +++ /dev/null @@ -1,93 +0,0 @@ -/* Copyright (C) 2003-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, newval, LLL_PRIVATE); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/m68k/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/m68k/nptl/pthread_once.c deleted file mode 100644 index 8a1f307..0000000 --- a/ports/sysdeps/unix/sysv/linux/m68k/nptl/pthread_once.c +++ /dev/null @@ -1,90 +0,0 @@ -/* Copyright (C) 2010-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Maxim Kuvyrkov <maxim@codesourcery.com>, 2010. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - -unsigned long int __fork_generation attribute_hidden; - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - -int -__pthread_once (pthread_once_t *once_control, void (*init_routine) (void)) -{ - for (;;) - { - int oldval; - int newval; - - /* Pseudo code: - newval = __fork_generation | 1; - oldval = *once_control; - if ((oldval & 2) == 0) - *once_control = newval; - Do this atomically. - */ - do - { - newval = __fork_generation | 1; - oldval = *once_control; - if (oldval & 2) - break; - } while (atomic_compare_and_exchange_val_acq (once_control, newval, oldval) != oldval); - - /* Check if the initializer has already been done. */ - if ((oldval & 2) != 0) - return 0; - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) == 0) - break; - - /* Check whether the initializer execution was interrupted by a fork. */ - if (oldval != newval) - break; - - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, oldval, LLL_PRIVATE); - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - /* Say that the initialisation is done. */ - *once_control = __fork_generation | 2; - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/mips/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/mips/nptl/pthread_once.c deleted file mode 100644 index 97f1ddf..0000000 --- a/ports/sysdeps/unix/sysv/linux/mips/nptl/pthread_once.c +++ /dev/null @@ -1,93 +0,0 @@ -/* Copyright (C) 2003-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, newval, LLL_PRIVATE); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/tile/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/tile/nptl/pthread_once.c deleted file mode 100644 index 68456f0..0000000 --- a/ports/sysdeps/unix/sysv/linux/tile/nptl/pthread_once.c +++ /dev/null @@ -1,94 +0,0 @@ -/* Copyright (C) 2011-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Chris Metcalf <cmetcalf@tilera.com>, 2011. - Based on work contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include <nptl/pthreadP.h> -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, newval, LLL_PRIVATE); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-10-06 0:20 ` Torvald Riegel 2013-10-06 21:41 ` Torvald Riegel @ 2013-10-07 16:04 ` Joseph S. Myers 2013-10-07 21:53 ` Torvald Riegel 1 sibling, 1 reply; 21+ messages in thread From: Joseph S. Myers @ 2013-10-07 16:04 UTC (permalink / raw) To: Torvald Riegel; +Cc: Carlos O'Donell, GLIBC Devel, libc-ports I have no comments on the substance of this patch, but note that ports/ has a separate ChangeLog file for each architecture. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-10-07 16:04 ` Joseph S. Myers @ 2013-10-07 21:53 ` Torvald Riegel 2014-03-31 11:44 ` Will Newton 0 siblings, 1 reply; 21+ messages in thread From: Torvald Riegel @ 2013-10-07 21:53 UTC (permalink / raw) To: Joseph S. Myers; +Cc: Carlos O'Donell, GLIBC Devel, libc-ports [-- Attachment #1: Type: text/plain, Size: 282 bytes --] On Mon, 2013-10-07 at 16:04 +0000, Joseph S. Myers wrote: > I have no comments on the substance of this patch, but note that ports/ > has a separate ChangeLog file for each architecture. Sorry. The attached patch now has separate ChangeLog entries for each of the affected archs. [-- Attachment #2: patch --] [-- Type: text/x-patch, Size: 32258 bytes --] diff --git a/nptl/ChangeLog b/nptl/ChangeLog index a089153..ea161a3 100644 --- a/nptl/ChangeLog +++ b/nptl/ChangeLog @@ -1,3 +1,10 @@ +2013-10-07 Torvald Riegel <triegel@redhat.com> + + [BZ #15215] + * sysdeps/unix/sysv/linux/sparc/pthread_once.c: Moved to ... + * sysdeps/unix/sysv/linux/pthread_once.c: ... here. Add missing + memory barriers. Add comments. + 2013-10-04 Maciej W. Rozycki <macro@codesourcery.com> * tst-mutex8.c (check_type) [ENABLE_PI]: Handle ENOTSUP failure diff --git a/nptl/sysdeps/unix/sysv/linux/pthread_once.c b/nptl/sysdeps/unix/sysv/linux/pthread_once.c new file mode 100644 index 0000000..f9b0953 --- /dev/null +++ b/nptl/sysdeps/unix/sysv/linux/pthread_once.c @@ -0,0 +1,128 @@ +/* Copyright (C) 2003-2013 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include "pthreadP.h" +#include <lowlevellock.h> + + +unsigned long int __fork_generation attribute_hidden; + + +static void +clear_once_control (void *arg) +{ + pthread_once_t *once_control = (pthread_once_t *) arg; + + /* Reset to the uninitialized state here (see __pthread_once). Also, we + don't need a stronger memory order because we do not need to make any + other of our writes visible to other threads that see this value. */ + *once_control = 0; + lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); +} + + +/* This is similar to a lock implementation, but we distinguish between three + states: not yet initialized (0), initialization finished (2), and + initialization in progress (__fork_generation | 1). If in the first state, + threads will try to run the initialization by moving to the second state; + the first thread to do so via a CAS on once_control runs init_routine, + other threads block. + When forking the process, some threads can be interrupted during the second + state; they won't be present in the forked child, so we need to restart + initialization in the child. To distinguish an in-progress initialization + from an interrupted initialization (in which case we need to reclaim the + lock), we look at the fork generation that's part of the second state: We + can reclaim iff it differs from the current fork generation. + XXX: This algorithm has an ABA issue on the fork generation: If an + initialization is interrupted, we then fork 2^30 times (30b of once_control + are used for the fork generation), and try to initialize again, we can + deadlock because we can't distinguish the in-progress and interrupted cases + anymore. */ +int +__pthread_once (once_control, init_routine) + pthread_once_t *once_control; + void (*init_routine) (void); +{ + while (1) + { + int oldval, val, newval; + + /* We need acquire memory order for this load because if the value + signals that initialization has finished, we need to be see any + data modifications done during initialization. */ + val = *once_control; + atomic_read_barrier(); + do + { + /* Check if the initialization has already been done. */ + if (__builtin_expect ((val & 2) != 0, 1)) + return 0; + + oldval = val; + /* We try to set the state to in-progress and having the current + fork generation. We don't need atomic accesses for the fork + generation because it's immutable in a particular process, and + forked child processes start with a single thread that modified + the generation. */ + newval = __fork_generation | 1; + /* We need acquire memory order here for the same reason as for the + load from once_control above. */ + val = atomic_compare_and_exchange_val_acq (once_control, newval, + oldval); + } + while (__builtin_expect (val != oldval, 0)); + + /* Check if another thread already runs the initializer. */ + if ((oldval & 1) != 0) + { + /* Check whether the initializer execution was interrupted by a + fork. (We know that for both values, bit 0 is set and bit 1 is + not.) */ + if (oldval == newval) + { + /* Same generation, some other thread was faster. Wait. */ + lll_futex_wait (once_control, newval, LLL_PRIVATE); + continue; + } + } + + /* This thread is the first here. Do the initialization. + Register a cleanup handler so that in case the thread gets + interrupted the initialization can be restarted. */ + pthread_cleanup_push (clear_once_control, once_control); + + init_routine (); + + pthread_cleanup_pop (0); + + + /* Mark *once_control as having finished the initialization. We need + release memory order here because we need to synchronize with other + threads that want to use the initialized data. */ + atomic_write_barrier(); + *once_control = 2; + + /* Wake up all other threads. */ + lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); + break; + } + + return 0; +} +weak_alias (__pthread_once, pthread_once) +hidden_def (__pthread_once) diff --git a/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c b/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c deleted file mode 100644 index 5879f44..0000000 --- a/nptl/sysdeps/unix/sysv/linux/sparc/pthread_once.c +++ /dev/null @@ -1,93 +0,0 @@ -/* Copyright (C) 2003-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, newval, LLL_PRIVATE); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/ChangeLog.aarch64 b/ports/ChangeLog.aarch64 index cebf505..13d2a72 100644 --- a/ports/ChangeLog.aarch64 +++ b/ports/ChangeLog.aarch64 @@ -1,3 +1,8 @@ +2013-10-07 Torvald Riegel <triegel@redhat.com> + + [BZ #15215] + * sysdeps/unix/sysv/linux/aarch64/nptl/pthread_once.c: Remove file. + 2013-09-30 Andrew Pinski <andrew.pinski@caviumnetworks.com> * sysdeps/unix/sysv/linux/aarch64/sysdep.h (SYSCALL_ERROR_HANDLER): diff --git a/ports/ChangeLog.arm b/ports/ChangeLog.arm index 6707d2e..cc876e5 100644 --- a/ports/ChangeLog.arm +++ b/ports/ChangeLog.arm @@ -1,3 +1,8 @@ +2013-10-07 Torvald Riegel <triegel@redhat.com> + + [BZ #15215] + * sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c: Remove file. + 2013-10-04 Will Newton <will.newton@linaro.org> * sysdeps/arm/__longjmp.S (NO_THUMB): Remove define. diff --git a/ports/ChangeLog.hppa b/ports/ChangeLog.hppa index 3e7c162..53ade36 100644 --- a/ports/ChangeLog.hppa +++ b/ports/ChangeLog.hppa @@ -1,3 +1,8 @@ +2013-10-07 Torvald Riegel <triegel@redhat.com> + + [BZ #15215] + * sysdeps/unix/sysv/linux/hppa/nptl/pthread_once.c: Remove file. + 2013-09-11 Andreas Schwab <schwab@suse.de> * sysdeps/unix/sysv/linux/hppa/bits/fcntl.h (__O_TMPFILE): Define. diff --git a/ports/ChangeLog.ia64 b/ports/ChangeLog.ia64 index 272f73a..f149373 100644 --- a/ports/ChangeLog.ia64 +++ b/ports/ChangeLog.ia64 @@ -1,3 +1,8 @@ +2013-10-07 Torvald Riegel <triegel@redhat.com> + + [BZ #15215] + * sysdeps/unix/sysv/linux/ia64/nptl/pthread_once.c: Remove file. + 2013-09-22 Carlos O'Donell <carlos@redhat.com> [BZ #15754] diff --git a/ports/ChangeLog.m68k b/ports/ChangeLog.m68k index 4f933c6..0982588 100644 --- a/ports/ChangeLog.m68k +++ b/ports/ChangeLog.m68k @@ -1,3 +1,8 @@ +2013-10-07 Torvald Riegel <triegel@redhat.com> + + [BZ #15215] + * sysdeps/unix/sysv/linux/m68k/nptl/pthread_once.c: Remove file. + 2013-09-20 Andreas Schwab <schwab@linux-m68k.org> * sysdeps/m68k/ffs.c (__ffs): Define as hidden. diff --git a/ports/ChangeLog.mips b/ports/ChangeLog.mips index eaa6ade..f52a80d 100644 --- a/ports/ChangeLog.mips +++ b/ports/ChangeLog.mips @@ -1,3 +1,8 @@ +2013-10-07 Torvald Riegel <triegel@redhat.com> + + [BZ #15215] + * sysdeps/unix/sysv/linux/mips/nptl/pthread_once.c: Remove file. + 2013-09-26 Steve Ellcey <sellcey@mips.com> [BZ #15632] diff --git a/ports/ChangeLog.tile b/ports/ChangeLog.tile index dc2e7e4..3821ded 100644 --- a/ports/ChangeLog.tile +++ b/ports/ChangeLog.tile @@ -1,3 +1,8 @@ +2013-10-07 Torvald Riegel <triegel@redhat.com> + + [BZ #15215] + * sysdeps/unix/sysv/linux/tile/nptl/pthread_once.c: Remove file. + 2013-09-22 Carlos O'Donell <carlos@redhat.com> [BZ #15754] diff --git a/ports/sysdeps/unix/sysv/linux/aarch64/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/aarch64/nptl/pthread_once.c deleted file mode 100644 index 2fb9b85..0000000 --- a/ports/sysdeps/unix/sysv/linux/aarch64/nptl/pthread_once.c +++ /dev/null @@ -1,90 +0,0 @@ -/* Copyright (C) 2004-2013 Free Software Foundation, Inc. - - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public License as - published by the Free Software Foundation; either version 2.1 of the - License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - -unsigned long int __fork_generation attribute_hidden; - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - -int -__pthread_once (pthread_once_t *once_control, void (*init_routine) (void)) -{ - for (;;) - { - int oldval; - int newval; - - /* Pseudo code: - newval = __fork_generation | 1; - oldval = *once_control; - if ((oldval & 2) == 0) - *once_control = newval; - Do this atomically. - */ - do - { - newval = __fork_generation | 1; - oldval = *once_control; - if (oldval & 2) - break; - } while (atomic_compare_and_exchange_val_acq (once_control, newval, oldval) != oldval); - - /* Check if the initializer has already been done. */ - if ((oldval & 2) != 0) - return 0; - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) == 0) - break; - - /* Check whether the initializer execution was interrupted by a fork. */ - if (oldval != newval) - break; - - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, oldval, LLL_PRIVATE); - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - /* Say that the initialisation is done. */ - *once_control = __fork_generation | 2; - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c deleted file mode 100644 index 0c897ab..0000000 --- a/ports/sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c +++ /dev/null @@ -1,89 +0,0 @@ -/* Copyright (C) 2004-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - -unsigned long int __fork_generation attribute_hidden; - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - -int -__pthread_once (pthread_once_t *once_control, void (*init_routine) (void)) -{ - for (;;) - { - int oldval; - int newval; - - /* Pseudo code: - newval = __fork_generation | 1; - oldval = *once_control; - if ((oldval & 2) == 0) - *once_control = newval; - Do this atomically. - */ - do - { - newval = __fork_generation | 1; - oldval = *once_control; - if (oldval & 2) - break; - } while (atomic_compare_and_exchange_val_acq (once_control, newval, oldval) != oldval); - - /* Check if the initializer has already been done. */ - if ((oldval & 2) != 0) - return 0; - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) == 0) - break; - - /* Check whether the initializer execution was interrupted by a fork. */ - if (oldval != newval) - break; - - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, oldval, LLL_PRIVATE); - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - /* Say that the initialisation is done. */ - *once_control = __fork_generation | 2; - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/hppa/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/hppa/nptl/pthread_once.c deleted file mode 100644 index b032f29..0000000 --- a/ports/sysdeps/unix/sysv/linux/hppa/nptl/pthread_once.c +++ /dev/null @@ -1,93 +0,0 @@ -/* Copyright (C) 2003-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_private_futex_wake (once_control, INT_MAX); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_private_futex_wait (once_control, newval); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_private_futex_wake (once_control, INT_MAX); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/ia64/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/ia64/nptl/pthread_once.c deleted file mode 100644 index 5879f44..0000000 --- a/ports/sysdeps/unix/sysv/linux/ia64/nptl/pthread_once.c +++ /dev/null @@ -1,93 +0,0 @@ -/* Copyright (C) 2003-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, newval, LLL_PRIVATE); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/m68k/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/m68k/nptl/pthread_once.c deleted file mode 100644 index 8a1f307..0000000 --- a/ports/sysdeps/unix/sysv/linux/m68k/nptl/pthread_once.c +++ /dev/null @@ -1,90 +0,0 @@ -/* Copyright (C) 2010-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Maxim Kuvyrkov <maxim@codesourcery.com>, 2010. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - -unsigned long int __fork_generation attribute_hidden; - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - -int -__pthread_once (pthread_once_t *once_control, void (*init_routine) (void)) -{ - for (;;) - { - int oldval; - int newval; - - /* Pseudo code: - newval = __fork_generation | 1; - oldval = *once_control; - if ((oldval & 2) == 0) - *once_control = newval; - Do this atomically. - */ - do - { - newval = __fork_generation | 1; - oldval = *once_control; - if (oldval & 2) - break; - } while (atomic_compare_and_exchange_val_acq (once_control, newval, oldval) != oldval); - - /* Check if the initializer has already been done. */ - if ((oldval & 2) != 0) - return 0; - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) == 0) - break; - - /* Check whether the initializer execution was interrupted by a fork. */ - if (oldval != newval) - break; - - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, oldval, LLL_PRIVATE); - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - /* Say that the initialisation is done. */ - *once_control = __fork_generation | 2; - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/mips/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/mips/nptl/pthread_once.c deleted file mode 100644 index 97f1ddf..0000000 --- a/ports/sysdeps/unix/sysv/linux/mips/nptl/pthread_once.c +++ /dev/null @@ -1,93 +0,0 @@ -/* Copyright (C) 2003-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include "pthreadP.h" -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, newval, LLL_PRIVATE); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) diff --git a/ports/sysdeps/unix/sysv/linux/tile/nptl/pthread_once.c b/ports/sysdeps/unix/sysv/linux/tile/nptl/pthread_once.c deleted file mode 100644 index 68456f0..0000000 --- a/ports/sysdeps/unix/sysv/linux/tile/nptl/pthread_once.c +++ /dev/null @@ -1,94 +0,0 @@ -/* Copyright (C) 2011-2013 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Chris Metcalf <cmetcalf@tilera.com>, 2011. - Based on work contributed by Jakub Jelinek <jakub@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library. If not, see - <http://www.gnu.org/licenses/>. */ - -#include <nptl/pthreadP.h> -#include <lowlevellock.h> - - -unsigned long int __fork_generation attribute_hidden; - - -static void -clear_once_control (void *arg) -{ - pthread_once_t *once_control = (pthread_once_t *) arg; - - *once_control = 0; - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); -} - - -int -__pthread_once (once_control, init_routine) - pthread_once_t *once_control; - void (*init_routine) (void); -{ - while (1) - { - int oldval, val, newval; - - val = *once_control; - do - { - /* Check if the initialized has already been done. */ - if ((val & 2) != 0) - return 0; - - oldval = val; - newval = (oldval & 3) | __fork_generation | 1; - val = atomic_compare_and_exchange_val_acq (once_control, newval, - oldval); - } - while (__builtin_expect (val != oldval, 0)); - - /* Check if another thread already runs the initializer. */ - if ((oldval & 1) != 0) - { - /* Check whether the initializer execution was interrupted - by a fork. */ - if (((oldval ^ newval) & -4) == 0) - { - /* Same generation, some other thread was faster. Wait. */ - lll_futex_wait (once_control, newval, LLL_PRIVATE); - continue; - } - } - - /* This thread is the first here. Do the initialization. - Register a cleanup handler so that in case the thread gets - interrupted the initialization can be restarted. */ - pthread_cleanup_push (clear_once_control, once_control); - - init_routine (); - - pthread_cleanup_pop (0); - - - /* Add one to *once_control. */ - atomic_increment (once_control); - - /* Wake up all other threads. */ - lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE); - break; - } - - return 0; -} -weak_alias (__pthread_once, pthread_once) -hidden_def (__pthread_once) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2013-10-07 21:53 ` Torvald Riegel @ 2014-03-31 11:44 ` Will Newton 2014-03-31 20:09 ` Torvald Riegel 0 siblings, 1 reply; 21+ messages in thread From: Will Newton @ 2014-03-31 11:44 UTC (permalink / raw) To: Torvald Riegel Cc: Joseph S. Myers, Carlos O'Donell, GLIBC Devel, libc-ports On 7 October 2013 22:53, Torvald Riegel <triegel@redhat.com> wrote: > On Mon, 2013-10-07 at 16:04 +0000, Joseph S. Myers wrote: >> I have no comments on the substance of this patch, but note that ports/ >> has a separate ChangeLog file for each architecture. > > Sorry. The attached patch now has separate ChangeLog entries for each of > the affected archs. There seems to be a significant performance delta on aarch64: Old code: "pthread_once": { "": { "duration": 9.29471e+09, "iterations": 1.10667e+09, "max": 24.54, "min": 8.38, "mean": 8.39882 New code: "pthread_once": { "": { "duration": 9.72366e+09, "iterations": 4.33843e+08, "max": 30.86, "min": 22.38, "mean": 22.4128 And also ARM: Old code: "pthread_once": { "": { "duration": 8.38662e+09, "iterations": 6.6695e+08, "max": 35.292, "min": 12.416, "mean": 12.5746 New code: "pthread_once": { "": { "duration": 9.26424e+09, "iterations": 3.07574e+08, "max": 86.125, "min": 28.875, "mean": 30.1204 It would be nice to understand the source of this variation. I can put it on my todo list but I can't promise I will be able to look at it any time soon. -- Will Newton Toolchain Working Group, Linaro ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Unify pthread_once (bug 15215) 2014-03-31 11:44 ` Will Newton @ 2014-03-31 20:09 ` Torvald Riegel 0 siblings, 0 replies; 21+ messages in thread From: Torvald Riegel @ 2014-03-31 20:09 UTC (permalink / raw) To: Will Newton; +Cc: Joseph S. Myers, Carlos O'Donell, GLIBC Devel, libc-ports On Mon, 2014-03-31 at 12:44 +0100, Will Newton wrote: > On 7 October 2013 22:53, Torvald Riegel <triegel@redhat.com> wrote: > > On Mon, 2013-10-07 at 16:04 +0000, Joseph S. Myers wrote: > >> I have no comments on the substance of this patch, but note that ports/ > >> has a separate ChangeLog file for each architecture. > > > > Sorry. The attached patch now has separate ChangeLog entries for each of > > the affected archs. > > There seems to be a significant performance delta on aarch64: > > Old code: > > "pthread_once": { > "": { > "duration": 9.29471e+09, "iterations": 1.10667e+09, "max": 24.54, > "min": 8.38, "mean": 8.39882 > > New code: > > "pthread_once": { > "": { > "duration": 9.72366e+09, "iterations": 4.33843e+08, "max": 30.86, > "min": 22.38, "mean": 22.4128 > > And also ARM: > > Old code: > > "pthread_once": { > "": { > "duration": 8.38662e+09, "iterations": 6.6695e+08, "max": 35.292, > "min": 12.416, "mean": 12.5746 > > New code: > > "pthread_once": { > "": { > "duration": 9.26424e+09, "iterations": 3.07574e+08, "max": 86.125, > "min": 28.875, "mean": 30.1204 > > It would be nice to understand the source of this variation. I can put > it on my todo list but I can't promise I will be able to look at it > any time soon. The ARM code (or, the code in general) was lacking a memory barrier. Here's what I wrote in the email that first sent the patch: > > Both I1 and I2 were missing acquire MO on the very first load of > > once_control. This needs to synchronize with the release MO on setting > > the state to init-finished, so without it it's not guaranteed to work > > either. > > Note that this will make a call to pthread_once that doesn't need to > > actually run the init routine slightly slower due to the additional > > acquire barrier. If you're really concerned about this overhead, speak > > up. There are ways to avoid it, but it comes with additional complexity > > and bookkeeping. One way to try to work around the overhead is to keep thread-local state that checks via a counter or such whether a particular thread already used an acquire barrier on a load to this pthread_once previously. This will help only if the same pthread_once is called several times from the same thread -- it won't help if a couple of threads all just call a particular pthread_once a few times. Also, because we can't keep thread-local state for each pthread_once, we'd need to group them all -- in return, this will lead to some synchronization between the initialization phases of unrelated pthread_once instances. ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2014-03-31 20:09 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-05-08 14:44 [PATCH] Unify pthread_once (bug 15215) Torvald Riegel 2013-05-08 17:51 ` Rich Felker 2013-05-08 20:47 ` Torvald Riegel 2013-05-08 21:25 ` Rich Felker 2013-05-09 8:39 ` Torvald Riegel 2013-05-09 14:02 ` Rich Felker 2013-05-09 15:14 ` Torvald Riegel 2013-05-09 15:56 ` Rich Felker 2013-05-10 8:31 ` Torvald Riegel 2013-05-10 13:22 ` Rich Felker 2013-05-23 4:15 ` Carlos O'Donell 2013-08-26 12:50 ` Ondřej Bílka 2013-08-26 16:45 ` Rich Felker 2013-08-26 18:41 ` Ondřej Bílka 2013-08-27 2:29 ` Rich Felker 2013-10-06 0:20 ` Torvald Riegel 2013-10-06 21:41 ` Torvald Riegel 2013-10-07 16:04 ` Joseph S. Myers 2013-10-07 21:53 ` Torvald Riegel 2014-03-31 11:44 ` Will Newton 2014-03-31 20:09 ` Torvald Riegel
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).