CCing Tom's current address, as he's not @redhat.com now. On Mon, 11 Dec 2023, 19:24 Nate Eldredge, wrote: > On Mon, 11 Dec 2023, Nate Eldredge wrote: > > > To fix, we need something like `__args._M_old = __val;` inside the loop > in > > __atomic_wait_address(), so that we always wait on the exact value that > the > > predicate __pred() rejected. Again, there are similar instances in > > atomic_timed_wait.h. > > Thinking through this, there's another problem. The main loop in > __atomic_wait_address() would then be: > > while (!__pred(__val)) > { > __args._M_old = __val; > __detail::__wait_impl(__wait_addr, &__args); > __val = __vfn(); > } > > The idea being that we only call __wait_impl to wait on a value that the > predicate said was unacceptable. But looking for instance at its caller > __atomic_semaphore::_M_acquire() in bits/semaphore_base.h, the predicate > passed in is _S_do_try_acquire(), whose code is: > > _S_do_try_acquire(__detail::__platform_wait_t* __counter, > __detail::__platform_wait_t __old) noexcept > { > if (__old == 0) > return false; > > return __atomic_impl::compare_exchange_strong(__counter, > __old, __old - 1, > memory_order::acquire, > > memory_order::relaxed); > } > > It returns false if the value passed in was unacceptable (i.e. zero), *or* > if it was nonzero (let's say 1) but the compare_exchange failed because > another thread swooped in and modified the semaphore counter. In that > latter case, __atomic_wait_address() would pass 1 to __wait_impl(), which > is likewise bad. If the counter is externally changed back to 1 just > before we call __platform_wait (that's the futex call), we would go to > sleep waiting on a semaphore that is already available: deadlock. > > I guess there's a couple ways to fix it. > > You could have the "predicate" callback instead return a tri-state value: > "all done, stop waiting" (like current true), "value passed is not > acceptable" (like current false), and "value was acceptable but something > else went wrong". Only the second case should result in calling > __wait_impl(). In the third case, __atomic_wait_address() should > just reload the value (using __vfn()) and loop again. > > Or, make the callback __pred() a pure predicate that only tests its input > value for acceptability, without any other side effects. Then have > __atomic_wait_address() simply return as soon as __pred(__val) returns > true. It would be up to the caller to actually decrement the semaphore or > whatever, and to call __atomic_wait_address() again if this fails. In > that case, __atomic_wait_address should probably return the final value > that was read, so the caller can immediately do something like a > compare-exchange using it, and not have to do an additional load and > predicate test. > > Or, make __pred() a pure predicate as before, and give > __atomic_wait_address yet one more callback function argument, call it > __taker(), whose job is to acquire the semaphore etc, and have > __atomic_wait_address call it after __pred(__val) returns true. > > -- > Nate Eldredge > nate@thatsmathematics.com > >