I need to look at this a bit more (and not on my phone, at lunch). Ultimately, C++26 expects to add predicate waits and returning a ‘tri-state’ result isn’t something that’s been considered or likely to be approved. On Mon, Dec 11, 2023 at 12:18 PM Jonathan Wakely wrote: > CCing Tom's current address, as he's not @redhat.com now. > > On Mon, 11 Dec 2023, 19:24 Nate Eldredge, > wrote: > >> On Mon, 11 Dec 2023, Nate Eldredge wrote: >> >> > To fix, we need something like `__args._M_old = __val;` inside the loop >> in >> > __atomic_wait_address(), so that we always wait on the exact value that >> the >> > predicate __pred() rejected. Again, there are similar instances in >> > atomic_timed_wait.h. >> >> Thinking through this, there's another problem. The main loop in >> __atomic_wait_address() would then be: >> >> while (!__pred(__val)) >> { >> __args._M_old = __val; >> __detail::__wait_impl(__wait_addr, &__args); >> __val = __vfn(); >> } >> >> The idea being that we only call __wait_impl to wait on a value that the >> predicate said was unacceptable. But looking for instance at its caller >> __atomic_semaphore::_M_acquire() in bits/semaphore_base.h, the predicate >> passed in is _S_do_try_acquire(), whose code is: >> >> _S_do_try_acquire(__detail::__platform_wait_t* __counter, >> __detail::__platform_wait_t __old) noexcept >> { >> if (__old == 0) >> return false; >> >> return __atomic_impl::compare_exchange_strong(__counter, >> __old, __old - 1, >> >> memory_order::acquire, >> >> memory_order::relaxed); >> } >> >> It returns false if the value passed in was unacceptable (i.e. zero), >> *or* >> if it was nonzero (let's say 1) but the compare_exchange failed because >> another thread swooped in and modified the semaphore counter. In that >> latter case, __atomic_wait_address() would pass 1 to __wait_impl(), which >> is likewise bad. If the counter is externally changed back to 1 just >> before we call __platform_wait (that's the futex call), we would go to >> sleep waiting on a semaphore that is already available: deadlock. >> >> I guess there's a couple ways to fix it. >> >> You could have the "predicate" callback instead return a tri-state value: >> "all done, stop waiting" (like current true), "value passed is not >> acceptable" (like current false), and "value was acceptable but something >> else went wrong". Only the second case should result in calling >> __wait_impl(). In the third case, __atomic_wait_address() should >> just reload the value (using __vfn()) and loop again. >> >> Or, make the callback __pred() a pure predicate that only tests its input >> value for acceptability, without any other side effects. Then have >> __atomic_wait_address() simply return as soon as __pred(__val) returns >> true. It would be up to the caller to actually decrement the semaphore >> or >> whatever, and to call __atomic_wait_address() again if this fails. In >> that case, __atomic_wait_address should probably return the final value >> that was read, so the caller can immediately do something like a >> compare-exchange using it, and not have to do an additional load and >> predicate test. >> >> Or, make __pred() a pure predicate as before, and give >> __atomic_wait_address yet one more callback function argument, call it >> __taker(), whose job is to acquire the semaphore etc, and have >> __atomic_wait_address call it after __pred(__val) returns true. >> >> -- >> Nate Eldredge >> nate@thatsmathematics.com >> >>