public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/104442] New: atomic<T>::wait incorrectly loops in case of spurious notification when __waiter is shared
@ 2022-02-08 16:31 poulhies at adacore dot com
  2022-02-09  0:27 ` [Bug libstdc++/104442] " rodgertq at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: poulhies at adacore dot com @ 2022-02-08 16:31 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104442

            Bug ID: 104442
           Summary: atomic<T>::wait incorrectly loops in case of spurious
                    notification when __waiter is shared
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: poulhies at adacore dot com
  Target Milestone: ---

Created attachment 52377
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52377&action=edit
patch fixing the issue

We are observing a deadlock in 100334.cc on VxWorks.

This is caused by the following code:

        template<typename _Tp, typename _ValFn>
          void
          _M_do_wait_v(_Tp __old, _ValFn __vfn)
          {
            __platform_wait_t __val;
            if (__base_type::_M_do_spin_v(__old, __vfn, __val))
              return;

            do
              {
                __base_type::_M_w._M_do_wait(__base_type::_M_addr, __val);
              }
            while (__detail::__atomic_compare(__old, __vfn()));
          }

When several threads share the waiter (as in 100334.cc), notify_one() wakes
all threads blocked in the _M_do_wait() above. The thread whose data changed
exits the loop correctly, but the others loop back into _M_do_wait() with the
same arguments. Because the waiter's value has changed since the previous
iteration but __val has not, the call returns immediately (as if it had
detected a notification) and the loop continues, so the wait becomes a busy
loop.
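
To illustrate the stale-value effect in isolation, here is a minimal
single-threaded sketch. It is a toy model, not libstdc++ code: SharedWord,
would_block() and the two counting helpers are hypothetical names, and the
integer "version" only stands in for the futex-like word that the shared
waiter compares against __val.

```cpp
// Toy stand-in for the word shared by every atomic that maps to one waiter.
// (Hypothetical type; the real waiter in libstdc++ is more involved.)
struct SharedWord {
    int version = 0;

    // Any notification on any atomic sharing this waiter bumps the word.
    void notify() { ++version; }

    // A futex-style wait only sleeps while the word still equals `expected`.
    bool would_block(int expected) const { return version == expected; }
};

// Loop shape from the report: `captured` is taken once and never refreshed.
// Returns how many wait calls return immediately, capped at `limit`.
int immediate_returns_stale(const SharedWord& w, int captured, int limit) {
    int n = 0;
    while (n < limit && !w.would_block(captured))
        ++n;                        // same stale argument on every iteration
    return n;
}

// Same loop, but the captured value is re-read after each immediate return.
int immediate_returns_refreshed(const SharedWord& w, int captured, int limit) {
    int n = 0;
    while (n < limit && !w.would_block(captured)) {
        ++n;
        captured = w.version;       // refresh, so the next wait can sleep
    }
    return n;
}
```

In this model, after a single unrelated notify() the stale variant returns
immediately on every iteration up to the cap, while the refreshed variant
returns immediately once and would then sleep.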

On GNU/Linux, the test passes because the main thread is still scheduled and
does a .store(1) on all atomics, unblocking all the busy-waiting threads (but
the busy-waiting threads can still be observed with gdb).

On VxWorks, the main thread is never scheduled again (I think there's no
preemption at the same priority level) and the busy-wait starves the system.

The attached patch is a possible fix. It moves the spin() call inside the loop,
so that __val is updated at every iteration. A better fix is probably possible
but may require some refactoring (a bit more than I'm comfortable with).
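
The difference between the two loop shapes can be sketched with a small
single-threaded model. This is not the attached patch and does not use the
libstdc++ internals: ToyWaiter, on_block, wait_stale() and wait_refreshing()
are hypothetical names, and "sleeping" is modelled by running a callback that
represents the other threads getting CPU time.

```cpp
#include <functional>

// Hypothetical model of a waiter shared by several atomics.
struct ToyWaiter {
    int version = 0;                 // futex-like word bumped by every notify
    std::function<void()> on_block;  // what other threads do once we sleep

    void notify() { ++version; }

    // Returns at once if the word no longer equals `expected`; otherwise
    // "sleeps", which here means letting the other threads run.
    void wait(int expected) {
        if (version != expected)
            return;
        if (on_block)
            on_block();
    }
};

// Original loop shape: the snapshot is taken once, before the loop.
// Returns the iteration count, or -1 if `max_iters` is exceeded.
int wait_stale(ToyWaiter& w, const int& value, int old, int snapshot,
               int max_iters) {
    for (int i = 1; i <= max_iters; ++i) {
        w.wait(snapshot);            // stale snapshot: never sleeps again
        if (value != old)
            return i;
    }
    return -1;
}

// Loop shape with the snapshot refreshed at every iteration.
int wait_refreshing(ToyWaiter& w, const int& value, int old, int max_iters) {
    for (int i = 1; i <= max_iters; ++i) {
        int snapshot = w.version;    // re-read before each wait
        if (value != old)
            return i;                // value already changed, no wait needed
        w.wait(snapshot);
        if (value != old)
            return i;
    }
    return -1;
}
```

With on_block set to a callback that stores the new value and notifies, and
one unrelated notify() issued first, wait_stale() in this model exhausts its
iteration budget without the callback ever running, which mirrors the
starvation described for VxWorks. wait_refreshing() sleeps on its first
iteration, lets the callback run, and then returns.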

I've checked the patch for regressions on gcc-master for x86_64. It also fixes
the test on gcc-11 for aarch64-vxworks7.


Thread overview: 10+ messages
2022-02-08 16:31 [Bug libstdc++/104442] New: atomic<T>::wait incorrectly loops in case of spurious notification when __waiter is shared poulhies at adacore dot com
2022-02-09  0:27 ` [Bug libstdc++/104442] " rodgertq at gcc dot gnu.org
2022-02-09  8:51 ` poulhies at adacore dot com
2022-02-09 10:16 ` redi at gcc dot gnu.org
2022-02-09 15:29 ` rodgertq at gcc dot gnu.org
2022-02-09 20:31 ` cvs-commit at gcc dot gnu.org
2022-02-09 20:32 ` cvs-commit at gcc dot gnu.org
2022-02-10  8:52 ` poulhies at adacore dot com
2022-02-10 17:57 ` rodgertq at gcc dot gnu.org
2022-02-11  8:16 ` poulhies at adacore dot com
