public inbox for gcc-bugs@sourceware.org
* [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
@ 2023-05-28 21:11 amy at amyspark dot me
  2023-05-28 21:44 ` [Bug libstdc++/110016] " pinskia at gcc dot gnu.org
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: amy at amyspark dot me @ 2023-05-28 21:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

            Bug ID: 110016
           Summary: [12/13/14] Possible miscodegen when inlining
                    std::condition_variable::wait predicate causes
                    deadlock
           Product: gcc
           Version: 12.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: amy at amyspark dot me
  Target Milestone: ---

Created attachment 55181
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55181&action=edit
Minimum test case to reproduce the deadlock

Hi all,

This is to report a possible codegen issue when inlining a lambda predicate for
std::condition_variable::wait. We've verified this to happen with the following
versions:

- g++-8 (Homebrew GCC 8.5.0) 8.5.0
- g++.exe (Rev6, Built by MSYS2 project) 13.1.0 (both UCRT64 and MINGW64)
- g++
(Compiler-Explorer-Build-gcc-4579954f25020f0b39361ab6ec0c8876fda27041-binutils-2.40)
14.0.0 20230522 (experimental)

The deadlock seems to happen with 100% certainty on GCC 12.2.1 if one enables
ThreadSanitizer; otherwise it happens sporadically in CI.

I packaged a reduced version of the test suite:
https://godbolt.org/z/fj8rnrbo7, a copy of which you'll find attached to this
report. Build with `-std=c++17 -pthread -O2 -fsanitize=thread`.

In all cases, once the deadlock is hit (wait ~3 seconds under GDB), the
"finished" atomic boolean and the "workQueue" are correctly flagged as true and
empty, respectively; however, the thread still waits on the condition variable
indefinitely. This can easily be worked around by blocking the inlining, e.g.
by turning the lambda into a std::bind instance, as sketched below.
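
For illustration, the shape of the workaround (a sketch, not the actual
test-suite code; `shouldWake` is a hypothetical member function returning
bool, and std::bind needs <functional>):

        // Deadlock-prone in our builds: the lambda predicate gets inlined
        // into std::condition_variable::wait.
        haveWork.wait(lock, [this]() noexcept { return finished || !work.empty(); });

        // Workaround: routing the predicate through std::bind blocked the
        // inlining for us.
        haveWork.wait(lock, std::bind(&threadPool_t::shouldWake, this));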

The complete code of the library where we reproduced this is available here:
https://github.com/bad-alloc-heavy-industries/substrate/tree/375db811308ad7414771dbde9af4efa7aa393ca8.
You can build it with `meson setup build -Dcpp_std=c++17 -Db_sanitize=thread`
and run the test with `meson test -C build`.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
@ 2023-05-28 21:44 ` pinskia at gcc dot gnu.org
  2023-05-29  0:47 ` [Bug libstdc++/110016] " amy at amyspark dot me
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-28 21:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c++                         |libstdc++

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I doubt this is a code generation issue but rather either a libstdc++ issue or
a problem in the code itself.

If anything, inlining might expose a race condition that was already in the
code more often than before.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
  2023-05-28 21:44 ` [Bug libstdc++/110016] " pinskia at gcc dot gnu.org
@ 2023-05-29  0:47 ` amy at amyspark dot me
  2023-05-29  1:03 ` pinskia at gcc dot gnu.org
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: amy at amyspark dot me @ 2023-05-29  0:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

--- Comment #2 from Amyspark <amy at amyspark dot me> ---
We've seen failures on both Windows (all ABI flavors) and macOS, but only when
compiled with GCC -- AppleClang, Clang, and MSVC (in its three flavors) all
work without issue. So I'm doubtful it's a logic issue, especially given that
preventing the compiler from inlining void Predicate::operator() into
std::condition_variable::wait seems to be enough to work around it.

Re PR 98033 -- it looks somewhat similar, though as explained earlier, this may
also affect non-Linux platforms, i.e. any target where GCC relies on
(win)pthreads.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
  2023-05-28 21:44 ` [Bug libstdc++/110016] " pinskia at gcc dot gnu.org
  2023-05-29  0:47 ` [Bug libstdc++/110016] " amy at amyspark dot me
@ 2023-05-29  1:03 ` pinskia at gcc dot gnu.org
  2023-05-29  1:05 ` pinskia at gcc dot gnu.org
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-29  1:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
When you say you compiled with clang, did you use libstdc++ or libc++?

Did you try adding gnu::always_inline attribute on the lambda to see if it
fails there too?

Again, inlining should not have an effect here unless there is some kind of
race condition happening.
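
For concreteness, one way to pin that attribute is a named functor equivalent
to the lambda (a sketch; Predicate and its members are hypothetical, not the
reporter's actual code):

        struct Predicate
        {
                threadPool_t *pool;
                // In-class definitions are implicitly inline, so
                // always_inline applies cleanly and forces the body into
                // every caller.
                [[gnu::always_inline]] bool operator()() const noexcept
                { return pool->finished || !pool->work.empty(); }
        };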

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
                   ` (2 preceding siblings ...)
  2023-05-29  1:03 ` pinskia at gcc dot gnu.org
@ 2023-05-29  1:05 ` pinskia at gcc dot gnu.org
  2023-05-29  1:05 ` pinskia at gcc dot gnu.org
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-29  1:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Did you try on some other target than x86 for gcc?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
                   ` (3 preceding siblings ...)
  2023-05-29  1:05 ` pinskia at gcc dot gnu.org
@ 2023-05-29  1:05 ` pinskia at gcc dot gnu.org
  2023-05-29  1:20 ` pinskia at gcc dot gnu.org
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-29  1:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Did you try on some other target than x86 for gcc?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
                   ` (4 preceding siblings ...)
  2023-05-29  1:05 ` pinskia at gcc dot gnu.org
@ 2023-05-29  1:20 ` pinskia at gcc dot gnu.org
  2023-05-29  1:27 ` pinskia at gcc dot gnu.org
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-29  1:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #5)
> Did you try on some other target than x86 for gcc?

To answer my own question: it fails on aarch64-linux-gnu as well. So this makes
it more likely a library issue (maybe glibc ...)


Thread 26 (Thread 0xffffe5f3ea10 (LWP 1003246)):
#0  0x0000fffff6acd22c in futex_wait_cancelable (private=<optimized out>,
expected=0, futex_word=0xfffff3103c64) at
../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0xfffff3103c08,
cond=0xfffff3103c38) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0xfffff3103c38, mutex=0xfffff3103c08) at
pthread_cond_wait.c:655
#3  0x0000fffff6efc2e4 in __tsan::call_pthread_cancel_with_cleanup
(fn=fn@entry=0xfffff6eafd00 <_FUN(void*)>, cleanup=cleanup@entry=0xfffff6eb5364
<_FUN(void*)>, arg=arg@entry=0xffffe5f3e0d0) at
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libsanitizer/tsan/tsan_platform_linux.cpp:493
#4  0x0000fffff6ed4194 in cond_wait<__interceptor_pthread_cond_wait(void*,
void*)::<lambda()> > (m=0xfffff3103c08, c=0xfffff3103c38, fn=...,
si=0xffffe5f3e0b0, pc=281474824487080, thr=<optimized out>) at
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1259
#5  __interceptor_pthread_cond_wait (c=<optimized out>, m=0xfffff3103c08) at
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1270
#6  0x0000000000403850 in
std::condition_variable::wait<substrate::threadPool_t<bool
()>::waitWork()::{lambda()#1}>(std::unique_lock<std::mutex>&,
substrate::threadPool_t<bool ()>::waitWork()::{lambda()#1}) (__p=...,
__lock=..., this=0xfffff3103c38)
    at /home/ubuntu/upstream-gcc/include/c++/14.0.0/bits/atomic_base.h:503
#7  substrate::threadPool_t<bool ()>::waitWork() (this=0xfffff3103c00) at
t.cc:287
#8  substrate::threadPool_t<bool ()>::workerThread(unsigned long)
(processor=<optimized out>, this=0xfffff3103c00) at t.cc:312
#9  substrate::threadPool_t<bool ()>::threadPool_t(bool
(*)())::{lambda(unsigned long)#1}::operator()(unsigned long) const
(currentProcessor=<error reading variable: dwarf2_find_location_expression:
Corrupted DWARF expression.>, __closure=<optimized out>) at t.cc:338
#10 std::__invoke_impl<void, substrate::threadPool_t<bool
()>::threadPool_t(bool (*)())::{lambda(unsigned long)#1}, unsigned
long>(std::__invoke_other, substrate::threadPool_t<bool ()>::threadPool_t(bool
(*)())::{lambda(unsigned long)#1}&&, unsigned long&&) (__f=...)
    at /home/ubuntu/upstream-gcc/include/c++/14.0.0/bits/invoke.h:61
#11 std::__invoke<substrate::threadPool_t<bool ()>::threadPool_t(bool
(*)())::{lambda(unsigned long)#1}, unsigned long>(std::__invoke_result&&,
(substrate::threadPool_t<bool ()>::threadPool_t(bool (*)())::{lambda(unsigned
long)#1}&&)...) (__fn=...) at
/home/ubuntu/upstream-gcc/include/c++/14.0.0/bits/invoke.h:96
#12 std::thread::_Invoker<std::tuple<substrate::threadPool_t<bool
()>::threadPool_t(bool (*)())::{lambda(unsigned long)#1}, unsigned long>
>::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (this=<optimized out>) at
/home/ubuntu/upstream-gcc/include/c++/14.0.0/bits/std_thread.h:292
#13 std::thread::_Invoker<std::tuple<substrate::threadPool_t<bool
()>::threadPool_t(bool (*)())::{lambda(unsigned long)#1}, unsigned long>
>::operator()() (this=<optimized out>) at
/home/ubuntu/upstream-gcc/include/c++/14.0.0/bits/std_thread.h:299
#14
std::thread::_State_impl<std::thread::_Invoker<std::tuple<substrate::threadPool_t<bool
()>::threadPool_t(bool (*)())::{lambda(unsigned long)#1}, unsigned long> >
>::_M_run() (this=<optimized out>) at
/home/ubuntu/upstream-gcc/include/c++/14.0.0/bits/std_thread.h:244
#15 0x0000fffff6ced74c in std::execute_native_thread_routine
(__p=0xfffff56004a0) at
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libstdc++-v3/src/c++11/thread.cc:104
#16 0x0000fffff6eaf63c in __tsan_thread_start_func (arg=0xfffffffff730) at
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1038
#17 0x0000fffff6ac7088 in start_thread (arg=0xfffffffff66f) at
pthread_create.c:463
#18 0x0000fffff6a304ec in thread_start () at
../sysdeps/unix/sysv/linux/aarch64/clone.S:78

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
                   ` (5 preceding siblings ...)
  2023-05-29  1:20 ` pinskia at gcc dot gnu.org
@ 2023-05-29  1:27 ` pinskia at gcc dot gnu.org
  2023-05-29  1:34 ` pinskia at gcc dot gnu.org
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-29  1:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I can reproduce the failure on aarch64-linux-gnu on the trunk with `-std=c++17
-pthread -O2 -fsanitize=thread -fno-inline`, so your theory that inlining is
causing the issue is incorrect.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
                   ` (6 preceding siblings ...)
  2023-05-29  1:27 ` pinskia at gcc dot gnu.org
@ 2023-05-29  1:34 ` pinskia at gcc dot gnu.org
  2023-05-29  1:47 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-29  1:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Here is the backtrace in that case:
(gdb) bt
#0  0x0000fffff6acd22c in futex_wait_cancelable (private=<optimized out>,
expected=0, futex_word=0xfffff3103c64) at
../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0xfffff3103c08,
cond=0xfffff3103c38) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0xfffff3103c38, mutex=0xfffff3103c08) at
pthread_cond_wait.c:655
#3  0x0000fffff6efc2e4 in __tsan::call_pthread_cancel_with_cleanup
(fn=fn@entry=0xfffff6eafd00 <_FUN(void*)>, cleanup=cleanup@entry=0xfffff6eb5364
<_FUN(void*)>, arg=arg@entry=0xffffe5f3dff0) at
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libsanitizer/tsan/tsan_platform_linux.cpp:493
#4  0x0000fffff6ed4194 in cond_wait<__interceptor_pthread_cond_wait(void*,
void*)::<lambda()> > (m=0xfffff3103c08, c=0xfffff3103c38, fn=...,
si=0xffffe5f3dfd0, pc=281474824487080, thr=<optimized out>) at
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1259
#5  __interceptor_pthread_cond_wait (c=<optimized out>, m=0xfffff3103c08) at
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1270
#6  0x00000000004045f4 in
std::condition_variable::wait<substrate::threadPool_t<bool
()>::waitWork()::{lambda()#1}>(std::unique_lock<std::mutex>&,
substrate::threadPool_t<bool ()>::waitWork()::{lambda()#1})
(this=this@entry=0xfffff3103c38, __lock=..., __p=__p@entry=...)
    at /home/ubuntu/upstream-gcc/include/c++/14.0.0/condition_variable:102
#7  0x00000000004064e0 in substrate::threadPool_t<bool ()>::waitWork()
(this=this@entry=0xfffff3103c00) at t.cc:282
#8  0x00000000004081e4 in substrate::threadPool_t<bool
()>::workerThread(unsigned long) (this=this@entry=0xfffff3103c00,
processor=<error reading variable: dwarf2_find_location_expression: Corrupted
DWARF expression.>) at t.cc:310
#9  0x0000000000408234 in substrate::threadPool_t<bool ()>::threadPool_t(bool
(*)())::{lambda(unsigned long)#1}::operator()(unsigned long) const
(currentProcessor=<error reading variable: dwarf2_find_location_expression:
Corrupted DWARF expression.>,
    __closure=<error reading variable: dwarf2_find_location_expression:
Corrupted DWARF expression.>) at t.cc:337
#10 0x0000000000408294 in std::__invoke_impl<void, substrate::threadPool_t<bool
()>::threadPool_t(bool (*)())::{lambda(unsigned long)#1}, unsigned
long>(std::__invoke_other, substrate::threadPool_t<bool ()>::threadPool_t(bool
(*)())::{lambda(unsigned long)#1}&&, unsigned long&&) (__f=...)
    at /home/ubuntu/upstream-gcc/include/c++/14.0.0/bits/invoke.h:60
#11 0x00000000004082e0 in std::__invoke<substrate::threadPool_t<bool
()>::threadPool_t(bool (*)())::{lambda(unsigned long)#1}, unsigned
long>(std::__invoke_result&&, (substrate::threadPool_t<bool
()>::threadPool_t(bool (*)())::{lambda(unsigned long)#1}&&)...) (__fn=...)
    at /home/ubuntu/upstream-gcc/include/c++/14.0.0/bits/invoke.h:90
#12 0x00000000004084d4 in
std::thread::_Invoker<std::tuple<substrate::threadPool_t<bool
()>::threadPool_t(bool (*)())::{lambda(unsigned long)#1}, unsigned long>
>::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>)
(this=this@entry=0xfffff56004a8) at
/home/ubuntu/upstream-gcc/include/c++/14.0.0/bits/std_thread.h:291
#13 0x0000000000408504 in
std::thread::_Invoker<std::tuple<substrate::threadPool_t<bool
()>::threadPool_t(bool (*)())::{lambda(unsigned long)#1}, unsigned long>
>::operator()() (this=this@entry=0xfffff56004a8) at
/home/ubuntu/upstream-gcc/include/c++/14.0.0/bits/std_thread.h:295
#14 0x0000000000408534 in
std::thread::_State_impl<std::thread::_Invoker<std::tuple<substrate::threadPool_t<bool
()>::threadPool_t(bool (*)())::{lambda(unsigned long)#1}, unsigned long> >
>::_M_run() (this=0xfffff56004a0) at
/home/ubuntu/upstream-gcc/include/c++/14.0.0/bits/std_thread.h:244
#15 0x0000fffff6ced74c in std::execute_native_thread_routine
(__p=0xfffff56004a0) at
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libstdc++-v3/src/c++11/thread.cc:104
#16 0x0000fffff6eaf63c in __tsan_thread_start_func (arg=0xfffffffff6f0) at
/home/ubuntu/src/upstream-gcc-aarch64/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1038
#17 0x0000fffff6ac7088 in start_thread (arg=0xfffffffff62f) at
pthread_create.c:463
#18 0x0000fffff6a304ec in thread_start () at
../sysdeps/unix/sysv/linux/aarch64/clone.S:78

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
                   ` (7 preceding siblings ...)
  2023-05-29  1:34 ` pinskia at gcc dot gnu.org
@ 2023-05-29  1:47 ` pinskia at gcc dot gnu.org
  2023-05-29  6:43 ` pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-29  1:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So I think this is a bug in your code:

Inside substrate::threadPool_t::finish,
we have:

                        finished = true;
                        haveWork.notify_all();

If I change it to:
                        {
                          std::lock_guard<std::mutex> lock{workMutex};
                          finished = true;
                          haveWork.notify_all();
                        }

Then I don't get a deadlock at all.
As I mentioned, I did think there was a race condition.
Here is what I think happened:
Thread26:                            thread 1
checks finished, still false         sets finished to be true
calls wait                           calls notify_all
...                                  notify_all happens
finally gets into futex_wait syscall ....

And then thread26 never got the notification.

With my change the check for finished has to wait till thread1 lets go of the
mutex (and the other way around).
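
To make that interleaving concrete, here is a self-contained sketch of the
same lost-wakeup pattern (not the reporter's code; the names are made up, and
the window is tiny, so it may take many runs or TSan to actually hang):

        #include <atomic>
        #include <condition_variable>
        #include <mutex>
        #include <thread>

        std::mutex workMutex;
        std::condition_variable haveWork;
        std::atomic<bool> finished{false};

        void consumer()
        {
                std::unique_lock<std::mutex> lock{workMutex};
                // wait() checks the predicate first; if it reads false and
                // the producer's notify_all lands before this thread blocks
                // in the futex, that notification is lost.
                haveWork.wait(lock, [] { return finished.load(); });
        }

        void producer()
        {
                // Racy: the store and the notify happen without holding
                // workMutex, so they can fall into the consumer's gap between
                // its predicate check and the actual sleep.
                finished = true;
                haveWork.notify_all();
        }

        int main()
        {
                std::thread t{consumer};
                producer();
                t.join(); // hangs forever if the wakeup was lost
        }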

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
                   ` (8 preceding siblings ...)
  2023-05-29  1:47 ` pinskia at gcc dot gnu.org
@ 2023-05-29  6:43 ` pinskia at gcc dot gnu.org
  2023-05-29 13:30 ` rachel at rachelmant dot com
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-29  6:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

--- Comment #10 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #9)
> So I think this is a bug in your code:
> 
> Inside substrate::threadPool_t::finish,
> we have:
> 
>                         finished = true;
>                         haveWork.notify_all();
> 
> If I change it to:
>                         {
>                           std::lock_guard<std::mutex> lock{workMutex};
>                           finished = true;
>                           haveWork.notify_all();
>                         }
> 
> Then I don't get a deadlock at all.
> As I mentioned, I did think there was a race condition.
> Here is what I think happened:
> Thread26:                            thread 1
> checks finished, still false         sets finished to be true
> calls wait                           calls notify_all
> ...                                  notify_all happens
> finally gets into futex_wait syscall ....
> 
> And then thread26 never got the notification.
> 
> With my change the check for finished has to wait till thread1 lets go of
> the mutex (and the other way around).

I should note that adding -fsanitize=thread increases the time between the
check of finished and entering the wait, which increases the chance that
finished changes to true and notify_all happens before the wait has begun.

I can prevent this race condition from showing up by adding:
        for(volatile int i = 0 ; i < 10000; i++) ;
right before the call to the finish method in main, because it forces a
"sleep" before finished is set, which gives the other thread enough time to
get into the wait state.

The race condition might still show up when adding work queue entries and then
calling the finish method, though.

It is not even the ABA issue mentioned in PR 98033, as the predicate never
changes from false to true and back to false. In this case it is false and
then true, but the waiting thread never saw the change because it was not in
wait when the notifications were sent out.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
                   ` (9 preceding siblings ...)
  2023-05-29  6:43 ` pinskia at gcc dot gnu.org
@ 2023-05-29 13:30 ` rachel at rachelmant dot com
  2023-05-29 14:28 ` pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: rachel at rachelmant dot com @ 2023-05-29 13:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

Rachel Mant <rachel at rachelmant dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rachel at rachelmant dot com

--- Comment #11 from Rachel Mant <rachel at rachelmant dot com> ---
(In reply to Andrew Pinski from comment #9)

Hi Andrew, original author of the code here.

> So I think this is a bug in your code:
> 
> Inside substrate::threadPool_t::finish,
> we have:
> 
>                         finished = true;
>                         haveWork.notify_all();
> 

`finished` here is a std::atomic using the strict sequential ordering mode
(std::memory_order_seq_cst)

> If I change it to:
>                         {
>                           std::lock_guard<std::mutex> lock{workMutex};
>                           finished = true;
>                           haveWork.notify_all();
>                         }
> 

This will work, yes, except it will also cause mutex contention on the
condition variable, and it should not be necessary if the atomic is working
correctly, as the order of operations between the threads should be:

 Thread 1             | Thread 2
----------------------+-------------------
finished = true       | <in haveWork.wait()>
haveWork.notify_all() | <wake-up>
<joining begins>      | lambda evaluation
                      | <haveWork.wait() exits>

The only way the mutex can fix this is by adding a barrier that should already
be there by virtue of the atomic using std::memory_order_seq_cst, unless
there's also a TOCTOU issue in play.

> Then I don't get a deadlock at all.
> As I mentioned, I did think there was a race condition.
> Here is what I think happened:
> Thread26:                            thread 1
> checks finished, still false         sets finished to be true
> calls wait                           calls notify_all
> ...                                  notify_all happens
> finally gets into futex_wait syscall ....
> 
> And then thread26 never got the notification.
> 
> With my change the check for finished has to wait till thread1 lets go of
> the mutex (and the other way around).

Do please correct us if we're wrong in our understanding of how the atomic
should work here. We're well aware that concurrency is /hard/ and that getting
things like this right is even harder. The thing that stands out, as Amy said,
is that GCC is the *only* compiler on which we get this deadlock, which is why
we think this is a stdlib or codegen bug rather than a bug in our code -- but
if you can show how it's not, then we're happy to accept that.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
                   ` (10 preceding siblings ...)
  2023-05-29 13:30 ` rachel at rachelmant dot com
@ 2023-05-29 14:28 ` pinskia at gcc dot gnu.org
  2023-05-29 15:15 ` pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-29 14:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Let me try again to show the exact events of why I think this is not a
libstdc++/GCC bug here.


time      thread/core 1                       thread/core N
-1                                            grab the mutex
0         atomically load waitingThreads      atomically increment waitingThreads
1         compare waitingThreads              atomically load finished
2         atomically set finished to 1        atomically load work.empty() (queueLength)
3         start of notify_all                 branch on finished/queueLength
4         ...(some code before ...)           start on haveWork.wait
5         notifies all threads finished       ...(some more code before ...)
6         .....                               waiting now
7         start of joins                      still inside wait
8         joins hit thread N                  still inside wait
etc.

You will notice that the ordering of the load of finished and the wait (and of
the setting of finished and notify_all) is exactly as you expect with
memory_order_seq_cst on each core; that is, there is no reordering going on in
either thread/core. It is still strictly ordered, even.

The reason libstdc++ perhaps exposes this is that its wait implementation
checks the predicate before it enters the wait system call, or that the time
between the start of the notify_all call and the notifications going out is
shorter than the time between the atomic load of finished and the wait system
call.

Since on thread 1 the update of finished to 1 and the notify_all are not done
atomically (together), a thread can read finished before the update and get
into the wait loop after the notifications have gone out.

It is very similar to a TOCTOU issue, because the real "use" of finished is
the wait itself rather than the comparison, so the setting of finished and the
notify need to be done atomically (together); right now there is only an
atomic ordering between the two.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
                   ` (11 preceding siblings ...)
  2023-05-29 14:28 ` pinskia at gcc dot gnu.org
@ 2023-05-29 15:15 ` pinskia at gcc dot gnu.org
  2023-05-29 15:18 ` rachel at rachelmant dot com
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-29 15:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

--- Comment #13 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I suspect that if you change the lambda in substrate::threadPool_t::waitWork
to be:

                inline std::pair<bool, std::tuple<args_t...>> waitWork() noexcept
                {
                        std::unique_lock<std::mutex> lock{workMutex};
                        ++waitingThreads;
                        // wait, but protect ourselves from accidental wake-ups..
                        auto b = [this]() noexcept -> bool
                        {
                                bool t = finished;
                                // deliberate delay to widen the race window
                                for (volatile int i = 0; i < 10000; ++i)
                                        ;
                                return t || !work.empty();
                        };
                        //if (!b())
                        haveWork.wait(lock, b);
                        --waitingThreads;
                        if (!work.empty())
                                return {true, work.pop()};
                        return {false, {}};
                }

you might hit the issue with more C++ implementations. This simulates the
issue I was mentioning by adding a slight delay between the load of finished
and the call to wait(lock).

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
                   ` (12 preceding siblings ...)
  2023-05-29 15:15 ` pinskia at gcc dot gnu.org
@ 2023-05-29 15:18 ` rachel at rachelmant dot com
  2023-05-29 15:37 ` pinskia at gcc dot gnu.org
  2023-05-29 20:38 ` redi at gcc dot gnu.org
  15 siblings, 0 replies; 17+ messages in thread
From: rachel at rachelmant dot com @ 2023-05-29 15:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

--- Comment #14 from Rachel Mant <rachel at rachelmant dot com> ---
(In reply to Andrew Pinski from comment #12)
> Let me try again to show the exact events of why I think this is not a
> libstdc++/GCC bug here.
> 
> 
> time      thread/core 1                       thread/core N
> -1                                            grab the mutex
> 0         atomically load waitingThreads      atomically increment waitingThreads
> 1         compare waitingThreads              atomically load finished
> 2         atomically set finished to 1        atomically load work.empty() (queueLength)
> 3         start of notify_all                 branch on finished/queueLength
> 4         ...(some code before ...)           start on haveWork.wait
> 5         notifies all threads finished       ...(some more code before ...)
> 6         .....                               waiting now
> 7         start of joins                      still inside wait
> 8         joins hit thread N                  still inside wait
> etc.
> 
> You will notice that the ordering of the load of finished and the wait (and
> of the setting of finished and notify_all) is exactly as you expect with
> memory_order_seq_cst on each core; that is, there is no reordering going on
> in either thread/core. It is still strictly ordered, even.
> 
> The reason libstdc++ perhaps exposes this is that its wait implementation
> checks the predicate before it enters the wait system call, or that the time
> between the start of the notify_all call and the notifications going out is
> shorter than the time between the atomic load of finished and the wait
> system call.
> 
> Since on thread 1 the update of finished to 1 and the notify_all are not
> done atomically (together), a thread can read finished before the update and
> get into the wait loop after the notifications have gone out.
> 
> It is very similar to a TOCTOU issue, because the real "use" of finished is
> the wait itself rather than the comparison, so the setting of finished and
> the notify need to be done atomically (together); right now there is only an
> atomic ordering between the two.

Thank you for the clear run-through of the series of events you see leading to
the deadlock. That's very helpful.

To properly understand this problem space, why do you think locking the mutex
before setting `finished` is sufficient to fix this? It feels to us like it
shouldn't, and should only mask the bug, making it less likely to trigger?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
                   ` (13 preceding siblings ...)
  2023-05-29 15:18 ` rachel at rachelmant dot com
@ 2023-05-29 15:37 ` pinskia at gcc dot gnu.org
  2023-05-29 20:38 ` redi at gcc dot gnu.org
  15 siblings, 0 replies; 17+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-29 15:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #15 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Rachel Mant from comment #14) 
> To properly understand this problem space, why do you think locking the
> mutex before setting `finished` is sufficient to fix this? It feels to us
> like it shouldn't, and should only mask the bug, making it less likely to
> trigger?

So currently the wait (and the check of finished) is under the mutex,
workMutex. If you want to modify finished, it needs to be done under the same
mutex.


Note that https://en.cppreference.com/w/cpp/thread/condition_variable mentions
the exact fix too.

And even has the following: "Even if the shared variable is atomic, it must be
modified while owning the mutex to correctly publish the modification to the
waiting thread."

(this part was added back in 2015:
https://en.cppreference.com/mwiki/index.php?title=cpp/thread/condition_variable&diff=81211&oldid=80448
)
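
In code, the pattern cppreference describes looks like this (a minimal sketch,
not the code from the report):

        // Correct producer: take the same mutex the waiters use, even though
        // finished is atomic; this makes the store ordered against the
        // consumer's check-then-wait. The notification itself may be issued
        // after unlocking.
        {
                std::lock_guard<std::mutex> lock{workMutex};
                finished = true;
        }
        haveWork.notify_all();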

So I think I can now close this as invalid.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
  2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
                   ` (14 preceding siblings ...)
  2023-05-29 15:37 ` pinskia at gcc dot gnu.org
@ 2023-05-29 20:38 ` redi at gcc dot gnu.org
  15 siblings, 0 replies; 17+ messages in thread
From: redi at gcc dot gnu.org @ 2023-05-29 20:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

--- Comment #16 from Jonathan Wakely <redi at gcc dot gnu.org> ---
The condition variable uses that mutex to synchronize. If you just modify the
atomic variable _without locking and unlocking the mutex_ then you are not
synchronizing with the condition variable.

If you don't use the mutex for the producer thread, then (as Andrew described)
the consumer can check the atomic, see that the condition is not yet true, and
decide it needs to wait. Then the producer makes the condition true and signals
using notify_all but that is missed because the consumer thread isn't waiting
yet, then the consumer starts to wait. But because it already missed the
notify_all call, it waits forever.

Using the mutex in the producer does not just hide the bug, it fixes it
properly. If the producer updates the atomic while the mutex is held then it's
impossible for the condition to become true between the consumer checking it
and starting to wait. Those two things (checking the condition and starting to
wait) become atomic w.r.t the producer. It's not possible to get this
interleaving if you use the mutex correctly:

consumer checks condition
producer makes condition true
producer notifies
consumer waits

Instead you either get:

consumer checks condition and waits
producer makes condition true and notifies, waking the consumer

or:

producer makes condition true (and notifies, but consumer isn't waiting)
consumer sees condition is true and never waits
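
For reference, the predicate overload of wait is specified to be equivalent to
the loop below, which is why the check and the sleep form one atomic step
relative to a producer that takes the mutex (a sketch using the names from
this report):

        // Equivalent expansion of haveWork.wait(lock, pred):
        while (!pred())               // predicate checked with workMutex held
                haveWork.wait(lock);  // atomically releases workMutex and blocks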

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-05-29 20:38 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-28 21:11 [Bug c++/110016] New: [12/13/14] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock amy at amyspark dot me
2023-05-28 21:44 ` [Bug libstdc++/110016] " pinskia at gcc dot gnu.org
2023-05-29  0:47 ` [Bug libstdc++/110016] " amy at amyspark dot me
2023-05-29  1:03 ` pinskia at gcc dot gnu.org
2023-05-29  1:05 ` pinskia at gcc dot gnu.org
2023-05-29  1:05 ` pinskia at gcc dot gnu.org
2023-05-29  1:20 ` pinskia at gcc dot gnu.org
2023-05-29  1:27 ` pinskia at gcc dot gnu.org
2023-05-29  1:34 ` pinskia at gcc dot gnu.org
2023-05-29  1:47 ` pinskia at gcc dot gnu.org
2023-05-29  6:43 ` pinskia at gcc dot gnu.org
2023-05-29 13:30 ` rachel at rachelmant dot com
2023-05-29 14:28 ` pinskia at gcc dot gnu.org
2023-05-29 15:15 ` pinskia at gcc dot gnu.org
2023-05-29 15:18 ` rachel at rachelmant dot com
2023-05-29 15:37 ` pinskia at gcc dot gnu.org
2023-05-29 20:38 ` redi at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).