public inbox for cygwin-developers@cygwin.com
* deadlock on console mutex in gdb
@ 2021-12-22 15:26 David McFarland
  2021-12-22 20:44 ` Takashi Yano
  0 siblings, 1 reply; 9+ messages in thread
From: David McFarland @ 2021-12-22 15:26 UTC
  To: cygwin-developers; +Cc: David McFarland, Takashi Yano

I'm seeing a deadlock when using gdb on a process with a lot of console
output and thread creation:

Thread 1 (Thread 13016.0x2804):
#0  0x00007ffca30ecdf4 in ntdll!ZwWaitForSingleObject () from /cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll
#1  0x00007ffca0d91a5e in WaitForSingleObjectEx () from /cygdrive/c/WINDOWS/System32/KERNELBASE.dll
#2  0x0000000180071719 in fhandler_console::set_output_mode (m=tty::cygwin, t=0x180000028, p=0x1803a3418) at ../../../../winsup/cygwin/fhandler_console.cc:524
#3  0x0000000180079c6e in fhandler_console::write (this=0x1803a3218, vsrc=0x800312dc0, len=26) at ../../../../winsup/cygwin/fhandler_console.cc:3115
#4  0x000000018014a022 in write (fd=1, ptr=0x800312dc0, len=26) at ../../../../winsup/cygwin/syscalls.cc:1360
#5  0x0000000180227349 in _write_r (ptr=0xffffd680, fd=1, buf=0x800312dc0, cnt=26) at ../../../../../newlib/libc/reent/writer.c:49
#6  0x00000001801e8e0c in __swrite64 (ptr=0xffffd680, cookie=0x180273c70 <reent_data+1520>, buf=0x800312dc0 "[New Thread 11424.0x244c]\ne/c/cygwin64/home/David_M/src/nix/test/bin/nix build -f nixpkgs/test --log-format raw -vvvvvvvvv\n", n=26) at ../../../../../newlib/libc/stdio64/stdio64.c:69
#7  0x00000001801d13cf in __sflush_r (ptr=0xffffd680, fp=0x180273c70 <reent_data+1520>) at ../../../../../newlib/libc/stdio/fflush.c:224
#8  0x00000001801d14d5 in _fflush_r (ptr=0xffffd680, fp=0x180273c70 <reent_data+1520>) at ../../../../../newlib/libc/stdio/fflush.c:278
#9  0x00000001801d8bbd in __sfvwrite_r (ptr=0xffffd680, fp=0x180273c70 <reent_data+1520>, uio=0xffffbbb0) at ../../../../../newlib/libc/stdio/fvwrite.c:251
#10 0x00000001801d5026 in _fputs_r (ptr=0xffffd680, s=0x10097cdd5 <bright_colors+3413> "\n", fp=0x180273c70 <reent_data+1520>) at ../../../../../newlib/libc/stdio/fputs.c:107
#11 0x00000001801d509d in fputs (s=0x10097cdd5 <bright_colors+3413> "\n", fp=0x180273c70 <reent_data+1520>) at ../../../../../newlib/libc/stdio/fputs.c:140
#12 0x00000001801a6beb in _sigfe () at sigfe.s:37
#13 0x0000000100716246 in fputs_maybe_filtered (linebuffer=<optimized out>, stream=0x8002f8910, filter=0) at /usr/src/debug/gdb-10.2-1/gdb/utils.c:1828
#14 0x00000001007174db in vfprintf_styled_no_gdbfmt (stream=0x8002f8910, style=..., filter=<optimized out>, format=<optimized out>, args=0xffffbe18 "") at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/basic_string.h:3360
#15 0x0000000100482e57 in cli_ui_out::do_message (args=0xffffbe18 "", format=0x808ba1359 "]\n", style=..., this=0xffffbf30) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/stl_iterator.h:953
#16 cli_ui_out::do_message (this=0xffffbf30, style=..., format=0x808ba1359 "]\n", args=0xffffbe18 "") at /usr/src/debug/gdb-10.2-1/gdb/cli-out.c:227
#17 0x000000010070f21e in ui_out::call_do_message (this=this@entry=0xffffbf30, style=..., format=<optimized out>) at /usr/src/debug/gdb-10.2-1/gdb/ui-out.c:597
#18 0x000000010070f9db in ui_out::vmessage (this=this@entry=0xffffbf30, in_style=..., format=<optimized out>, format@entry=0x100970683 <dummy_target_info+1779> "[New %s]\n", args=0xffffc170 "\030", args@entry=0xffffc168 "8ĵ\t\b") at /usr/src/debug/gdb-10.2-1/gdb/ui-out.c:778
#19 0x000000010071634c in vfprintf_maybe_filtered (stream=stream@entry=0x8002f8910, format=format@entry=0x100970683 <dummy_target_info+1779> "[New %s]\n", args=args@entry=0xffffc168 "8ĵ\t\b", filter=filter@entry=false, gdbfmt=gdbfmt@entry=true) at /usr/src/debug/gdb-10.2-1/gdb/utils.c:2064
#20 0x00000001007154dc in vfprintf_unfiltered (stream=0x8002f8910, format=format@entry=0x100970683 <dummy_target_info+1779> "[New %s]\n", args=args@entry=0xffffc168 "8ĵ\t\b") at /usr/src/debug/gdb-10.2-1/gdb/utils.c:2107
#21 0x0000000100715878 in printf_unfiltered (format=0x100970683 <dummy_target_info+1779> "[New %s]\n") at /usr/src/debug/gdb-10.2-1/gdb/utils.c:2218
#22 0x00000001006e1f86 in add_thread_with_info (targ=targ@entry=0x100893b80 <the_windows_nat_target>, ptid=..., priv=priv@entry=0x0) at /usr/src/debug/gdb-10.2-1/gdb/thread.c:297
#23 0x00000001006e2023 in add_thread (targ=targ@entry=0x100893b80 <the_windows_nat_target>, ptid=<error reading variable: Cannot access memory at address 0x0>) at /usr/src/debug/gdb-10.2-1/gdb/thread.c:306
#24 0x000000010073906c in windows_add_thread (ptid=..., h=<optimized out>, tlb=<optimized out>, main_thread_p=main_thread_p@entry=false) at /usr/src/debug/gdb-10.2-1/gdb/windows-nat.c:451
#25 0x000000010073999c in windows_nat_target::get_windows_debug_event (this=this@entry=0x100893b80 <the_windows_nat_target>, pid=pid@entry=-1, ourstatus=ourstatus@entry=0xffffca28) at /usr/src/debug/gdb-10.2-1/gdb/../gdbsupport/ptid.h:49
#26 0x0000000100739c72 in windows_nat_target::wait (this=0x100893b80 <the_windows_nat_target>, ptid=..., ourstatus=0xffffca28, options=0) at /usr/src/debug/gdb-10.2-1/gdb/windows-nat.c:1801
#27 0x00000001006d91cf in target_wait (ptid=..., status=status@entry=0xffffca28, options=options@entry=0) at /usr/src/debug/gdb-10.2-1/gdb/target.c:2017
#28 0x0000000100593c2e in do_target_wait_1 (inf=inf@entry=0x800262c70, ptid=..., status=status@entry=0xffffca28, options=0, options@entry=1) at /usr/src/debug/gdb-10.2-1/gdb/infrun.c:3544
#29 0x0000000100594058 in operator() (inf=0x800262c70, __closure=<synthetic pointer>) at /usr/src/debug/gdb-10.2-1/gdb/infrun.c:3606
#30 do_target_wait (wait_ptid=..., ecs=ecs@entry=0xffffca00, options=options@entry=1) at /usr/src/debug/gdb-10.2-1/gdb/infrun.c:3619
#31 0x00000001005a1f26 in fetch_inferior_event () at /usr/src/debug/gdb-10.2-1/gdb/infrun.c:3905
#32 0x0000000100435d76 in check_async_event_handlers () at /usr/src/debug/gdb-10.2-1/gdb/async-event.c:295
#33 0x0000000100803522 in gdb_do_one_event () at /usr/src/debug/gdb-10.2-1/gdbsupport/event-loop.cc:194
#34 0x00000001005c0a1d in start_event_loop () at /usr/src/debug/gdb-10.2-1/gdb/main.c:356
#35 captured_command_loop () at /usr/src/debug/gdb-10.2-1/gdb/main.c:416
#36 0x00000001005c2b25 in captured_main (data=0xffffcb90) at /usr/src/debug/gdb-10.2-1/gdb/main.c:1253
#37 gdb_main (args=args@entry=0xffffcbf0) at /usr/src/debug/gdb-10.2-1/gdb/main.c:1268
#38 0x000000010087c3d9 in main (argc=9, argv=0xffffcc50) at /usr/src/debug/gdb-10.2-1/gdb/gdb.c:32

No other threads appear to be doing anything relevant. I believe the
inferior process holds the mutex, but I haven't 100% confirmed that. I
can't attach another debugger to it, so I'm not sure how I'd do that.

It looks like gdb is getting a new thread event (and trying to log it)
while the inferior process has a lock on the inter-process output mutex.
I'm not sure how we'd avoid a deadlock in that case.

I can reproduce this on 3.3.3 as well as master.

The mutexes were added in:

commit f4b47827cf87f055687a0c52a3485d42b3e2b941
Author: Takashi Yano <takashi.yano@nifty.ne.jp>
Date:   Mon Apr 1 00:47:48 2019 +0900

    Cygwin: console: Make I/O functions thread-safe

    - POSIX states I/O functions shall be thread-safe, however, cygwin
      console I/O functions were not. This patch makes console I/O
      functions thread-safe.


* Re: deadlock on console mutex in gdb
  2021-12-22 15:26 deadlock on console mutex in gdb David McFarland
@ 2021-12-22 20:44 ` Takashi Yano
  2021-12-22 22:17   ` David McFarland
  0 siblings, 1 reply; 9+ messages in thread
From: Takashi Yano @ 2021-12-22 20:44 UTC
  To: cygwin-developers; +Cc: David McFarland

On Wed, 22 Dec 2021 11:26:54 -0400
David McFarland wrote:
> I'm seeing a deadlock when using gdb on a process with a lot of console
> output and thread creation:
> 
> [backtrace snipped]
> 
> No other threads appear to be doing anything relevant. I believe the
> inferior process holds the mutex, but I haven't 100% confirmed that. I
> can't attach another debugger to it, so I'm not sure how I'd do that.
> 
> It looks like gdb is getting a new thread event (and trying to log it)
> while the inferior process has a lock on the inter-process output mutex.
> I'm not sure how we'd avoid a deadlock in that case.
> 
> I can reproduce this on 3.3.3 as well as master.
> 
> The mutexes were added in:
> 
> commit f4b47827cf87f055687a0c52a3485d42b3e2b941
> Author: Takashi Yano <takashi.yano@nifty.ne.jp>
> Date:   Mon Apr 1 00:47:48 2019 +0900
> 
>     Cygwin: console: Make I/O functions thread-safe
> 
>     - POSIX states I/O functions shall be thread-safe, however, cygwin
>       console I/O functions were not. This patch makes console I/O
>       functions thread-safe.

Thanks for the report.
Could you provide a simple test case in C?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: deadlock on console mutex in gdb
  2021-12-22 20:44 ` Takashi Yano
@ 2021-12-22 22:17   ` David McFarland
  2021-12-23  9:24     ` Takashi Yano
  0 siblings, 1 reply; 9+ messages in thread
From: David McFarland @ 2021-12-22 22:17 UTC
  To: cygwin-developers

Takashi Yano <takashi.yano@nifty.ne.jp> writes:

> Thanks for the report.
> Could you provide a simple test case in C?

Sure, this seems to do it:

===== test.c
#include <pthread.h>
#include <stdio.h>

void *thread(void* p) {
        printf("thread %p\n", p);
        return 0;
}

int main() {
        pthread_t ids[100];
        for (int i = 0; i < 100; ++i) {
                pthread_create(&ids[i], 0, &thread, &ids[i]);
        }
        for (int i = 0; i < 100; ++i) {
                pthread_join(ids[i], 0);
        }
        return 0;
}
=====

If I compile that with gcc and run it under gdb, it hangs almost
immediately: 5 out of 5 attempts on a fresh Cygwin install with the
latest packages.


* Re: deadlock on console mutex in gdb
  2021-12-22 22:17   ` David McFarland
@ 2021-12-23  9:24     ` Takashi Yano
  2021-12-23 15:32       ` David McFarland
  0 siblings, 1 reply; 9+ messages in thread
From: Takashi Yano @ 2021-12-23  9:24 UTC
  To: cygwin-developers; +Cc: David McFarland

On Wed, 22 Dec 2021 18:17:20 -0400
David McFarland wrote:
> ===== test.c
> #include <pthread.h>
> #include <stdio.h>
> 
> void *thread(void* p) {
>         printf("thread %p\n", p);
>         return 0;
> }
> 
> int main() {
>         pthread_t ids[100];
>         for (int i = 0; i < 100; ++i) {
>                 pthread_create(&ids[i], 0, &thread, &ids[i]);
>         }
>         for (int i = 0; i < 100; ++i) {
>                 pthread_join(ids[i], 0);
>         }
>         return 0;
> }
> =====
> 
> If I compile that with gcc and run it under gdb, it hangs almost
> immediately. 5/5 attempts using a fresh cygwin with latest packages.

Thanks for the test case. I was able to reproduce the problem.

I looked into it and found the mechanism causing the issue.

The GDB inferior may be suspended while it holds the mutex.
When the inferior creates a new thread, GDB receives
CREATE_THREAD_DEBUG_EVENT and the inferior is suspended, even if
it is holding the mutex at that moment. GDB then blocks on the same
mutex when it writes to the console, which deadlocks terminal I/O.

I think there is no other way than to not wait on the mutex in the
debugger process. If anyone has any other ideas, please let
me know.
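
As a rough illustration of the mechanism described above, here is a
POSIX analogue (not Cygwin's console code). The child takes a
process-shared mutex and then stops itself with SIGSTOP, standing in
for an inferior suspended by the debugger. The parent, standing in for
gdb, waits with a timeout instead of waiting forever. The function name
wait_on_suspended_holder and the choice of pthread_mutex_timedlock are
assumptions made for this sketch only.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS on some libcs */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: returns what pthread_mutex_timedlock() reports in
   the parent while the child is stopped holding the lock, or -1 if the
   setup failed.  */
int
wait_on_suspended_holder (long timeout_ms)
{
  assert (timeout_ms >= 0);

  /* Place the mutex in memory shared between parent and child.  */
  pthread_mutex_t *m = mmap (NULL, sizeof *m, PROT_READ | PROT_WRITE,
                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (m == MAP_FAILED)
    return -1;

  pthread_mutexattr_t attr;
  pthread_mutexattr_init (&attr);
  pthread_mutexattr_setpshared (&attr, PTHREAD_PROCESS_SHARED);
  pthread_mutex_init (m, &attr);
  pthread_mutexattr_destroy (&attr);

  int ready[2];
  if (pipe (ready) != 0)
    {
      munmap (m, sizeof *m);
      return -1;
    }

  pid_t child = fork ();
  if (child == 0)
    {
      /* "Inferior": take the lock, report it, then get suspended.  */
      pthread_mutex_lock (m);
      if (write (ready[1], "x", 1) != 1)
        _exit (1);
      raise (SIGSTOP);
      _exit (0);
    }
  close (ready[1]);

  int res = -1;
  char c;
  /* "Debugger": wait until the child holds the lock, then try to take it.  */
  if (child > 0 && read (ready[0], &c, 1) == 1)
    {
      struct timespec deadline;
      clock_gettime (CLOCK_REALTIME, &deadline);
      deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
      deadline.tv_sec += timeout_ms / 1000 + deadline.tv_nsec / 1000000000L;
      deadline.tv_nsec %= 1000000000L;

      res = pthread_mutex_timedlock (m, &deadline);
      if (res == 0)
        pthread_mutex_unlock (m);
    }

  if (child > 0)
    {
      kill (child, SIGKILL);    /* SIGKILL also ends a stopped process */
      waitpid (child, NULL, 0);
    }
  close (ready[0]);
  munmap (m, sizeof *m);
  return res;
}
```

Calling wait_on_suspended_holder (100) should come back with ETIMEDOUT
instead of hanging. With an unbounded wait the parent would never get
the lock, which corresponds to the hang reported in this thread.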

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: deadlock on console mutex in gdb
  2021-12-23  9:24     ` Takashi Yano
@ 2021-12-23 15:32       ` David McFarland
  2021-12-23 19:28         ` David McFarland
  2022-01-13 10:56         ` Takashi Yano
  0 siblings, 2 replies; 9+ messages in thread
From: David McFarland @ 2021-12-23 15:32 UTC
  To: Takashi Yano; +Cc: cygwin-developers

Takashi Yano <takashi.yano@nifty.ne.jp> writes:

> I think there is no other way than to not wait on the mutex in the
> debugger process. If anyone has any other ideas, please let
> me know.

I have a few thoughts:

I believe gdb currently just uses windows debugging APIs, so it doesn't
treat cygwin as an operating system. On other operating systems there
would be some mechanism to prevent kernel mutexes from deadlocking. It
obviously can't just wait for all syscalls to complete, but it must
avoid blocking kernel tasks that hold mutexes. Perhaps we could do
something similar on cygwin, but it would probably mean providing
cygwin-specific debugging APIs and support in gdb. I'd have to do some
more research into how this works on e.g. Linux.

Can we avoid using inter-process mutexes for this? What would you expect
to break if we just used a per-process mutex?

If we do need an inter-process mutex, perhaps we could have a daemon
process responsible for it? I think that would be a bit of a departure
for cygwin.


* Re: deadlock on console mutex in gdb
  2021-12-23 15:32       ` David McFarland
@ 2021-12-23 19:28         ` David McFarland
  2021-12-26 15:25           ` David McFarland
  2022-01-13 11:09           ` Takashi Yano
  2022-01-13 10:56         ` Takashi Yano
  1 sibling, 2 replies; 9+ messages in thread
From: David McFarland @ 2021-12-23 19:28 UTC
  To: Takashi Yano; +Cc: cygwin-developers

I've also been hitting another console-related deadlock, even when
there's no debugger involved:

#0  0x00007ffca30ecdf4 in ntdll!ZwWaitForSingleObject () from /cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll
#1  0x00007ffca0d91a5e in WaitForSingleObjectEx () from /cygdrive/c/WINDOWS/System32/KERNELBASE.dll
#2  0x0000000180075f5c in acquire_attach_mutex (t=4294967295) at ../../../../winsup/cygwin/fhandler_console.cc:66
#3  fhandler_console::write (this=0x180375e50, vsrc=0x804f229c8, len=94) at ../../../../winsup/cygwin/fhandler_console.cc:3112
#4  0x0000000180145c68 in write (fd=2, ptr=0x804f229c8, len=94) at ../../../../winsup/cygwin/syscalls.cc:1360
#5  0x000000018019442b in _sigfe () at sigfe.s:37
#6  0x000000046b6c24eb in nix::writeFull (fd=2, s="waiting for lock on \033[35;1m'/nix/store/lqcj88cy92w6mjsqicjmgbfj0dnggvdw-libiconv-1.16'\033[0m...\n", allowInterrupts=false) at src/libutil/util.cc:661
#7  0x000000046b6b43a5 in nix::writeToStderr (s="waiting for lock on \033[35;1m'/nix/store/lqcj88cy92w6mjsqicjmgbfj0dnggvdw-libiconv-1.16'\033[0m...\n") at src/libutil/logging.cc:119
#8  0x000000046b6d640d in nix::SimpleLogger::log (this=0x8000182f0, lvl=nix::lvlWarn, fs=...) at src/libutil/logging.cc:74
#9  0x000000046b6d6239 in nix::SimpleLogger::startActivity (this=0x8000182f0, act=8156142895253, lvl=nix::lvlWarn, type=nix::actBuildWaiting, s="waiting for lock on \033[35;1m'/nix/store/lqcj88cy92w6mjsqicjmgbfj0dnggvdw-libiconv-1.16'\033[0m", fields=std::vector of length 0, capacity 0, parent=0)
    at src/libutil/logging.cc:90
#10 0x000000046b6b449f in nix::Activity::Activity (this=0x804d470c0, logger=..., lvl=nix::lvlWarn, type=nix::actBuildWaiting, s="waiting for lock on \033[35;1m'/nix/store/lqcj88cy92w6mjsqicjmgbfj0dnggvdw-libiconv-1.16'\033[0m", fields=std::vector of length 0, capacity 0, parent=0) at src/libutil/logging.cc:139
#11 0x00000004f52dc061 in std::make_unique<nix::Activity, nix::Logger&, nix::Verbosity, nix::ActivityType, std::string> () at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/unique_ptr.h:962
#12 0x00000004f508fbbf in nix::DerivationGoal::tryToBuild (this=0x804e93fc0) at src/libstore/build/derivation-goal.cc:577
#13 0x00000004f508b2e7 in nix::DerivationGoal::work (this=0x804e93fc0) at src/libstore/build/derivation-goal.cc:144
#14 0x00000004f50b87e4 in nix::Worker::run (this=0xffff97f0, _topGoals=std::set with 1 element = {...}) at src/libstore/build/worker.cc:268
#15 0x00000004f5098c52 in nix::Store::buildPaths (this=0x8000ae2d8, reqs=std::vector of length 1, capacity 1 = {...}, buildMode=nix::bmNormal, evalStore=std::shared_ptr<nix::Store> (use count 5, weak count 1) = {...}) at src/libstore/build/entry-points.cc:24
#16 0x0000000100406243 in operator() (__closure=0xffffa6d0, paths0=std::vector of length 1, capacity 1 = {...}) at src/nix-build/nix-build.cc:345
#17 0x000000010040b361 in main_nix_build (argc=4, argv=0xffffcc60) at src/nix-build/nix-build.cc:574
#18 0x00000001005f5916 in std::__invoke_impl<void, void (*&)(int, char**), int, char**> (__f=@0xffffbe30: 0x1004063d7 <main_nix_build(int, char**)>) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/invoke.h:60
#19 0x00000001005e9a83 in std::__invoke_r<void, void (*&)(int, char**), int, char**> (__fn=@0xffffbe30: 0x1004063d7 <main_nix_build(int, char**)>) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/invoke.h:110
#20 0x000000010056b31b in std::_Function_handler<void (int, char**), void (*)(int, char**)>::_M_invoke(std::_Any_data const&, int&&, char**&&) (__functor=..., __args#0=@0xffffbde8: 4, __args#1=@0xffffbdf0: 0xffffcc60) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/std_function.h:291
#21 0x0000000100517d8c in std::function<void (int, char**)>::operator()(int, char**) const (this=0xffffbe30, __args#0=4, __args#1=0xffffcc60) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/std_function.h:622
#22 0x000000010044e248 in nix::mainWrapped (argc=4, argv=0xffffcc60) at src/nix/main.cc:277
#23 0x000000010044f54d in operator() (__closure=0xffffcbc0) at src/nix/main.cc:392
#24 0x00000001004508a2 in std::__invoke_impl<void, main(int, char**)::<lambda()>&>(std::__invoke_other, struct {...} &) (__f=...) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/invoke.h:60
#25 0x00000001004503d7 in std::__invoke_r<void, main(int, char**)::<lambda()>&>(struct {...} &) (__fn=...) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/invoke.h:110
#26 0x000000010044fdc0 in std::_Function_handler<void(), main(int, char**)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/std_function.h:291
#27 0x00000004473816a2 in std::function<void ()>::operator()() const (this=0xffffcbc0) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/std_function.h:622
#28 0x000000044736a686 in nix::handleExceptions(std::string const&, std::function<void ()>) (programName="test/bin/nix-build", fun=...) at src/libmain/shared.cc:373
#29 0x000000010044f5d3 in main (argc=4, argv=0xffffcc60) at src/nix/main.cc:391


This one is much harder to reproduce, but I think what's happening is:

- PT is opened with posix_openpt()

- attach_mutex is created:
  if (InterlockedIncrement (&master_cnt) == 1)
    attach_mutex = CreateMutex (&sa, FALSE, NULL);

- PT is closed by close()

- attach_mutex is destroyed:
  if (InterlockedDecrement (&master_cnt) == 0)
    CloseHandle (attach_mutex);

- the handle in attach_mutex is reused for another object

- fhandler_console::write is called on stdout/err

- WaitForSingleObject is called on attach_mutex, but the handle now
  refers to a different object, so the wait never returns
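
The handle-reuse step in the sequence above can be shown with a POSIX
analogue. This is only a sketch under assumptions: file descriptors
stand in for Windows handles, stale_handle is a hypothetical stand-in
for attach_mutex, and stale_handle_is_reused is a made-up name. POSIX
hands out the lowest free descriptor number, so the reuse is
deterministic here; on Windows, reuse of a closed handle value is
possible but not guaranteed.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-in for the global attach_mutex handle.  */
static int stale_handle = -1;

/* Opens and closes a descriptor without resetting the global, then opens
   an unrelated object.  Returns true if a read through the stale value
   ends up reading from that unrelated object.  */
bool
stale_handle_is_reused (void)
{
  stale_handle = open ("/dev/null", O_RDONLY);  /* like CreateMutex */
  if (stale_handle < 0)
    return false;
  close (stale_handle);         /* like CloseHandle; global is NOT reset */

  int other = open ("/dev/zero", O_RDONLY);     /* unrelated object */
  if (other < 0)
    return false;

  /* The old number now names /dev/zero.  Reading /dev/null would return
     0 bytes; reading /dev/zero returns a zero byte.  */
  char byte = 1;
  bool reused = other == stale_handle
                && read (stale_handle, &byte, 1) == 1
                && byte == 0;
  close (other);
  return reused;
}
```

Resetting the global to an invalid value right after closing it would
make the stale use detectable, but it would not by itself close the
race between one thread closing and another thread waiting.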

While I don't have a solid repro for the deadlock, if I apply a patch
like this:


From 926d05a1211fa8e5a7fc1cccdbe89e5c980dfad1 Mon Sep 17 00:00:00 2001
From: David McFarland <corngood@gmail.com>
Date: Thu, 23 Dec 2021 15:15:11 -0400
Subject: [PATCH] tty: add helper for attach_mutex

---
 winsup/cygwin/fhandler.h          |  9 +++++++++
 winsup/cygwin/fhandler_console.cc | 17 -----------------
 winsup/cygwin/fhandler_tty.cc     | 30 +++++++++++++++++++++++++-----
 winsup/cygwin/select.cc           | 16 ----------------
 4 files changed, 34 insertions(+), 38 deletions(-)

diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h
index 4f70c4c0b..48d7107e5 100644
--- a/winsup/cygwin/fhandler.h
+++ b/winsup/cygwin/fhandler.h
@@ -1881,6 +1881,15 @@ class fhandler_serial: public fhandler_base
 #define release_output_mutex() \
   __release_output_mutex (__PRETTY_FUNCTION__, __LINE__)
 
+bool __acquire_attach_mutex (const char *fn, int ln, DWORD t);
+void __release_attach_mutex (const char *fn, int ln);
+
+#define acquire_attach_mutex(ms)                             \
+  __acquire_attach_mutex (__PRETTY_FUNCTION__, __LINE__, ms)
+
+#define release_attach_mutex()                           \
+  __release_attach_mutex (__PRETTY_FUNCTION__, __LINE__)
+
 class tty;
 class tty_min;
 class fhandler_termios: public fhandler_base
diff --git a/winsup/cygwin/fhandler_console.cc b/winsup/cygwin/fhandler_console.cc
index 4c98b5355..ca2d5c74e 100644
--- a/winsup/cygwin/fhandler_console.cc
+++ b/winsup/cygwin/fhandler_console.cc
@@ -56,23 +56,6 @@ fhandler_console::console_state NO_COPY *fhandler_console::shared_console_info;
 
 bool NO_COPY fhandler_console::invisible_console;
 
-/* Mutex for AttachConsole()/FreeConsole() in fhandler_tty.cc */
-HANDLE attach_mutex;
-
-static inline void
-acquire_attach_mutex (DWORD t)
-{
-  if (attach_mutex)
-    WaitForSingleObject (attach_mutex, t);
-}
-
-static inline void
-release_attach_mutex ()
-{
-  if (attach_mutex)
-    ReleaseMutex (attach_mutex);
-}
-
 /* con_ra is shared in the same process.
    Only one console can exist in a process, therefore, static is suitable. */
 static struct fhandler_base::rabuf_t con_ra;
diff --git a/winsup/cygwin/fhandler_tty.cc b/winsup/cygwin/fhandler_tty.cc
index c8ad53cb7..c425c158b 100644
--- a/winsup/cygwin/fhandler_tty.cc
+++ b/winsup/cygwin/fhandler_tty.cc
@@ -28,6 +28,7 @@ details. */
 #include "registry.h"
 #include "tls_pbuf.h"
 #include "winf.h"
+#include <assert.h>
 
 #ifndef PROC_THREAD_ATTRIBUTE_PSEUDOCONSOLE
 #define PROC_THREAD_ATTRIBUTE_PSEUDOCONSOLE 0x00020016
@@ -56,7 +57,26 @@ struct pipe_reply {
   DWORD error;
 };
 
-extern HANDLE attach_mutex; /* Defined in fhandler_console.cc */
+/* Mutex for AttachConsole()/FreeConsole() */
+HANDLE attach_mutex;
+
+bool __acquire_attach_mutex (const char *fn, int ln, DWORD t)
+{
+  if (!attach_mutex)
+    return false;
+  DWORD res = WaitForSingleObject (attach_mutex, t);
+  assert(res == WAIT_TIMEOUT || res == WAIT_OBJECT_0);
+  return res == WAIT_OBJECT_0;
+}
+
+void __release_attach_mutex (const char *fn, int ln)
+{
+  if (!attach_mutex)
+    return;
+  BOOL r = ReleaseMutex (attach_mutex);
+  assert(r);
+}
+
 static LONG master_cnt = 0;
 
 inline static bool pcon_pid_alive (DWORD pid);
@@ -519,13 +539,13 @@ fhandler_pty_master::accept_input ()
 	{
 	  /* Slave attaches to a different console than master.
 	     Therefore reattach here. */
-	  WaitForSingleObject (attach_mutex, INFINITE);
+    acquire_attach_mutex (INFINITE);
 	  FreeConsole ();
 	  AttachConsole (target_pid);
 	  cp_to = GetConsoleCP ();
 	  FreeConsole ();
 	  AttachConsole (resume_pid);
-	  ReleaseMutex (attach_mutex);
+    release_attach_mutex ();
 	}
       else
 	cp_to = GetConsoleCP ();
@@ -2827,13 +2847,13 @@ fhandler_pty_master::pty_master_fwd_thread (const master_fwd_thread_param_t *p)
 	{
 	  /* Slave attaches to a different console than master.
 	     Therefore reattach here. */
-	  WaitForSingleObject (attach_mutex, INFINITE);
+    acquire_attach_mutex (INFINITE);
 	  FreeConsole ();
 	  AttachConsole (target_pid);
 	  cp_from = GetConsoleOutputCP ();
 	  FreeConsole ();
 	  AttachConsole (resume_pid);
-	  ReleaseMutex (attach_mutex);
+    release_attach_mutex ();
 	}
       else
 	cp_from = GetConsoleOutputCP ();
diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index a2868abd0..95edc10fe 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -1095,22 +1095,6 @@ fhandler_fifo::select_except (select_stuff *ss)
   return s;
 }
 
-extern HANDLE attach_mutex; /* Defined in fhandler_console.cc */
-
-static inline void
-acquire_attach_mutex (DWORD t)
-{
-  if (attach_mutex)
-    WaitForSingleObject (attach_mutex, t);
-}
-
-static inline void
-release_attach_mutex ()
-{
-  if (attach_mutex)
-    ReleaseMutex (attach_mutex);
-}
-
 static int
 peek_console (select_record *me, bool)
 {
-- 
2.32.0


I can use this program:

=====test-pt.c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>

void *thread(void* p) {
        for (int i = 0;; ++i) {
                int fd = posix_openpt(O_RDWR | O_NOCTTY);
                close(fd);
                fprintf(stderr, "test %i\n", i);
        }
        return 0;
}

int main() {
        pthread_t ids[3];
        for (int i = 0; i < 3; ++i) {
                pthread_create(&ids[i], 0, &thread, &ids[i]);
        }
        for (int i = 0; i < 3; ++i) {
                pthread_join(ids[i], 0);
        }
        return 0;
}
=====

compiled with:

$ gcc -D_GNU_SOURCE test-pt.c -o test-pt

to hit the asserts added by the patch above.


Perhaps the mutex should never be closed? It's also worrying that
acquiring the mutex just silently fails before a PT is opened.
There could be a thread that is acting as if it has a lock at the time
that the mutex is created, and it will even try to release the lock.

It might just be safest to create it at process start, but it could have
a significant performance cost.
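
One way to read the "never closed" idea, sketched with pthreads rather
than the Win32 calls Cygwin actually uses: create the lock exactly once
on first use via pthread_once and never destroy it, so there is no
window in which the lock is missing or stale. The names
acquire_attach_lock and release_attach_lock are hypothetical, and the
sketch says nothing about the real cost of doing the same with
CreateMutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_once_t attach_lock_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t attach_lock;
static bool attach_lock_ok;

/* Runs at most once per process, no matter how many threads race here.  */
static void
attach_lock_init (void)
{
  attach_lock_ok = pthread_mutex_init (&attach_lock, NULL) == 0;
}

/* Returns true only if the caller now really holds the lock.  */
bool
acquire_attach_lock (void)
{
  pthread_once (&attach_lock_once, attach_lock_init);
  return attach_lock_ok && pthread_mutex_lock (&attach_lock) == 0;
}

/* Must only be called after acquire_attach_lock () returned true.  */
void
release_attach_lock (void)
{
  int r = pthread_mutex_unlock (&attach_lock);
  assert (r == 0);
  (void) r;
}
```

Because the lock is created on the first acquire rather than at process
start, a process that never reaches this path creates nothing, at the
price of a pthread_once check on every acquire.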


* Re: deadlock on console mutex in gdb
  2021-12-23 19:28         ` David McFarland
@ 2021-12-26 15:25           ` David McFarland
  2022-01-13 11:09           ` Takashi Yano
  1 sibling, 0 replies; 9+ messages in thread
From: David McFarland @ 2021-12-26 15:25 UTC
  To: Takashi Yano; +Cc: cygwin-developers

David McFarland <corngood@gmail.com> writes:

> I've also been hitting another console-related deadlock, even when
> there's no debugger involved:

So I tried to fix this one by doing this:

From a66d334577217f178a841f006e295a9b8e50c677 Mon Sep 17 00:00:00 2001
From: David McFarland <corngood@gmail.com>
Date: Thu, 23 Dec 2021 16:12:47 -0400
Subject: [PATCH] tty: create attach_mutex in static constructor

---
 winsup/cygwin/fhandler_tty.cc | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/winsup/cygwin/fhandler_tty.cc b/winsup/cygwin/fhandler_tty.cc
index c425c158b..36262dbb9 100644
--- a/winsup/cygwin/fhandler_tty.cc
+++ b/winsup/cygwin/fhandler_tty.cc
@@ -58,7 +58,7 @@ struct pipe_reply {
 };
 
 /* Mutex for AttachConsole()/FreeConsole() */
-HANDLE attach_mutex;
+HANDLE attach_mutex = CreateMutex (&sec_none, FALSE, NULL);
 
 bool __acquire_attach_mutex (const char *fn, int ln, DWORD t)
 {
@@ -77,8 +77,6 @@ void __release_attach_mutex (const char *fn, int ln)
   assert(r);
 }
 
-static LONG master_cnt = 0;
-
 inline static bool pcon_pid_alive (DWORD pid);
 
 DWORD
@@ -2122,8 +2120,6 @@ fhandler_pty_master::close ()
 	  master_fwd_thread->terminate_thread ();
 	}
     }
-  if (InterlockedDecrement (&master_cnt) == 0)
-    CloseHandle (attach_mutex);
 
   /* Check if the last master handle has been closed.  If so, set
      input_available_event to wake up potentially waiting slaves. */
@@ -3002,9 +2998,6 @@ fhandler_pty_master::setup ()
   if (!(pcon_mutex = CreateMutex (&sa, FALSE, buf)))
     goto err;
 
-  if (InterlockedIncrement (&master_cnt) == 1)
-    attach_mutex = CreateMutex (&sa, FALSE, NULL);
-
   /* Create master control pipe which allows the master to duplicate
      the pty pipe handles to processes which deserve it. */
   __small_sprintf (buf, "\\\\.\\pipe\\cygwin-%S-pty%d-master-ctl",
-- 
2.32.0


Unfortunately I seem to have another deadlock when using this patch.
With the above patch applied, if I just open bash and hold down ^C, it
will deadlock pretty quickly.  It's possible this is an existing problem
that isn't normally noticed because bash never opens the PT.

The threads involved are:

Thread 1 (Thread 14700.0x3804):
#0  0x00007ffca30ed8c4 in ntdll!ZwWaitForMultipleObjects () from /cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll
#1  0x00007ffca0dbcb20 in WaitForMultipleObjectsEx () from /cygdrive/c/WINDOWS/System32/KERNELBASE.dll
#2  0x00007ffca0dbca1e in WaitForMultipleObjects () from /cygdrive/c/WINDOWS/System32/KERNELBASE.dll
#3  0x000000018012f89e in select_stuff::wait (this=0xffffb4b0, readfds=0xffffb720, writefds=0xffffb620, exceptfds=0xffffb630, us=-1) at ../../../../winsup/cygwin/select.cc:419
#4  0x000000018012ee17 in select (maxfds=1, readfds=0xffffb720, writefds=0xffffb620, exceptfds=0xffffb630, us=-1) at ../../../../winsup/cygwin/select.cc:192
#5  0x000000018012e967 in pselect (maxfds=1, readfds=0xffffb720, writefds=0x0, exceptfds=0x0, to=0x0, set=0x3fd45b890 <cygreadline7!_rl_orig_sigset>) at ../../../../winsup/cygwin/select.cc:120
#6  0x00000001801a761b in _sigfe () at sigfe.s:37
#7  0x00000003fd43e698 in rl_getc () from /usr/bin/cygreadline7.dll
#8  0x00000003fd43eecc in rl_read_key () from /usr/bin/cygreadline7.dll
#9  0x00000003fd421ef3 in readline_internal_char () from /usr/bin/cygreadline7.dll
#10 0x00000003fd422905 in readline () from /usr/bin/cygreadline7.dll
#11 0x000000010040265d in reader_loop ()
#12 0x0000000100404723 in decode_prompt_string ()
#13 0x00000001004074ac in read_secondary_line ()
#14 0x000000010040aa97 in yyparse ()
#15 0x0000000100401b92 in parse_command ()
#16 0x0000000100401c96 in read_command ()
#17 0x0000000100401ec4 in reader_loop ()
#18 0x000000010047c3ba in main ()

Thread 6 (Thread 14700.0x1e9c):
#0  0x00007ffca30ecdf4 in ntdll!ZwWaitForSingleObject () from /cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll
#1  0x00007ffca0d91a5e in WaitForSingleObjectEx () from /cygdrive/c/WINDOWS/System32/KERNELBASE.dll
#2  0x00000001800c13b8 in __acquire_attach_mutex (fn=0x1802c59b8 <fhandler_console::MAX_WRITE_CHARS+1076> "virtual int fhandler_console::tcflush(int)", ln=1574, t=4294967295) at ../../../../winsup/cygwin/fhandler_tty.cc:67
#3  0x0000000180074fa7 in fhandler_console::tcflush (this=0x1803b6760, queue=0) at ../../../../winsup/cygwin/fhandler_console.cc:1574
#4  0x00000001800c025c in fhandler_termios::sigflush (this=0x1803b6760) at ../../../../winsup/cygwin/fhandler_termios.cc:503
#5  0x00000001800bec92 in tty_min::kill_pgrp (this=0x180000000, sig=2) at ../../../../winsup/cygwin/fhandler_termios.cc:131
#6  0x0000000180070a62 in fhandler_console::cons_master_thread (p=0x26acb70, ttyp=0x180000000) at ../../../../winsup/cygwin/fhandler_console.cc:253
#7  0x0000000180070555 in cons_master_thread (arg=0x1803b63e0) at ../../../../winsup/cygwin/fhandler_console.cc:173
#8  0x000000018004657c in cygthread::callfunc (this=0x180278278 <threads+88>, issimplestub=false) at ../../../../winsup/cygwin/cygthread.cc:48
#9  0x0000000180046724 in cygthread::stub (arg=0x180278278 <threads+88>) at ../../../../winsup/cygwin/cygthread.cc:91
#10 0x00000001800473b6 in _cygtls::call2 (this=0x26ace00, func=0x180046582 <cygthread::stub(void*)>, arg=0x180278278 <threads+88>, buf=0x26acce0) at ../../../../winsup/cygwin/cygtls.cc:55
#11 0x000000018004735b in _cygtls::call (func=0x180046582 <cygthread::stub(void*)>, arg=0x180278278 <threads+88>) at ../../../../winsup/cygwin/cygtls.cc:42
#12 0x00000001800df5b2 in threadfunc_fe (arg=0x180278278 <threads+88>) at ../../../../winsup/cygwin/init.cc:30
#13 0x00007ffca2ae7034 in KERNEL32!BaseThreadInitThunk () from /cygdrive/c/WINDOWS/System32/KERNEL32.DLL
#14 0x00007ffca30a2651 in ntdll!RtlUserThreadStart () from /cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll
#15 0x0000000000000000 in ?? ()

Thread 9 (Thread 14700.0x183c):
#0  0x00007ffca30ecdf4 in ntdll!ZwWaitForSingleObject () from /cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll
#1  0x00007ffca0d91a5e in WaitForSingleObjectEx () from /cygdrive/c/WINDOWS/System32/KERNELBASE.dll
#2  0x000000018007bb0c in fhandler_console::__acquire_input_mutex (this=0x1803b7420, fn=0x1802e1a20 <cw_std_mask+1536> "int peek_console(select_record*, bool)", ln=1133, ms=4294967295) at ../../../../winsup/cygwin/fhandler_console.cc:3726
#3  0x0000000180131b3f in peek_console (me=0x8000ac530) at ../../../../winsup/cygwin/select.cc:1133
#4  0x0000000180131ce3 in thread_console (arg=0x800075610) at ../../../../winsup/cygwin/select.cc:1173
#5  0x000000018004657c in cygthread::callfunc (this=0x1802782d0 <threads+176>, issimplestub=false) at ../../../../winsup/cygwin/cygthread.cc:48
#6  0x0000000180046724 in cygthread::stub (arg=0x1802782d0 <threads+176>) at ../../../../winsup/cygwin/cygthread.cc:91
#7  0x00000001800473b6 in _cygtls::call2 (this=0x2cbce00, func=0x180046582 <cygthread::stub(void*)>, arg=0x1802782d0 <threads+176>, buf=0x2cbcce0) at ../../../../winsup/cygwin/cygtls.cc:55
#8  0x000000018004735b in _cygtls::call (func=0x180046582 <cygthread::stub(void*)>, arg=0x1802782d0 <threads+176>) at ../../../../winsup/cygwin/cygtls.cc:42
#9  0x00000001800df5b2 in threadfunc_fe (arg=0x1802782d0 <threads+176>) at ../../../../winsup/cygwin/init.cc:30
#10 0x00007ffca2ae7034 in KERNEL32!BaseThreadInitThunk () from /cygdrive/c/WINDOWS/System32/KERNEL32.DLL
#11 0x00007ffca30a2651 in ntdll!RtlUserThreadStart () from /cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll
#12 0x0000000000000000 in ?? ()


Thread 6 holds input_mutex from cons_master_thread, and thread 9 holds
attach_mutex from peek_console. Perhaps it would make sense for
peek_console to hold attach_mutex only around the call to
PeekConsoleInputW?
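
A rough sketch of the ordering I have in mind, again as a portable
pthreads analogue rather than Cygwin code. The names (input_lock,
attach_lock, flush_path, peek_path, run_ordered_lock_demo) are
hypothetical stand-ins, and it assumes the real call sites can be
rearranged this way. Both paths take the input lock first, and the peek
path holds the attach lock only around the one call that needs it, so
the two locks are never taken in opposite orders.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for input_mutex and attach_mutex. */
static pthread_mutex_t input_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t attach_lock = PTHREAD_MUTEX_INITIALIZER;

/* Both counters are only touched while attach_lock is held. */
static long peeks, flushes;

static void lock_checked (pthread_mutex_t *m)
{
  int r = pthread_mutex_lock (m);
  assert (r == 0);
}

static void unlock_checked (pthread_mutex_t *m)
{
  int r = pthread_mutex_unlock (m);
  assert (r == 0);
}

/* Stand-in for the signal/flush path: input lock, then attach lock. */
static void *flush_path (void *arg)
{
  long n = *(long *) arg;
  for (long i = 0; i < n; i++)
    {
      lock_checked (&input_lock);
      lock_checked (&attach_lock);
      flushes++;
      unlock_checked (&attach_lock);
      unlock_checked (&input_lock);
    }
  return NULL;
}

/* Stand-in for the peek path: same order, with the attach lock held
   only around the single call that needs the console attached. */
static void *peek_path (void *arg)
{
  long n = *(long *) arg;
  for (long i = 0; i < n; i++)
    {
      lock_checked (&input_lock);
      lock_checked (&attach_lock);
      peeks++;  /* where the PeekConsoleInputW call would go */
      unlock_checked (&attach_lock);
      /* remaining work here would need only input_lock */
      unlock_checked (&input_lock);
    }
  return NULL;
}

/* Runs both paths concurrently; returns 0 if both completed n rounds. */
int run_ordered_lock_demo (long n)
{
  pthread_t a, b;
  peeks = flushes = 0;
  if (pthread_create (&a, NULL, flush_path, &n) != 0)
    return -1;
  if (pthread_create (&b, NULL, peek_path, &n) != 0)
    {
      pthread_join (a, NULL);
      return -1;
    }
  pthread_join (a, NULL);
  pthread_join (b, NULL);
  return (peeks == n && flushes == n) ? 0 : 1;
}
```

Swapping the two lock calls in peek_path reproduces the inversion shown
in the backtraces above, where each thread holds one lock and waits
forever for the other.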



* Re: deadlock on console mutex in gdb
  2021-12-23 15:32       ` David McFarland
  2021-12-23 19:28         ` David McFarland
@ 2022-01-13 10:56         ` Takashi Yano
  1 sibling, 0 replies; 9+ messages in thread
From: Takashi Yano @ 2022-01-13 10:56 UTC (permalink / raw)
  To: cygwin-developers

Sorry for being absent for a long time.

On Thu, 23 Dec 2021 11:32:51 -0400
David McFarland wrote:
> Takashi Yano <takashi.yano@nifty.ne.jp> writes:
> 
> > I think there is no other way than not to wait on the mutex in
> > the debugger process. If anyone has any other ideas, please let
> > me know.
> 
> I have a few thoughts:
> 
> I believe gdb currently just uses windows debugging APIs, so it doesn't
> treat cygwin as an operating system. On other operating systems there
> would be some mechanism to prevent kernel mutexes from deadlocking. It
> obviously can't just wait for all syscalls to complete, but it must
> avoid blocking kernel tasks that hold mutexes. Perhaps we could do
> something similar on cygwin, but it would probably mean providing
> cygwin-specific debugging APIs and support in gdb. I'd have to do some
> more research into how this works on e.g. Linux.
> 
> Can we avoid using inter-process mutexes for this? What would you expect
> to break if we just used a per-process mutex?

I think we need an inter-process mutex. Otherwise, write() calls
from multiple processes will not be atomic.
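
As a rough illustration of the difference, here is a portable POSIX
analogue, not Cygwin code. The names shared_state, bump and
shared_mutex_demo are hypothetical. A mutex placed in shared memory and
marked PTHREAD_PROCESS_SHARED serializes a parent and its forked child,
whereas an ordinary per-process mutex would only serialize threads
within one process.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared region: one lock plus the data it protects. */
struct shared_state
{
  pthread_mutex_t lock;
  long counter;
};

/* Increment the shared counter n times, one lock round trip each. */
static void bump (struct shared_state *s, long n)
{
  for (long i = 0; i < n; i++)
    {
      pthread_mutex_lock (&s->lock);
      s->counter++;
      pthread_mutex_unlock (&s->lock);
    }
}

/* Parent and child each add n under a process-shared mutex.
   Returns the final counter value, or -1 on setup failure. */
long shared_mutex_demo (long n)
{
  struct shared_state *s = mmap (NULL, sizeof *s, PROT_READ | PROT_WRITE,
                                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (s == MAP_FAILED)
    return -1;

  pthread_mutexattr_t attr;
  int r = pthread_mutexattr_init (&attr);
  assert (r == 0);
  r = pthread_mutexattr_setpshared (&attr, PTHREAD_PROCESS_SHARED);
  assert (r == 0);
  r = pthread_mutex_init (&s->lock, &attr);
  assert (r == 0);
  pthread_mutexattr_destroy (&attr);
  s->counter = 0;

  pid_t pid = fork ();
  if (pid < 0)
    {
      munmap (s, sizeof *s);
      return -1;
    }
  if (pid == 0)
    {
      bump (s, n);  /* child */
      _exit (0);
    }
  bump (s, n);      /* parent */
  waitpid (pid, NULL, 0);

  long total = s->counter;
  pthread_mutex_destroy (&s->lock);
  munmap (s, sizeof *s);
  return total;
}
```

With the shared lock, every increment from both processes is counted. A
mutex private to each process would leave the two processes free to
interleave inside the critical section.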

> If we do need an inter-process mutex, perhaps we could have a daemon
> process responsible for it? I think that would be a bit of a departure
> for cygwin.

I cannot imagine how to implement this...

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


* Re: deadlock on console mutex in gdb
  2021-12-23 19:28         ` David McFarland
  2021-12-26 15:25           ` David McFarland
@ 2022-01-13 11:09           ` Takashi Yano
  1 sibling, 0 replies; 9+ messages in thread
From: Takashi Yano @ 2022-01-13 11:09 UTC (permalink / raw)
  To: cygwin-developers

On Thu, 23 Dec 2021 15:28:58 -0400
David McFarland wrote:
> I've also been hitting another console related deadlock, even when
> there's no debugger involved:
[...]
> I can use this program:
> 
> =====test-pt.c
> #include <pthread.h>
> #include <stdio.h>
> #include <unistd.h>
> #include <fcntl.h>
> #include <stdlib.h>
> 
> void *thread(void* p) {
>         for (int i = 0;; ++i) {
>                 int fd = posix_openpt(O_RDWR | O_NOCTTY);
>                 close(fd);
>                 fprintf(stderr, "test %i\n", i);
>         }
>         return 0;
> }
> 
> int main() {
>         pthread_t ids[3];
>         for (int i = 0; i < 3; ++i) {
>                 pthread_create(&ids[i], 0, &thread, &ids[i]);
>         }
>         for (int i = 0; i < 3; ++i) {
>                 pthread_join(ids[i], 0);
>         }
>         return 0;
> }
> =====
> 
> compiled with:
> 
> $ gcc -D_GNU_SOURCE test-pt.c -o test-pt
> 
> To hit those asserts.

Thanks for the report. I tried your test case, but I ran into
another pty problem: the test case causes a memory leak and
exhausts system memory in a short time.

I will submit a patch for the memory leak issue.

> Perhaps the mutex should never be closed? It's also worrying that
> acquiring the mutex just silently fails before a PT is opened.
> There could be a thread that is acting as if it has a lock at the time
> that the mutex is created, and it will even try to release the lock.
> 
> It might just be safest to create it at process start, but it could have
> a significant performance cost.

With the patch above, I tried your test case again and
confirmed that the assertion does indeed fail. I also
confirmed the deadlock.

Thanks for pointing this out. Let me think about it.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


end of thread, other threads:[~2022-01-13 11:10 UTC | newest]

Thread overview: 9+ messages
2021-12-22 15:26 deadlock on console mutex in gdb David McFarland
2021-12-22 20:44 ` Takashi Yano
2021-12-22 22:17   ` David McFarland
2021-12-23  9:24     ` Takashi Yano
2021-12-23 15:32       ` David McFarland
2021-12-23 19:28         ` David McFarland
2021-12-26 15:25           ` David McFarland
2022-01-13 11:09           ` Takashi Yano
2022-01-13 10:56         ` Takashi Yano
