From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qk1-x72b.google.com (mail-qk1-x72b.google.com [IPv6:2607:f8b0:4864:20::72b]) by sourceware.org (Postfix) with ESMTPS id 00829385840B for ; Thu, 23 Dec 2021 19:29:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 00829385840B Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-qk1-x72b.google.com with SMTP id m2so4792543qkd.8 for ; Thu, 23 Dec 2021 11:29:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=4U5W/+IDhxN1znM6R4EdQs4Tl+p/kdTuMGRynLj8PCQ=; b=nMA2+ub6KwXK02dU7yGADEBvJNsgzTa6Cm/RerICiB8fa/S2P0CRqQWZ5X2riLd721 0PdVZlOMY6d+W7bk1CHRFt3Aa2WyQsiRjqU9Hv6olg71OUFfNJnr7L5vqgHXh2zW8VWt aJZj+O2S+vqUYVu8/13TdOSl/LjXRuebbKxKKBpU87Ivpi3dX9FdMNw4QzKzhaLRqU11 +REUO2JghMOM2ETiR3G+Q2PCw8PX8of+a6+w4zzFTklqL0LSSB+raCcSRA3VagAtr1qE njXuwGVXypGFrBHhP6VziuvxszTKW2EP2Q86IfM5y14WLAEOBEDNDOcpA4IzyCeI54Cl rJYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:references:date:in-reply-to :message-id:user-agent:mime-version; bh=4U5W/+IDhxN1znM6R4EdQs4Tl+p/kdTuMGRynLj8PCQ=; b=fcQmN0Epfg7W1jCN61zq7vPj+OqIihzulV2n+IXnpPfc+tC1JBzUPgkWQ2jdQYrJEB LZVFk9VDPiibddtPxcXfObnl9mbkg+I7bFx4zohX73SmTcxAecJ9myVOr0Oo/+H11KpI ZMhyKrTQc2o3+93RlLgI+1VEm10FwWEM73EPplbGgxlACmehV5zMFEUlx/tLhHw4XRYb O2H1/DHdR+uP7IJXPON0vBdOAW1xg3kZQ4cbKfzgf51oFt++M/j94HofZh65PeFJUgQs nyxg+cpgq6vo1+t2Wuw8PcQbij5+XV+sg+bFAwziYywXqJ23TCvKBVZD386giIDK5dfe 4s6A== X-Gm-Message-State: AOAM532Om3eXXnfLfqzSrMDRGE+tXhVaipDRrauj0Vtt5ypWpNzSz0lA hdjY24HD/fjvkcKDrin407qApdCGVMQ= X-Google-Smtp-Source: ABdhPJyRc74IkmhliYOOpelBoQmjMqVSsaaL4Kq/GqNJ8fY+e00ZtyJdkhxG5LoIboSHpwYBVtskTA== X-Received: by 2002:a37:513:: with SMTP id 19mr2478939qkf.669.1640287740240; Thu, 23 Dec 2021 11:29:00 -0800 (PST) Received: from davidm-laptop (hlfxns018gw-134-41-172-106.dhcp-dynamic.fibreop.ns.bellaliant.net. [134.41.172.106]) by smtp.gmail.com with ESMTPSA id y12sm5117512qko.36.2021.12.23.11.28.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Dec 2021 11:28:59 -0800 (PST) From: David McFarland To: Takashi Yano Cc: cygwin-developers@cygwin.com Subject: Re: deadlock on console mutex in gdb References: <20211223054409.610242feb0e96199904b3c9e@nifty.ne.jp> <20211223182440.e37115a4493516aca7ed7873@nifty.ne.jp> <87mtkribfg.fsf@gmail.com> Date: Thu, 23 Dec 2021 15:28:58 -0400 In-Reply-To: <87mtkribfg.fsf@gmail.com> (David McFarland's message of "Thu, 23 Dec 2021 11:32:51 -0400") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (cygwin) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-9.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, WEIRD_PORT autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: cygwin-developers@cygwin.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Cygwin core component developers mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 23 Dec 2021 19:29:03 -0000 I've also been hitting another console related deadlock, even when there's no debugger involved: #0 0x00007ffca30ecdf4 in ntdll!ZwWaitForSingleObject () from /cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll #1 0x00007ffca0d91a5e in WaitForSingleObjectEx () from /cygdrive/c/WINDOWS/System32/KERNELBASE.dll #2 0x0000000180075f5c in acquire_attach_mutex (t=4294967295) at ../../../../winsup/cygwin/fhandler_console.cc:66 #3 fhandler_console::write (this=0x180375e50, vsrc=0x804f229c8, len=94) at ../../../../winsup/cygwin/fhandler_console.cc:3112 #4 0x0000000180145c68 in write (fd=2, ptr=0x804f229c8, len=94) at ../../../../winsup/cygwin/syscalls.cc:1360 #5 0x000000018019442b in _sigfe () at sigfe.s:37 #6 0x000000046b6c24eb in nix::writeFull (fd=2, s="waiting for lock on \033[35;1m'/nix/store/lqcj88cy92w6mjsqicjmgbfj0dnggvdw-libiconv-1.16'\033[0m...\n", allowInterrupts=false) at src/libutil/util.cc:661 #7 0x000000046b6b43a5 in nix::writeToStderr (s="waiting for lock on \033[35;1m'/nix/store/lqcj88cy92w6mjsqicjmgbfj0dnggvdw-libiconv-1.16'\033[0m...\n") at src/libutil/logging.cc:119 #8 0x000000046b6d640d in nix::SimpleLogger::log (this=0x8000182f0, lvl=nix::lvlWarn, fs=...) at src/libutil/logging.cc:74 #9 0x000000046b6d6239 in nix::SimpleLogger::startActivity (this=0x8000182f0, act=8156142895253, lvl=nix::lvlWarn, type=nix::actBuildWaiting, s="waiting for lock on \033[35;1m'/nix/store/lqcj88cy92w6mjsqicjmgbfj0dnggvdw-libiconv-1.16'\033[0m", fields=std::vector of length 0, capacity 0, parent=0) at src/libutil/logging.cc:90 #10 0x000000046b6b449f in nix::Activity::Activity (this=0x804d470c0, logger=..., lvl=nix::lvlWarn, type=nix::actBuildWaiting, s="waiting for lock on \033[35;1m'/nix/store/lqcj88cy92w6mjsqicjmgbfj0dnggvdw-libiconv-1.16'\033[0m", fields=std::vector of length 0, capacity 0, parent=0) at src/libutil/logging.cc:139 #11 0x00000004f52dc061 in std::make_unique () at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/unique_ptr.h:962 #12 0x00000004f508fbbf in nix::DerivationGoal::tryToBuild (this=0x804e93fc0) at src/libstore/build/derivation-goal.cc:577 #13 0x00000004f508b2e7 in nix::DerivationGoal::work (this=0x804e93fc0) at src/libstore/build/derivation-goal.cc:144 #14 0x00000004f50b87e4 in nix::Worker::run (this=0xffff97f0, _topGoals=std::set with 1 element = {...}) at src/libstore/build/worker.cc:268 #15 0x00000004f5098c52 in nix::Store::buildPaths (this=0x8000ae2d8, reqs=std::vector of length 1, capacity 1 = {...}, buildMode=nix::bmNormal, evalStore=std::shared_ptr (use count 5, weak count 1) = {...}) at src/libstore/build/entry-points.cc:24 #16 0x0000000100406243 in operator() (__closure=0xffffa6d0, paths0=std::vector of length 1, capacity 1 = {...}) at src/nix-build/nix-build.cc:345 #17 0x000000010040b361 in main_nix_build (argc=4, argv=0xffffcc60) at src/nix-build/nix-build.cc:574 #18 0x00000001005f5916 in std::__invoke_impl (__f=@0xffffbe30: 0x1004063d7 ) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/invoke.h:60 #19 0x00000001005e9a83 in std::__invoke_r (__fn=@0xffffbe30: 0x1004063d7 ) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/invoke.h:110 #20 0x000000010056b31b in std::_Function_handler::_M_invoke(std::_Any_data const&, int&&, char**&&) (__functor=..., __args#0=@0xffffbde8: 4, __args#1=@0xffffbdf0: 0xffffcc60) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/std_function.h:291 #21 0x0000000100517d8c in std::function::operator()(int, char**) const (this=0xffffbe30, __args#0=4, __args#1=0xffffcc60) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/std_function.h:622 #22 0x000000010044e248 in nix::mainWrapped (argc=4, argv=0xffffcc60) at src/nix/main.cc:277 #23 0x000000010044f54d in operator() (__closure=0xffffcbc0) at src/nix/main.cc:392 #24 0x00000001004508a2 in std::__invoke_impl&>(std::__invoke_other, struct {...} &) (__f=...) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/invoke.h:60 #25 0x00000001004503d7 in std::__invoke_r&>(struct {...} &) (__fn=...) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/invoke.h:110 #26 0x000000010044fdc0 in std::_Function_handler >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/std_function.h:291 #27 0x00000004473816a2 in std::function::operator()() const (this=0xffffcbc0) at /usr/lib/gcc/x86_64-pc-cygwin/10/include/c++/bits/std_function.h:622 #28 0x000000044736a686 in nix::handleExceptions(std::string const&, std::function) (programName="test/bin/nix-build", fun=...) at src/libmain/shared.cc:373 #29 0x000000010044f5d3 in main (argc=4, argv=0xffffcc60) at src/nix/main.cc:391 This one is much harder to reproduce, but I think what's happening is: - PT is opened with posix_openpt() - attach_mutex is created: if (InterlockedIncrement (&master_cnt) == 1) attach_mutex = CreateMutex (&sa, FALSE, NULL); - PT is closed by close() - attach_mutex is destroyed: if (InterlockedDecrement (&master_cnt) == 0) CloseHandle (attach_mutex); - the handle in attach_mutex is reused for another object - fhandler::write is called on stdout/err - WaitForSingleObject is called on attach_mutex, but it's pointing to something else, which deadlocks While I don't have a solid repro for the deadlock, if I apply a patch like this: >From 926d05a1211fa8e5a7fc1cccdbe89e5c980dfad1 Mon Sep 17 00:00:00 2001 From: David McFarland Date: Thu, 23 Dec 2021 15:15:11 -0400 Subject: [PATCH] tty: add helper for attach_mutex --- winsup/cygwin/fhandler.h | 9 +++++++++ winsup/cygwin/fhandler_console.cc | 17 ----------------- winsup/cygwin/fhandler_tty.cc | 30 +++++++++++++++++++++++++----- winsup/cygwin/select.cc | 16 ---------------- 4 files changed, 34 insertions(+), 38 deletions(-) diff --git a/winsup/cygwin/fhandler.h b/winsup/cygwin/fhandler.h index 4f70c4c0b..48d7107e5 100644 --- a/winsup/cygwin/fhandler.h +++ b/winsup/cygwin/fhandler.h @@ -1881,6 +1881,15 @@ class fhandler_serial: public fhandler_base #define release_output_mutex() \ __release_output_mutex (__PRETTY_FUNCTION__, __LINE__) +bool __acquire_attach_mutex (const char *fn, int ln, DWORD t); +void __release_attach_mutex (const char *fn, int ln); + +#define acquire_attach_mutex(ms) \ + __acquire_attach_mutex (__PRETTY_FUNCTION__, __LINE__, ms) + +#define release_attach_mutex() \ + __release_attach_mutex (__PRETTY_FUNCTION__, __LINE__) + class tty; class tty_min; class fhandler_termios: public fhandler_base diff --git a/winsup/cygwin/fhandler_console.cc b/winsup/cygwin/fhandler_console.cc index 4c98b5355..ca2d5c74e 100644 --- a/winsup/cygwin/fhandler_console.cc +++ b/winsup/cygwin/fhandler_console.cc @@ -56,23 +56,6 @@ fhandler_console::console_state NO_COPY *fhandler_console::shared_console_info; bool NO_COPY fhandler_console::invisible_console; -/* Mutex for AttachConsole()/FreeConsole() in fhandler_tty.cc */ -HANDLE attach_mutex; - -static inline void -acquire_attach_mutex (DWORD t) -{ - if (attach_mutex) - WaitForSingleObject (attach_mutex, t); -} - -static inline void -release_attach_mutex () -{ - if (attach_mutex) - ReleaseMutex (attach_mutex); -} - /* con_ra is shared in the same process. Only one console can exist in a process, therefore, static is suitable. */ static struct fhandler_base::rabuf_t con_ra; diff --git a/winsup/cygwin/fhandler_tty.cc b/winsup/cygwin/fhandler_tty.cc index c8ad53cb7..c425c158b 100644 --- a/winsup/cygwin/fhandler_tty.cc +++ b/winsup/cygwin/fhandler_tty.cc @@ -28,6 +28,7 @@ details. */ #include "registry.h" #include "tls_pbuf.h" #include "winf.h" +#include #ifndef PROC_THREAD_ATTRIBUTE_PSEUDOCONSOLE #define PROC_THREAD_ATTRIBUTE_PSEUDOCONSOLE 0x00020016 @@ -56,7 +57,26 @@ struct pipe_reply { DWORD error; }; -extern HANDLE attach_mutex; /* Defined in fhandler_console.cc */ +/* Mutex for AttachConsole()/FreeConsole() */ +HANDLE attach_mutex; + +bool __acquire_attach_mutex (const char *fn, int ln, DWORD t) +{ + if (!attach_mutex) + return false; + DWORD res = WaitForSingleObject (attach_mutex, t); + assert(res == WAIT_TIMEOUT || res == WAIT_OBJECT_0); + return res == WAIT_OBJECT_0; +} + +void __release_attach_mutex (const char *fn, int ln) +{ + if (!attach_mutex) + return; + BOOL r = ReleaseMutex (attach_mutex); + assert(r); +} + static LONG master_cnt = 0; inline static bool pcon_pid_alive (DWORD pid); @@ -519,13 +539,13 @@ fhandler_pty_master::accept_input () { /* Slave attaches to a different console than master. Therefore reattach here. */ - WaitForSingleObject (attach_mutex, INFINITE); + acquire_attach_mutex (INFINITE); FreeConsole (); AttachConsole (target_pid); cp_to = GetConsoleCP (); FreeConsole (); AttachConsole (resume_pid); - ReleaseMutex (attach_mutex); + release_attach_mutex (); } else cp_to = GetConsoleCP (); @@ -2827,13 +2847,13 @@ fhandler_pty_master::pty_master_fwd_thread (const master_fwd_thread_param_t *p) { /* Slave attaches to a different console than master. Therefore reattach here. */ - WaitForSingleObject (attach_mutex, INFINITE); + acquire_attach_mutex (INFINITE); FreeConsole (); AttachConsole (target_pid); cp_from = GetConsoleOutputCP (); FreeConsole (); AttachConsole (resume_pid); - ReleaseMutex (attach_mutex); + release_attach_mutex (); } else cp_from = GetConsoleOutputCP (); diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc index a2868abd0..95edc10fe 100644 --- a/winsup/cygwin/select.cc +++ b/winsup/cygwin/select.cc @@ -1095,22 +1095,6 @@ fhandler_fifo::select_except (select_stuff *ss) return s; } -extern HANDLE attach_mutex; /* Defined in fhandler_console.cc */ - -static inline void -acquire_attach_mutex (DWORD t) -{ - if (attach_mutex) - WaitForSingleObject (attach_mutex, t); -} - -static inline void -release_attach_mutex () -{ - if (attach_mutex) - ReleaseMutex (attach_mutex); -} - static int peek_console (select_record *me, bool) { -- 2.32.0 I can use this program: =====test-pt.c #include #include #include #include #include void *thread(void* p) { for (int i = 0;; ++i) { int fd = posix_openpt(O_RDWR | O_NOCTTY); close(fd); fprintf(stderr, "test %i\n", i); } return 0; } int main() { pthread_t ids[3]; for (int i = 0; i < 3; ++i) { pthread_create(&ids[i], 0, &thread, &ids[i]); } for (int i = 0; i < 3; ++i) { pthread_join(ids[i], 0); } return 0; } ===== compiled with: $ gcc -D_GNU_SOURCE test-pt.c -o test-pt To hit those asserts. Perhaps the mutex should never be closed? It's also worrying that acquiring the mutex just silently fails before a PT is opened. There could be a thread that is acting as if it has a lock a the time that the mutex is created, and it will even try to release the lock. It might just be safest to create it at process start, but it could have a significant performance cost.