From: Alexey Izbyshev
To: Takashi Yano
Cc: cygwin@cygwin.com
Date: Sat, 16 Apr 2022 16:21:34 +0300
Subject: Re: Deadlock of the process tree when running make
Message-ID: <45f9160a597b25bc576eb153a138fb88@ispras.ru>
In-Reply-To: <20220416183910.b532b2cc95725b508bfd0991@nifty.ne.jp>
List-Id: General Cygwin discussions and problem reports

On 2022-04-16 12:39, Takashi Yano wrote:
> I am not sure yet what is essential, but the current code closes
> pseudo console only if there is no other process which is attaching
> to the pseudo console. I wonder why javac.exe is remaining as
> zombie. The parent bash.exe calls ClosePseudoConsole() when
> child non-cygwin app is terminated, i.e., after WaitForSingleObject()
> for child process handle returns.
> https://www.cygwin.com/git/?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/spawn.cc;h=81dba5a941e919ea2514013069aef22c6fad8004;hb=7ac0767053e278f0ce9811bf6f77278bd2f49c20#l1009
>
> What does the "zombie" mean? Is it listed in the process list of
> ProcessHacker? I still suspect that the zombie javac.exe holds
> the hWritePipe handle leaked from parent bash.exe.
>

By "zombie" I meant the same thing as in the Linux kernel: a data structure that remains after a process has terminated but has not yet been waited for (I don't know how this is implemented in Cygwin). So there is no javac.exe process in ProcessHacker, but "ps" and similar tools in Cygwin still list "javac".

I'm now trying to create a small reproducer that I can share, and I had a first small success last night: I got a very similar hang with a simple Makefile and a script on Cygwin 3.3.4.
Here is the tree:

make(14479)-+-bash(14484)---bash(14611)
            |-bash(14515)---bash(14618)
            |-bash(14491)---bash(14500)---bash(14612)
            |-bash(14501)---bash(14510)---bash(14605)
            |-bash(14505)---bash(14607)
            |-bash(14494)---bash(14617)
            |-bash(14506)---bash(14513)---bash(14610)
            |-bash(14512)---bash(14518)---bash(14615)
            |-bash(14486)---bash(14495)---bash(14606)
            |-bash(14483)---bash(14490)---bash(14609)
            |-bash(14509)---bash(14614)
            |-bash(14489)---bash(14608)
            |-bash(14499)---bash(14613)
            |-bash(14481)---bash(14485)---python(14588)
            |-bash(14496)---bash(14504)---bash(14616)
            `-bash(14482)---bash(14604)

"python" is a zombie, just as "javac" is in the original case. There is also a single "conhost.exe" again, and all 5 of its threads are doing the same things as in the original case (including the signal pipe thread trying to EnterCriticalSection()). The only difference is that the leaf bash.exe processes are trying to acquire the pcon mutex at a different point [1], but I guess this difference is not important.

I'll try this reproducer with your patched DLL as well as on another machine and share it in case of success.

Thanks,
Alexey

[1] https://www.cygwin.com/git?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/spawn.cc;h=81dba5a941e919ea2514013069aef22c6fad8004;hb=cygwin-3_3_4-release#l697
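
For reference, the "zombie" state in the POSIX sense used in this message can be shown with a minimal Python sketch. It only illustrates the general definition (a terminated child whose parent has not yet waited for it) and says nothing about how Cygwin tracks processes internally. The function name make_zombie_and_reap and the exit code 7 are made up for illustration, and reading /proc/<pid>/stat assumes a Linux-style procfs is mounted; without it the state is simply reported as None.

```python
import os
import time


def make_zombie_and_reap():
    """Fork a child that exits at once, observe it before waiting, then reap it.

    Returns (state_before_wait, exit_code). state_before_wait is the state
    letter from /proc/<pid>/stat if that file is readable, otherwise None.
    """
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: terminate immediately (7 is an arbitrary code)

    # Parent: poll the child's state, but do NOT wait for it yet.
    state = None
    stat_path = f"/proc/{pid}/stat"  # assumption: Linux-style procfs
    deadline = time.monotonic() + 5.0
    while time.monotonic() < deadline:
        try:
            with open(stat_path) as f:
                # Layout is "pid (comm) state ..."; split after the last ')'.
                state = f.read().rsplit(")", 1)[1].split()[0]
        except OSError:
            state = None  # no procfs here; skip the observation
            break
        if state == "Z":  # terminated, not yet waited for
            break
        time.sleep(0.01)

    # waitpid() collects the exit status and removes the zombie entry.
    _, status = os.waitpid(pid, 0)
    return state, os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(make_zombie_and_reap())
```

Until waitpid() is called, process-listing tools keep showing the child even though it no longer runs any code, which matches the "python" and "javac" entries described above.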