From: Takashi Yano <takashi.yano@nifty.ne.jp>
To: cygwin@cygwin.com
Cc: Alexey Izbyshev <izbyshev@ispras.ru>
Subject: Re: Deadlock of the process tree when running make
Date: Sat, 16 Apr 2022 18:39:10 +0900 [thread overview]
Message-ID: <20220416183910.b532b2cc95725b508bfd0991@nifty.ne.jp> (raw)
In-Reply-To: <f25d76d5897f60ab1a5a52bd0dffd484@ispras.ru>
On Thu, 14 Apr 2022 02:17:38 +0300
Alexey Izbyshev wrote:
> On 2022-04-13 19:48, Alexey Izbyshev wrote:
> > On 2022-04-11 13:10, Alexey Izbyshev wrote:
> > What's probably not normal is the behavior of the hanging conhost.exe.
> > I've compared the points where conhost.exe is blocked, and all but one
> > threads in the model case are doing the same things as in the hanging
> > case, but the remaining thread is blocked in
> > ReadFile("\Device\NamedPipe\") (i.e. the read end of "hWritePipe" of
> > pcon) instead of trying to enter a critical section like thread 1
> > above. So now I'm starting to doubt that it's a cygwin bug and not
> > some conhost.exe bug.
> >
> > I'll try to poke around the hanging conhost.exe some more, and also
> > may be will try to create a faster reproducer.
> >
> I've studied conhost.exe hang, and it indeed looks like it's buggy.
>
> TLDR: https://github.com/microsoft/terminal/pull/12181
>
> The full story:
>
> I dumped conhost.exe, opened the dump in windbg and looked at the stack
> trace of the hanging thread:
>
> ntdll!NtWaitForAlertByThreadId+0x14
> ntdll!RtlpWaitOnAddressWithTimeout+0x81
> ntdll!RtlpWaitOnAddress+0xae
> ntdll!RtlpWaitOnCriticalSection+0xfd
> ntdll!RtlpEnterCriticalSectionContended+0x1c4
> ntdll!RtlEnterCriticalSection+0x42
> conhost!Microsoft::Console::Render::Renderer::_PaintFrameForEngine+0x54
> conhost!Microsoft::Console::Render::Renderer::TriggerTeardown+0x19e60
> conhost!Microsoft::Console::Interactivity::ServiceLocator::RundownAndExit+0x21
> conhost!Microsoft::Console::PtySignalInputThread::_GetData+0x65
> conhost!Microsoft::Console::PtySignalInputThread::_InputThread+0x25
> kernel32!BaseThreadInitThunk+0x14
> ntdll!RtlUserThreadStart+0x21
>
> By looking at assembly, I've found that it hangs *after* ReadFile() on
> the pipe completes, so the problem is definitely not a leak of
> hWritePipe in bash.exe or elsewhere.
>
> Using the function names, I've found this issue:
> https://github.com/microsoft/terminal/issues/1810.
>
> This is a different one, but the discussion and the patch shows that
> synchronization on startup/shutdown is a disaster.
>
> Then I looked at the code and identified that hang happens while
> attempting to lock the console at [1]. After studying how this lock is
> used in other parts of the code, I noticed that
> PtySignalInputThread::_Shutdown() (which is further up in the call stack
> of the hanging function) uses ProcessCtrlEvents() incorrectly, because
> the latter unconditionally unlocks the console, but the lock is never
> taken by this thread at this point. Then I looked at a more recent
> version of the code and discovered the patch to _Shutdown() which I
> referenced above.
>
> I've also verified that assembly of _Shutdown() (which is inlined into
> PtySignalInputThread::_GetData()) corresponds to the unpatched version
> (i.e. without LockConsole() call):
>
> call conhost!CloseConsoleProcessState (00007ff6`22e7013c)
> call conhost!ProcessCtrlEvents (00007ff6`22e262a0)
> mov ecx,6Dh
> call
> conhost!Microsoft::Console::Interactivity::ServiceLocator::RundownAndExit
> (00007ff6`22e3c730)
>
> I'm not sure why this bug is not triggered more frequently, but one
> possible reason, as indicated by comment [2], is that the bad path is
> only taken if there are live clients after ClosePseudoConsole() is
> called, which is probably rare.
>
> A potential workaround on Cygwin side would be to ensure that the
> pseudoconsole doesn't have clients before calling ClosePseudoConsole(),
> but I don't know whether it's possible.
I am not sure yet what is essential, but the current code closes
pseudo console only if there is no other process which is attaching
to the pseudo console. I wonder why javac.exe is remaining as
zombie. The parent bash.exe calls ColosePseudoConsole() when
child non-cygwin app is terminated, i.e., after WaitForSingleObject()
for child process handle returns.
https://www.cygwin.com/git/?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/spawn.cc;h=81dba5a941e919ea2514013069aef22c6fad8004;hb=7ac0767053e278f0ce9811bf6f77278bd2f49c20#l1009
What does the "zombie" mean? Is it listed in the process list of
ProcessHacker? I still suspect that the zombie javac.exe holds
the hWritePipe handle leaked from parent bash.exe.
> [1]
> https://github.com/microsoft/terminal/blob/9b92986b49bed8cc41fde4d6ef080921c41e6d9e/src/renderer/base/renderer.cpp#L75
>
> [2]
> https://github.com/microsoft/terminal/blob/9b92986b49bed8cc41fde4d6ef080921c41e6d9e/src/host/PtySignalInputThread.cpp#L205
--
Takashi Yano <takashi.yano@nifty.ne.jp>
next prev parent reply other threads:[~2022-04-16 9:39 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-07 21:53 Alexey Izbyshev
2022-04-07 23:54 ` Brian Inglis
2022-04-08 8:42 ` Alexey Izbyshev
2022-04-08 17:04 ` Brian Inglis
2022-04-11 13:27 ` Alexey Izbyshev
2022-04-09 10:17 ` Takashi Yano
2022-04-09 11:00 ` Alexey Izbyshev
2022-04-09 11:02 ` Alexey Izbyshev
2022-04-09 11:46 ` Takashi Yano
2022-04-09 16:07 ` Alexey Izbyshev
2022-04-09 16:57 ` Takashi Yano
2022-04-09 17:23 ` Alexey Izbyshev
2022-04-09 17:54 ` Takashi Yano
2022-04-09 19:35 ` Alexey Izbyshev
2022-04-09 20:26 ` Alexey Izbyshev
2022-04-10 7:34 ` Takashi Yano
2022-04-10 12:13 ` Alexey Izbyshev
2022-04-10 20:49 ` Alexey Izbyshev
2022-04-11 8:35 ` Takashi Yano
2022-04-11 10:10 ` Alexey Izbyshev
2022-04-13 16:48 ` Alexey Izbyshev
2022-04-13 17:22 ` Takashi Yano
2022-04-13 17:27 ` Alexey Izbyshev
2022-04-13 23:17 ` Alexey Izbyshev
2022-04-16 9:39 ` Takashi Yano [this message]
2022-04-16 13:21 ` Alexey Izbyshev
2022-04-27 11:22 ` Takashi Yano
2022-04-27 12:19 ` Alexey Izbyshev
2022-04-11 5:23 ` Jeremy Drake
2022-04-11 8:36 ` Takashi Yano
2022-04-11 15:28 ` Alexey Izbyshev
2022-04-11 17:02 ` Jeremy Drake
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220416183910.b532b2cc95725b508bfd0991@nifty.ne.jp \
--to=takashi.yano@nifty.ne.jp \
--cc=cygwin@cygwin.com \
--cc=izbyshev@ispras.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).