From: Tom Honermann <thonermann@coverity.com>
To: <cygwin@cygwin.com>
Subject: Re: Intermittent failures retrieving process exit codes
Date: Fri, 21 Dec 2012 06:30:00 -0000
Message-ID: <50D401EF.9040705@coverity.com>
In-Reply-To: <50C276AC.9090301@mailme.ath.cx>
I spent most of the week debugging this issue. This appears to be a
defect in Windows. I can reproduce the issue without Cygwin. I can't
rule out other third party kernel mode software possibly contributing to
the issue. A simple change to Cygwin works around the problem for me.
I don't know which Windows releases are affected by this. I've only
reproduced the problem (outside of Cygwin) with Wow64 processes running
on 64-bit Windows 7. I haven't yet tried elsewhere.
The problem appears to be a race condition involving concurrent calls to
TerminateProcess() and ExitThread(). The example code below minimally
mimics the threads created and exit process/thread calls that are
performed when running Cygwin's false.exe. The primary thread exits the
process via TerminateProcess(), as pinfo::exit() does in
winsup/cygwin/pinfo.cc. The secondary thread exits itself via
ExitThread(), as Cygwin's signal processing thread function, wait_sig(),
does in winsup/cygwin/sigproc.cc.
When the race condition results in the undesirable outcome, the exit
code for the process is set to the exit code for the secondary thread's
call to ExitThread(). I can only speculate at this point, but my guess
is that the TerminateProcess() code disassociates the calling thread
from the process before other threads are stopped such that
ExitThread(), concurrently running in another thread, may determine that
the calling thread is the last thread of the process and overwrite the
process exit code.
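
To make the guessed interleaving concrete, here is a small single-threaded toy model. It is only an illustration of the speculation above, not a description of how Windows actually implements TerminateProcess() or ExitThread(); ToyProcess, BeginTerminate, ThreadExits, RacyOrder and BenignOrder are hypothetical names invented for this sketch.

```cpp
#include <cstdint>

// Toy model of the guessed interleaving. It is NOT a description of the
// Windows kernel; every name here is hypothetical.
struct ToyProcess {
    int attached_threads = 2;     // primary + secondary
    std::uint32_t exit_code = 0;  // what a parent would later observe

    // Guessed first step of TerminateProcess(): record the code and detach
    // the caller while the other thread is still running.
    void BeginTerminate(std::uint32_t code) {
        exit_code = code;
        --attached_threads;
    }

    // Guessed ExitThread() behavior: a thread that believes it is the last
    // one attached publishes its own code as the process exit code.
    void ThreadExits(std::uint32_t code) {
        if (attached_threads == 1) {
            exit_code = code;
        }
        --attached_threads;
    }
};

// Secondary thread exits inside the window: its code overwrites the
// code passed to the terminate call.
std::uint32_t RacyOrder() {
    ToyProcess p;
    p.BeginTerminate(1);
    p.ThreadExits(2);
    return p.exit_code;
}

// Secondary thread exits before termination starts: the expected code
// survives.
std::uint32_t BenignOrder() {
    ToyProcess p;
    p.ThreadExits(2);
    p.BeginTerminate(1);
    return p.exit_code;
}
```

In this model RacyOrder() yields the secondary thread's code and BenignOrder() yields the code passed to the terminate step, which matches the two outcomes observed with the test case.
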
The issue also reproduces if ExitProcess() is called in place of
TerminateProcess(). The test case below only uses TerminateProcess()
because that is what Cygwin does.
Source code to reproduce the issue follows. Again, Cygwin is not
required to reproduce the problem. For my own testing, I compiled the
code using Microsoft's Visual Studio 2010 x86 compiler with the command
'cl /Fetest-exit-code.exe test-exit-code.cpp'
test-exit-code.cpp:
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

// Mimics Cygwin's wait_sig() thread: exits itself via ExitThread().
DWORD WINAPI SecondaryThread(
    LPVOID lpParameter)
{
    Sleep(1);
    ExitThread(2);
}

int main() {
    HANDLE hSecondaryThread = CreateThread(
        NULL,             // lpThreadAttributes
        0,                // dwStackSize
        SecondaryThread,  // lpStartAddress
        (LPVOID)0,        // lpParameter
        0,                // dwCreationFlags
        NULL);            // lpThreadId
    if (!hSecondaryThread) {
        fprintf(stderr, "CreateThread failed. GLE=%lu\n",
            (unsigned long)GetLastError());
        exit(127);
    }

    Sleep(1);

    // Mimics pinfo::exit(): the expected process exit code is 1.
    if (!TerminateProcess(GetCurrentProcess(), 1)) {
        fprintf(stderr, "TerminateProcess failed. GLE=%lu\n",
            (unsigned long)GetLastError());
        exit(127);
    }

    return 0;
}
To run the test, a simple .bat file is used:
test.bat:
@echo off
setlocal
:loop
echo test...
test-exit-code.exe
if %ERRORLEVEL% NEQ 1 (
    echo test-exit-code.exe returned %ERRORLEVEL%
    exit /B 1
)
goto loop
Absent this defect, test.bat would run indefinitely; it stops at the
first run that returns anything other than 1. The time it takes to fail
on my machine (64-bit Windows 7 running in a VMware Workstation 8 VM
under Kubuntu 12.04 on a Lenovo T420 Intel i7-2640M 2 processor laptop)
varies considerably. I had one run fail in less than 10 iterations, but
most of the time it has taken upwards of 5 minutes to get a failure.
The workaround I implemented within Cygwin was simple and sloppy. I
added a call to Sleep(1000) immediately before the call to ExitThread()
in wait_sig() in winsup/cygwin/sigproc.cc. Since this thread (probably)
doesn't exit until the process is exiting anyway, the call to Sleep()
does not adversely affect shutdown. The thread just gets terminated
while in the call to Sleep() instead of exiting before the process is
terminated or getting terminated while still in the call to
ExitThread(). A better solution might be to avoid the thread exiting at
all (so long as it can't get terminated while holding critical
resources), or to have the process exiting thread wait on it. Neither
of these is ideal. Orderly shutdown of multi-threaded processes is
really hard to do correctly on Windows.
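
The second alternative (the exiting thread waits for the helper thread) can be sketched with a portable analogy. This is an assumption-laden illustration, not Cygwin code: std::thread stands in for the Win32 thread, the function's return value stands in for the process exit code, and ShutDownInOrder and stop_requested are hypothetical names. On Windows the corresponding wait would be done on the thread handle, for example with WaitForSingleObject(), before calling TerminateProcess().

```cpp
#include <atomic>
#include <thread>

// Portable analogy only: std::thread stands in for the Win32 thread, and the
// returned value stands in for the process exit code.
int ShutDownInOrder(int exit_code) {
    std::atomic<bool> stop_requested{false};

    // Hypothetical helper thread, playing the role of wait_sig().
    std::thread helper([&stop_requested] {
        while (!stop_requested.load()) {
            std::this_thread::yield();
        }
    });

    stop_requested.store(true);  // ask the helper to finish
    helper.join();               // wait until it has fully exited

    // Only one thread is left, so no other thread can exit concurrently
    // and compete over the exit code.
    return exit_code;
}
```

The cost of this ordering is that the exiting thread can block for as long as the helper takes to notice the request, which is one reason it is not an ideal fix.
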
Since the exit code of the signal processing thread is not used, the
wait_sig() thread (and any other thread that could exit concurrently
with another thread) could exit with a special status value such as
STATUS_THREAD_IS_TERMINATING (0xC000004BL). That would make this issue
diagnosable: any process exit code matching that value would be a
likely indicator that the race was encountered.
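
A parent process or test harness could then check for that sentinel. The following is a minimal sketch under the assumption that helper threads exit with STATUS_THREAD_IS_TERMINATING; DescribeExitCode and kThreadIsTerminating are hypothetical names, and the constant is spelled out so the sketch builds without <windows.h>.

```cpp
#include <cstdint>
#include <string>

// Value of STATUS_THREAD_IS_TERMINATING, written out here so the sketch
// does not depend on the Windows headers.
constexpr std::uint32_t kThreadIsTerminating = 0xC000004Bu;

// Hypothetical helper: explain a process exit code, given the code the
// primary thread meant to report.
std::string DescribeExitCode(std::uint32_t actual, std::uint32_t expected) {
    if (actual == kThreadIsTerminating) {
        // Assumes helper threads were changed to exit with the sentinel.
        return "likely exit-code race (sentinel from a helper thread)";
    }
    if (actual == expected) {
        return "ok";
    }
    return "unexpected exit code";
}
```

The sentinel is checked first so that a raced exit is reported as such even when the caller does not know which code to expect.
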
As is, when this race condition results in the undesirable outcome,
since the signal processing thread exits with a status of 0, the exit
status of the process is 0. This explains why false.exe works so well
to reproduce the issue: its expected exit status is 1, so a 0 stands
out. It would be impossible to produce a negative test using true.exe,
whose expected exit status is already 0.
Tom.