public inbox for cygwin@cygwin.com
* Intermittent failures retrieving process exit codes
From: Tom Honermann @ 2012-12-07 19:55 UTC
  To: cygwin

I've witnessed intermittent failures in multiple build systems while 
working at multiple companies using Cygwin bash and make as part of the 
build system but using non-Cygwin compilers and other tools.  The 
intermittent failures occur when a process appears to complete 
successfully, but the process retrieving its exit code receives an 
unexpected value.  This has been seen on many different Cygwin versions 
across several years.

Several reports of similar-sounding issues can be found online:
- http://cygwin.1069669.n5.nabble.com/Cygwin-1-7-x-on-Windows-7-Exit-statuses-of-Win32-executables-are-sometimes-wrong-td20186.html
- http://stackoverflow.com/questions/9769256/intermittent-failures-under-cygwin-possibly-related-to-candle-and-or-make

I recently was able to produce a very small test case that reproduces 
this issue reliably on some machines:

$ cat test.sh
#!/bin/sh

while [ 1 ]; do
   echo "test..."
   if cmd /c "false"; then
     echo "exiting..."
     exit 1
   fi
done

An invocation of test.sh should run indefinitely, but fails very quickly 
on one of my machines:

$ ./test.sh
test...
test...
exiting...

$ ./test.sh
test...
test...
test...
test...
exiting...

$ ./test.sh
test...
exiting...
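
A minimal sketch of an instrumented variant of the loop above, for anyone
who wants to see which value is actually received rather than only that
the check failed.  The function name check_status and its arguments are
hypothetical and not part of the original test case; it runs a command a
fixed number of times, logs every unexpected exit status to stderr, and
prints the number of mismatches.

```shell
#!/bin/sh
# Hypothetical helper (not from the original test case).
# usage: check_status EXPECTED ITERATIONS COMMAND [ARGS...]
# Runs COMMAND ITERATIONS times and compares each exit status with EXPECTED.
check_status() {
    expected=$1
    iterations=$2
    shift 2
    i=0
    mismatches=0
    while [ "$i" -lt "$iterations" ]; do
        i=$((i + 1))
        # Discard the command's stdout so only the count is printed below.
        "$@" >/dev/null
        status=$?
        if [ "$status" -ne "$expected" ]; then
            mismatches=$((mismatches + 1))
            echo "iteration $i: expected $expected, got $status" >&2
        fi
    done
    # Print the total number of unexpected exit statuses.
    echo "$mismatches"
}
```

On an affected machine, something like 'check_status 1 1000 cmd /c "false"'
would be expected to print a non-zero count, with the unexpected values
logged on stderr.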

There are several high-level possibilities for what is going wrong:

1) cmd.exe is failing to retrieve the correct exit code for the 
invocation of false.exe (a Cygwin process).

2) cmd.exe is failing to return the (correct) exit code it received for 
the invocation of false.exe.

3) bash.exe (a Cygwin process) is failing to retrieve the correct exit 
code for the invocation of cmd.exe.

It is possible that other software installed on the machines where I've 
witnessed this is contributing to the problem (à la 
http://cygwin.com/faq/faq.using.html#faq.using.bloda).  If so, such 
software would be a contributing factor to one of the explanations 
above, but that would not necessarily rule out a defect in Cygwin (or 
in CreateProcess, WaitForSingleObject, or GetExitCodeProcess).  I have 
not yet seen a similar case that does not involve Cygwin, so at present 
I suspect a defect in Cygwin, but possibly one that produces no 
negative symptoms in isolation.

I've reproduced this issue with both the 32-bit and 64-bit versions of 
cmd.exe.  I've also reproduced it by replacing cmd.exe with a C program 
that itself calls CreateProcess for Cygwin's false.exe.  The issue 
reproduces whether that C program is compiled with Cygwin gcc, MinGW 
gcc (32-bit and 64-bit), or MSVC (32-bit and 64-bit).  So, substitute 
what you like for 'cmd.exe' in the above.

Likewise, I've reproduced this issue by replacing false.exe in the test 
above with a custom myfalse.exe (a C program that just returns 1).  The 
issue reproduces whether myfalse.exe is compiled with Cygwin gcc, MinGW 
gcc (32-bit and 64-bit), or MSVC (32-bit and 64-bit).  So, substitute 
what you like for 'false.exe' in the above.

I am not able to reproduce the problem if I elide the invocation of 
false.exe (i.e., if the cmd.exe invocation is 'cmd /c "exit /B 1"' or if 
my replacement for cmd.exe just returns 1).
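
The two shapes compared above, an intermediate process that launches a
child and passes on its exit code versus one that simply returns a code
itself, can be sketched in portable sh.  This is only an illustrative
analog: the intermediates actually tested were cmd.exe and a C program
calling CreateProcess, whereas here a nested sh stands in for them, and
the function names forward_status and return_directly are hypothetical.

```shell
#!/bin/sh
# Illustrative analog only: a nested sh plays the role of the intermediate
# process (cmd.exe or its C replacement in the report).

# forward_status COMMAND [ARGS...]
# The intermediate shell launches COMMAND as a child, then exits with the
# child's status (the shape that reproduces the problem in the report).
forward_status() {
    sh -c '"$@"; exit $?' forward "$@"
}

# return_directly CODE
# The intermediate shell exits with CODE without launching any child
# (the shape that does not reproduce the problem in the report).
return_directly() {
    sh -c 'exit "$1"' direct "$1"
}
```

Both 'forward_status false' and 'return_directly 1' should yield an exit
status of 1; the difference is only whether a second process exits just
before the intermediate one does.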

The problem feels like a race condition in retrieving process exit 
codes.  Further, it seems that it may only occur when two related 
processes exit in quick succession.

I've been granted several weeks in the near future to work exclusively 
on this issue.  Before I start working on it though, I'd like to hear 
from other community members who have experienced this and tried to 
debug it.  What is and is not known about the issue?  What workarounds 
have been tried (especially any that were found to be successful)?  Are 
there specific parts of the Cygwin (or bash) code that you recommend 
starting with?

The machine that I've been running the above script on is 64-bit Windows 
7 Professional SP1 running under VMware Workstation 8 which is running 
on Kubuntu 12.04.

Relevant parts of 'cygcheck -s' are:

Windows 7 Professional N Ver 6.1 Build 7601 Service Pack 1

Running under WOW64 on AMD64

     Cygwin DLL version info:
         DLL version: 1.7.16
         DLL epoch: 19
         DLL old termios: 5
         DLL malloc env: 28
         Cygwin conv: 181
         API major: 0
         API minor: 262
         Shared data: 5
         DLL identifier: cygwin1
         Mount registry: 3
         Cygwin registry name: Cygwin
         Program options name: Program Options
         Installations name: Installations
         Cygdrive default prefix:
         Build date:
         Shared id: cygwin1S5


Potential app conflicts:

ByteMobile laptop optimization client.

No Cygwin services found.

Cygwin Package Information
Package                    Version              Status
bash                       4.1.10-4             OK
cygwin                     1.7.16-1             OK


Tom.


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


