From: "E. Madison Bray" <erik.m.bray@gmail.com>
To: cygwin@cygwin.com
Subject: Re: Problem with zombie processes
Date: Thu, 04 Apr 2019 12:44:00 -0000
Message-ID: <CAOTD34YYHE6qHhHFwwXq1VAJ6ME4oyZMsa=BbQx8txsH4p3puA@mail.gmail.com>
In-Reply-To: <CAOTD34YqZMD=e-U=r56bys7GfzHKYwjVUnjkQpngE+Y9nAL+EA@mail.gmail.com>
On Tue, Feb 21, 2017 at 12:58 PM Erik Bray wrote:
>
> On Mon, Feb 20, 2017 at 11:54 PM, Mark Geisert wrote:
> > Erik Bray wrote:
> >>
> >> On Mon, Feb 20, 2017 at 11:54 AM, Mark Geisert wrote:
> >>>>
> >>>> So my guess was that Cygwin might try to hold on to a handle to a
> >>>> child process at least until it's been explicitly wait()ed. But that
> >>>> does not seem to be the case after all.
> >>>
> >>>
> >>>
> >>> You might have missed a subtlety in what I said above. The Python
> >>> interpreter itself is calling wait4() to reap your child process. Cygwin
> >>> has told Python one of its children has died. You won't get the chance
> >>> to wait() for it yourself. Cygwin *does* have a handle to the process,
> >>> but it gets closed as part of Python calling wait4().
> >>
> >>
> >> To be clear, wait4() is not called from Python until the script
> >> explicitly calls p.wait().
> >> In other words, when I run this step by step (e.g. in gdb) I don't see a
> >> wait4() call until the point where the script explicitly calls wait(). I
> >> don't see any reason Python would do this behind the scenes.
> >
> >
> > You're right. I missed the wait in your script and ASSumed too much of the
> > Python interpreter :-( .
> >
> >
> >>>> Anyways, I think it would be nicer if /proc returned at least partial
> >>>> information on zombie processes, rather than an error. I have a patch
> >>>> to this effect for /proc/<pid>/stat, and will add a few more as well.
> >>>> To me /proc/<pid>/stat was the most important because that's the
> >>>> easiest way to check the process's state in the first place! Now I
> >>>> have to catch EINVAL as well and assume that means a zombie
> >>>> process.
> >>>
> >>>
> >>>
> >>> The file /proc/<pid>/stat is there until Cygwin finishes cleanup of the
> >>> child due to Python having wait()ed for it. When you run your test
> >>> script, pay attention to the process state character in those cases
> >>> where you successfully read the stat file. It's often S (sleeping, I
> >>> think) or R (running) but I also see Z (zombie) sometimes. Your script
> >>> is in a race with Cygwin, and you cannot guarantee you'll see a killed
> >>> process's state before Cygwin cleans it up.
> >>>
> >>> One way around this *might* be to install a SIGCHLD handler in your
> >>> Python script. If that's possible, that should tell you when your child
> >>> exits.
> >>
> >>
> >> Perhaps the Python script is a red herring. I just wrote it to
> >> demonstrate the problem. The difference in behavior depending on where
> >> I send stdout is strange, but you're likely right that it just comes
> >> down to subtle timing differences. Here's a C program that demonstrates
> >> the same issue more reliably. Interestingly, it works when I run it in
> >> strace (probably just because of the strace overhead) but not when I
> >> run it normally.
> >>
> >> My point in all this is I'm confused why Cygwin would give up its
> >> handles to the Windows process before wait() has been called.
> >>
> >> (In fact, it's pretty confusing to have fopen returning EINVAL which
> >> according to [1] it should only be doing if the mode string were
> >> invalid.)
> >>
> >> Thanks,
> >> Erik
> >>
> >> [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/fopen.html
> >
> >
> > O.K., you may be on to something amiss in the Cygwin DLL. Thanks for the
> > STC in C; that'll help somebody looking further at this. I'm out of ideas.
> > It might be possible to reduce strace overhead somewhat by selecting a
> > smaller set of trace options than the default.
>
> Note: My previous test program had a bug in do_child() (not correctly
> terminating the argv array). The attached program fixes the bug.
> I've also attached a (truncated) strace log from it.

With apologies for reviving a 2-year-old thread: I've finally gotten
back to working on my port of psutil [1]. I was getting some
confusing errors reading the /proc/[pid]/stat files of recently
created processes that had quickly become zombies. I had completely
forgotten about this issue until I saw that trying to read the stat
file was resulting in EINVAL ("invalid argument"), and something about
that rang a bell.

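
For anyone hitting the same thing, here is a minimal sketch of the workaround described earlier in the thread: read the state field from the stat file and treat EINVAL as "probably a zombie". The function names and the proc_root parameter are hypothetical, added only for illustration and testing. The EINVAL-means-zombie mapping is an assumption based on the behavior reported in this thread, not something Cygwin documents.

```python
import errno
import os


def parse_stat_state(stat_text):
    """Return the one-character state field from the text of a stat file."""
    # Layout is "pid (comm) state ...". The comm field may itself contain
    # spaces or parentheses, so split after the last ')'.
    after_comm = stat_text.rsplit(")", 1)[1]
    return after_comm.split()[0]


def read_proc_state(pid, proc_root="/proc"):
    """Return the state character for pid, or None if the process is gone.

    ASSUMPTION: an EINVAL from open()/read() is interpreted as a zombie,
    following the behavior reported in this thread.
    """
    path = os.path.join(proc_root, str(pid), "stat")
    try:
        with open(path) as f:
            return parse_stat_state(f.read())
    except FileNotFoundError:
        # No such process (already reaped, or it never existed).
        return None
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            return "Z"
        raise
```

Splitting after the last ')' rather than on whitespace matters because a process name can contain spaces.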
So, I can confirm that this is still an issue. Apparently I wrote
that I had a patch to Cygwin for this. I have no idea where that
patch is but I'll look for it, or try to reproduce it. I think the
idea for the patch was to at least make a zombie process's stat file
readable so that the status flag ("Z") can be read, and maybe fill the
remaining fields with 0.
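
To make the idea concrete, here is a rough sketch of what such a zero-filled stat line could look like from the reader's side. This is not the patch itself (that would be C code inside the Cygwin DLL). The helper names and the n_fields parameter are hypothetical, and the real number of fields in Cygwin's stat file is not something this sketch claims to know.

```python
def zombie_stat_line(pid, comm, n_fields):
    """Build a stat line for a zombie: pid, (comm), 'Z', then zeros.

    ASSUMPTION: n_fields is the total field count of the stat file; the
    caller supplies it, since the real count is not specified here.
    """
    if n_fields < 3:
        raise ValueError("a stat line needs at least pid, comm and state")
    zeros = ["0"] * (n_fields - 3)
    return " ".join([str(pid), "(%s)" % comm, "Z"] + zeros)


def state_of(stat_line):
    """Extract the state character, splitting after the last ')'."""
    return stat_line.rsplit(")", 1)[1].split()[0]


if __name__ == "__main__":
    line = zombie_stat_line(1234, "sleep", 10)
    print(line)
    print(state_of(line))
```

With a line like this, a reader such as psutil could get "Z" through its normal parsing path instead of special-casing EINVAL.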

Once I find and/or reproduce that patch I'll submit it to cygwin-patches.

[1] https://psutil.readthedocs.io/en/latest/

--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple