public inbox for cygwin@cygwin.com
 help / color / mirror / Atom feed
* Re: Cygwin crashes in kill_pgrp, _pinfo truncation issue.
@ 2012-08-16 16:45 Andrey Khalyavin
  2012-08-16 18:44 ` Christopher Faylor
  0 siblings, 1 reply; 6+ messages in thread
From: Andrey Khalyavin @ 2012-08-16 16:45 UTC (permalink / raw)
  To: cygwin

On Wed, 15 Aug 2012 10:11:16 -0400, Christopher Faylor wrote:
>On Wed, Aug 15, 2012 at 04:54:42PM +0400, Andrey Khalyavin wrote:
>>I finally got a cygwin crash dump from our build bots. It shows, that
>>cygwin1.dll crashes in kill_pgrp function on line:
>>	  (pid > 1 && p->pgid != pid) ||
>>where p is a pointer to _pinfo. This function enumerates all _pinfo's
>>and executes this line for all of them which pass p->exists() check.
>>In crash dump p points to _pinfo that has process_state equal to
>>PID_IN_USE | PID_EXECED.
>
>Thanks for tracking this down.  I've added a check for "execed" to
>_pinfo::exists.
>
>cgf
I updated core libraries from 20120803 snapshot to 20120815 snapshot
and now bash crashes when I execute rm -rf dir. Reproducibility is
strange. It crashed for hours when I entered
cd /tmp
mkdir a
rm -rf a
commands but now suddenly stopped crashing in this case.
It is still crashes on rm -rf in the real script we use though.

Crash happens in setup_handler function on line
	  HANDLE hth = (HANDLE) *tls;
because tls->tid equals to zero. Definition of this operation is in
sygtls.h: operator HANDLE () const {return tid->win32_obj_id;}.
setup_handler is called from sigpacket::process which in turn
called from wait_sig. Signal number is 20, signal code is 28.
All fields of tls structure are zero with exception stacklock equal
to 1 and stackptr equal to address of tls->stack.

I don't understand what goes wrong here. Can you suggest what
I can look for in the debugger?

Andrey Khalyavin

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: Cygwin crashes in kill_pgrp, _pinfo truncation issue.
@ 2012-08-23 11:35 Andrey Khalyavin
  2012-09-17 11:03 ` Andrey Khalyavin
  0 siblings, 1 reply; 6+ messages in thread
From: Andrey Khalyavin @ 2012-08-23 11:35 UTC (permalink / raw)
  To: cygwin

On Wed, 15 Aug 2012 10:11:16 -0400, Christopher Faylor wrote:
>On Wed, Aug 15, 2012 at 04:54:42PM +0400, Andrey Khalyavin wrote:
>>I finally got a cygwin crash dump from our build bots. It shows, that
>>cygwin1.dll crashes in kill_pgrp function on line:
>>	  (pid > 1 && p->pgid != pid) ||
>>where p is a pointer to _pinfo. This function enumerates all _pinfo's
>>and executes this line for all of them which pass p->exists() check.
>>In crash dump p points to _pinfo that has process_state equal to
>>PID_IN_USE | PID_EXECED.
>
>Thanks for tracking this down.  I've added a check for "execed" to
>_pinfo::exists.
>
>cgf

I got two more crash dumps from our bots running 20120816 snapshot.
Both of them crashed in this place. process_state equals to
PID_IN_USE | PIE_EXECED and your new check in _pinfo::exist is really there.
So the race condition on process_state field does happens in practice.

Andrey Khalyavin

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Cygwin crashes in kill_pgrp, _pinfo truncation issue.
@ 2012-08-15 13:10 Andrey Khalyavin
  2012-08-15 20:51 ` Christopher Faylor
  0 siblings, 1 reply; 6+ messages in thread
From: Andrey Khalyavin @ 2012-08-15 13:10 UTC (permalink / raw)
  To: cygwin

I finally got a cygwin crash dump from our build bots. It shows, that
cygwin1.dll crashes in kill_pgrp function on line:
	  (pid > 1 && p->pgid != pid) ||
where p is a pointer to _pinfo. This function enumerates all _pinfo's
and executes this line for all of them which pass p->exists() check.
In crash dump p points to _pinfo that has process_state equal to
PID_IN_USE | PID_EXECED. As far as I understand, such _pinfo's
have smaller size and so accessing pgid field leads to crash.
I found a past thread that describes very similar problem:
http://cygwin.com/ml/cygwin/2010-09/msg00390.html

We obviously need to add a check for PID_EXECED in this function. But
I'm worried that we can still have a race condition where
external process can truncate the structure just after our check.

Some ideas how to fix this race condition (if it exists):
1. Use ReadProcessMemory for memory access.
2. Move command line to the end of the structure and extend
PINFO_REDIR_SIZE to include all important fields.
3. Make proper synchronization. Not sure how it should be done in this
case. Since this is a memory mapped file, one may need
to use functions like LockFile.

Andrey Khalyavin

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-09-17  9:00 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-16 16:45 Cygwin crashes in kill_pgrp, _pinfo truncation issue Andrey Khalyavin
2012-08-16 18:44 ` Christopher Faylor
  -- strict thread matches above, loose matches on Subject: below --
2012-08-23 11:35 Andrey Khalyavin
2012-09-17 11:03 ` Andrey Khalyavin
2012-08-15 13:10 Andrey Khalyavin
2012-08-15 20:51 ` Christopher Faylor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).