On Sep 7 19:02, Qian Hong wrote:
> Hi Corinna,
>
> Sorry for the delay, our progress on this issue is slow.
>
> Many thanks for your great information, it does help us get closer
> to the reason.
>
> On Thu, Sep 3, 2015 at 6:55 PM, Corinna Vinschen wrote:
> > Comparing the straces, two interesting facts are conspicuous:
> > - After the forked script returned, what happens in the parent is absolutely
> >   identical in both cases, up to a point during exit. This point is reached
> >   with line 1573 in the "bad" case and line 1580 in the "good" case. Then
> >   "something" happens:
> >
>
> Thanks for the analysis, this does help us realize it should be a
> very low level issue, at a level I hadn't suspected.
> [...]
> > To me this looks like the "bad" script has been exited forcefully
> > for some reason.
> >
>
> Yeah, this is the hard part. Sebastian has made some progress on it, I
> quote his note here:
>
> === quote ===
>
> Cygwin tries to forcibly kill a thread, the implementation for that is
> available here:
> https://github.com/Alexpux/Cygwin/blob/79511853f788111efd975651f87eabbd4a8cbf6d/winsup/cygwin/cygthread.cc#L296

Guys, no offense meant, but I'd really appreciate it if you could peruse
and refer to the original upstream Cygwin repo at

  https://sourceware.org/git/?p=newlib-cygwin.git

This is what we're talking about in the first place, so I'd like to stick
to this, ok?

> Excerpt from the log with annotations:
>
> --- snip ---
> 0027:Call KERNEL32.TerminateThread(00000168,00000000) ret=610055ca
> 0027: terminate_thread( handle=0168, exit_code=0 )
> 0027: terminate_thread() = 0 { self=0, last=0 }
> 0027:Ret KERNEL32.TerminateThread() retval=00000001 ret=610055ca
> // handle 00000168 corresponds to thread 0x0029
>
> 0027:Call KERNEL32.WaitForSingleObject(00000168,ffffffff) ret=610055e0
> 0027: select( flags=2, cookie=0060ba6c, timeout=infinite,
> prev_apc=0000, result={}, data={WAIT,handles={0168}} )
> 0027: select() = 0 { timeout=infinite, call={APC_NONE}, apc_handle=0000 }
> 0027:Ret KERNEL32.WaitForSingleObject() retval=00000000 ret=610055e0
> // WaitForSingleObject doesn't block, so cygwin assumes the thread is gone
>
> [...]
>
> 0027:Call KERNEL32.VirtualFree(00a10000,00000000,00008000) ret=61005767
> // Cygwin tries to release the thread stack

Ah, ok. What OS does Wine emulate here? Have a look at

  https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/cygthread.cc;h=e48a73e545e7ca90884fc891f1b188e0ab3bf863;hb=HEAD#l316

The terminate_thread_frees_stack flag is set to false for XP/2003 and to
true for any newer OS. I guess this is a double-free: Wine's
TerminateThread already freed the stack, but Cygwin got the info that it's
supposedly running under XP/2003, so it tries to work around the fact that
TerminateThread on these systems doesn't free the stack by itself.

> As a workaround NtFreeVirtualMemory can be replaced with a no-op
> implementation returning STATUS_SUCCESS.

I don't think this is the right thing to do. It's a hack covering up the
problem that TerminateThread should not free the thread stack if we're
running an XP/2003 emulation.

If my assumption here is incorrect, we need to know what the address
ret=61005767 is referring to. addr2line would help.


Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
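
For illustration only, the double-free guess above can be written down as a
toy model. This is a sketch under stated assumptions, not Cygwin or Wine
code: ToyMemory, terminate_and_cleanup and both boolean parameters are
hypothetical names, and "double-free" is only the model's label for a second
release of the same region. What the second VirtualFree actually does under
Wine is not established in this thread.

```python
class ToyMemory:
    """Tracks reserved regions by base address (hypothetical helper)."""

    def __init__(self):
        self.regions = set()

    def reserve(self, base):
        self.regions.add(base)

    def free(self, base):
        # Returns False when the region was already released.
        if base not in self.regions:
            return False
        self.regions.remove(base)
        return True


def terminate_and_cleanup(platform_frees_stack, flag_frees_stack,
                          stack_base=0x00A10000):
    """Model one forced thread termination followed by the caller's cleanup.

    platform_frees_stack: assumption about what TerminateThread does on the
        platform that is really running (Windows version or Wine).
    flag_frees_stack: what the caller believes, standing in for the
        terminate_thread_frees_stack flag mentioned above.
    Returns "ok", "leak" or "double-free".
    """
    mem = ToyMemory()
    mem.reserve(stack_base)

    # Step 1: the platform terminates the thread and may release its stack.
    if platform_frees_stack:
        mem.free(stack_base)

    # Step 2: the caller releases the stack itself only when it believes
    # the platform did not.
    if not flag_frees_stack:
        if not mem.free(stack_base):
            return "double-free"

    return "leak" if stack_base in mem.regions else "ok"


if __name__ == "__main__":
    for platform in (False, True):
        for flag in (False, True):
            print(platform, flag, terminate_and_cleanup(platform, flag))
```

In this model only the mismatched case (the platform frees the stack while
the flag says it does not) yields "double-free", which is the combination
suspected for Wine reporting itself as XP/2003. The opposite mismatch leaks
the stack instead.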