From: Qian Hong
Reply-To: fracting@gmail.com
To: cygwin
Date: Mon, 07 Sep 2015 11:03:00 -0000
Subject: Re: [OT] Wine + Cygwin: `script -e` exit status forwarding randomly return zero for non zero child process
In-Reply-To: <20150903105549.GT23669@calimero.vinschen.de>
References: <20150903105549.GT23669@calimero.vinschen.de>

Hi Corinna,

Sorry for the delay; our progress on this issue is slow. Many thanks for
your detailed information, it does help us get closer to the cause.

On Thu, Sep 3, 2015 at 6:55 PM, Corinna Vinschen wrote:
> Comparing the straces, two interesting facts are conspicuous:
> - After the forked script returned, what happens in the parent is absolutely
>   identical in both cases, up to a point during exit.
>   This point is reached
>   with line 1573 in the "bad" case and line 1580 in the "good" case.  Then
>   "something" happens:
>

Thanks for the analysis. It helped us realize that this must be a very
low-level issue, at a level I had not suspected.

> In the bad case the pty master thread gets an error condition returned
> from DisconnectNamedPipe:
>
>   305 2754169 [ptym] script 25 fhandler_pty_master::pty_master_thread:
>   DisconnectNamedPipe, Win32 error 6
>

Good catch. I tested again and again, and found that the
DisconnectNamedPipe error doesn't happen in every bad log, so it is not
related to this issue. It is interesting, though: it certainly indicates
that something else is wrong, which is on my todo list.

> To me this looks like the "bad" script has been exited forcefully
> for some reason.
>

Yeah, this is the hard part. Sebastian has made some progress on it; I
quote his note here:

=== quote ===
Cygwin tries to forcibly kill a thread, the implementation for that is
available here:
https://github.com/Alexpux/Cygwin/blob/79511853f788111efd975651f87eabbd4a8cbf6d/winsup/cygwin/cygthread.cc#L296

Excerpt from the log with annotations:

--- snip ---
0027:Call KERNEL32.TerminateThread(00000168,00000000) ret=610055ca
0027: terminate_thread( handle=0168, exit_code=0 )
0027: terminate_thread() = 0 { self=0, last=0 }
0027:Ret KERNEL32.TerminateThread() retval=00000001 ret=610055ca
// handle 00000168 corresponds to thread 0x0029
0027:Call KERNEL32.WaitForSingleObject(00000168,ffffffff) ret=610055e0
0027: select( flags=2, cookie=0060ba6c, timeout=infinite, prev_apc=0000, result={}, data={WAIT,handles={0168}} )
0027: select() = 0 { timeout=infinite, call={APC_NONE}, apc_handle=0000 }
0027:Ret KERNEL32.WaitForSingleObject() retval=00000000 ret=610055e0
// WaitForSingleObject doesn't block, so cygwin assumes the thread is gone
[...]
0027:Call KERNEL32.VirtualFree(00a10000,00000000,00008000) ret=61005767
// Cygwin tries to release the thread stack
0029:Ret KERNEL32.SetEvent() retval=00000001 ret=61005415
0029:Call KERNEL32.WaitForSingleObject(00000170,ffffffff) ret=6100543a
0029: select( flags=2, cookie=00a0b42c, timeout=infinite, prev_apc=0000, result={}, data={WAIT,handles={0170}} )
0029: select() = PENDING { timeout=infinite, call={APC_NONE}, apc_handle=0000 }
// Crash. There is no core dump, so probably it crashed somewhere inside of the pthread implementation.
0027: *killed* exit_code=0
0027: *sent signal* signal=3
0028: *killed* exit_code=0
0029: *killed* exit_code=0
002e: *killed* exit_code=0
0026: *process killed*
--- snip ---

As a workaround NtFreeVirtualMemory can be replaced with a no-op
implementation returning STATUS_SUCCESS.
=== quote ===

> There's something else which occurred to me while looking through both
> straces: Are you aware that Windows PIDs are *always* multiples of 4?
>
>   PID 0, 4, 8, 12, 16, ... exist
>   PID 1, 2, 3, 5, 6, 7, ... don't.
>
> Wine apparently doesn't follow this scheme.  I would treat that as a bug.
> I can easily imagine applications which rely on the fact that PIDs are
> always multiples of four and use the lower two bits for dubious purposes.
> I'd suggest to change that in Wine for compatibility reasons.

Good catch, I hadn't noticed the number magic, many thanks. I will add
this one to my todo list as well.

It will take some more time until we completely fix this bug; we'll
update again once we finish it.

Thank you again for your great help, and enjoy your vacation!

(Off-topic: vacation at Oktoberfest? :D http://www.oktoberfest.de/en/ )

-- 
Regards,
Qian Hong

-
http://www.winehq.org

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
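
To illustrate the PID point quoted above: a minimal sketch of how an application might "use the lower two bits" of a PID, and why that breaks when PIDs are not multiples of 4. Everything here is hypothetical (the names TAG_MASK, pack_tag and unpack are made up for this example and do not come from Windows, Wine or Cygwin); it only shows the arithmetic, not any real application.

```python
# Hypothetical illustration only: these helpers are not part of any real API.
TAG_MASK = 0x3  # the two low bits, which are zero for any multiple of 4


def is_multiple_of_four(pid):
    """Return True if the PID follows the multiples-of-4 scheme."""
    return (pid & TAG_MASK) == 0


def pack_tag(pid, tag):
    """Store a 2-bit tag in the low bits of a PID assumed to be a multiple of 4."""
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag must fit in two bits")
    return pid | tag


def unpack(value):
    """Split a packed value back into (pid, tag)."""
    return value & ~TAG_MASK, value & TAG_MASK


# A PID that is a multiple of 4 survives the round trip.
assert unpack(pack_tag(1204, 2)) == (1204, 2)

# A PID that is not a multiple of 4 comes back as a different PID and tag.
assert unpack(pack_tag(1205, 2)) == (1204, 3)
```

With PID 1205 the application would end up addressing PID 1204 with the wrong tag, without any error being raised, which is the kind of silent incompatibility the suggestion is meant to avoid.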