public inbox for cygwin@cygwin.com
 help / color / mirror / Atom feed
* Randomly hang when compiling Cygwin on Cygwin on Wine
@ 2015-10-28 13:28 Qian Hong
  2015-11-01 19:09 ` Qian Hong
  0 siblings, 1 reply; 4+ messages in thread
From: Qian Hong @ 2015-10-28 13:28 UTC (permalink / raw)
  To: cygwin

Dear folks,

I try to compile Cygwin on Cygwin on Wine in a loop, and found a
random hang. Usually, it happens more than 1/10.

The hang could happen in different processes, including `sh.exe`,
`gcc.exe`, `as.exe`, etc, but the backtraces are all similar.


Here is one example I get now:

(1) `pstree -p -l` suggest the hanging process might has PID 113:
?(1)─┬─bash(58)───make(93)───sh(89)───make(69)───sh(74)───sh(61)───make(9)───make(44)───sh(129)───sh(145)───make(94)───sh(130)───sh(104)───make(151)───sh(113)
     └─mintty(91)───bash(102)───pstree(141)

(2) `cat /proc/113/cmdline` show below information:

/bin/sh -c gcc -L/newlib-cygwin-all/newlib-cygwin-build/i686-pc-cygwin/winsup/cygwin
-isystem /newlib-cygwin-all/newlib-cygwin/winsup/cygwin/include
-B/newlib-cygwin-all/newlib-cygwin-build/i686-pc-cygwin/newlib/
-isystem /newlib-cygwin-all/newlib-cygwin-build/i686-pc-cygwin/newlib/targ-include
-isystem /newlib-cygwin-all/newlib-cygwin/newlib/libc/include
-I/newlib-cygwin-all/newlib-cygwin/winsup/cygwin/include
-DPACKAGE_NAME=\"newlib\" -DPACKAGE_TARNAME=\"newlib\"
-DPACKAGE_VERSION=\"2.2.0\" -DPACKAGE_STRING=\"newlib\ 2.2.0\"
-DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -I.
-I/newlib-cygwin-all/newlib-cygwin/newlib/libc/xdr -DHAVE_OPENDIR
-DHAVE_RENAME -DSIGNAL_PROVIDED -D_COMPILING_NEWLIB -DHAVE_BLKSIZE
-DHAVE_FCNTL -DMALLOC_PROVIDED -fno-builtin      -g -O2 -c -o
lib_a-xdr_float.o `test -f 'xdr_float.c' || echo
'/newlib-cygwin-all/newlib-cygwin/newlib/libc/xdr/'`xdr_float.c


(3) I cd to the build target directory and list all files:
/newlib-cygwin-all/newlib-cygwin-build/i686-pc-cygwin/newlib/libc/xdr$ ls
lib_a-dummy.o  lib_a-xdr.o  lib_a-xdr_private.o  Makefile

The above list result suggest that the build target
"lib_a-xdr_float.o" hasn't been generated, which suggested that the
`gcc` process in the above command hasn't run.

(4) Then I check the file descriptors opened by process 113:
$ file /proc/113/fd/*
/proc/113/fd/0: broken symbolic link to /dev/cons0
/proc/113/fd/1: broken symbolic link to /dev/cons0
/proc/113/fd/2: broken symbolic link to /dev/cons0
/proc/113/fd/3: symbolic link to pipe:[296]

(5) I use gdb to attach to the process 113:

[New Thread 64.0x3c]
[New Thread 64.0x4c]
[New Thread 64.0x51]
Reading symbols from /usr/bin/sh.exe...Reading symbols from
/usr/lib/debug//usr/bin/sh.exe.dbg...done.
done.

0xb77a0be6 in ?? ()
(gdb) bt
#0  0xb77a0be6 in __kernel_vsyscall+0xe in [vdso].so
#1  0x7bc82ea5 in server_select+0x315
[/home/fracting/src/wine-patched-auto/dlls/ntdll/server.c:642] in
ntdll
#2  0x7bc8ade9 in wait_objects+0x89
[/home/fracting/src/wine-patched-auto/dlls/ntdll/sync.c:1034] in ntdll
#3  0x7bc8d5e9  in __syscall_NtWaitForMultipleObjects+0x29
[/home/fracting/src/wine-patched-auto/dlls/ntdll/sync.c:1047] in ntdll
#4  0x7b87bbe5 in WaitForMultipleObjectsEx+0x1b5
[/home/fracting/src/wine-patched-auto/dlls/kernel32/sync.c:186] in
kernel32
#5  0x7b87ba21 in WaitForMultipleObjects+0x31
[/home/fracting/src/wine-patched-auto/dlls/kernel32/sync.c:146] in
kernel32
#6  0x61006354 in cygwait(void*, _LARGE_INTEGER*, unsigned int)@12
(object=<optimized out>, timeout=0x7bc4861c, mask=2076474397)
    at /usr/src/debug/cygwin-2.2.1-1/winsup/cygwin/cygwait.cc:76
#7  0x6130a188 in ?? () from /usr/bin/cygwin1.dll
#8  0x00000000 in ?? ()

(gdb) thread 2
[Switching to thread 2 (Thread 64.0x4c)]
#0  0xb77a0be6 in ?? ()
(gdb) bt
#0  0xb77a0be6 in __kernel_vsyscall+0xe in [vdso].so
#1  0x7bc4aadc in __syscall_NtReadFile+0x41c
[/media/workspace/src/wine-patched-auto/dlls/ntdll/file.c:1104] in
ntdll
#2  0x7b83f63d in ReadFile+0x1c6
[/home/fracting/src/wine-patched-auto/dlls/kernel32/file.c:448] in
kernel32
#3  0x610efbae in wait_sig () at
/usr/src/debug/cygwin-2.2.1-1/winsup/cygwin/sigproc.cc:1252
#4  0x611b6c40 in cygthread::main_thread_id () from /usr/bin/cygwin1.dll


(gdb) thread 3
[Switching to thread 3 (Thread 64.0x51)]
#0  0xb77a0be6 in ?? ()
(gdb) bt
#0  0xb77a0be6 in __kernel_vsyscall+0xe in [vdso].so
#1  0x7bc82ea5 in server_select+0x315
[/home/fracting/src/wine-patched-auto/dlls/ntdll/server.c:642] in
ntdll
#2  0x7bc8ade9 in wait_objects+0x89
[/home/fracting/src/wine-patched-auto/dlls/ntdll/sync.c:1034] in ntdll
#3  0x7bc8d5e9 in  __syscall_NtWaitForMultipleObjects+0x29
[/home/fracting/src/wine-patched-auto/dlls/ntdll/sync.c:1047] in ntdll
#4  0x7b87bbe5 in WaitForMultipleObjectsEx+0x1b5
[/home/fracting/src/wine-patched-auto/dlls/kernel32/sync.c:186] in
kernel32
#5  0x7b87b9a3 in WaitForSingleObject+0x2e
[/home/fracting/src/wine-patched-auto/dlls/kernel32/sync.c:126] in
kernel32
#6  0x6100548a in cygthread::stub(void*)@4 (arg=0xacce64) at
/usr/src/debug/cygwin-2.2.1-1/winsup/cygwin/cygthread.cc:114
#7  0x61005400 in cygthread::release(bool) () at
/usr/src/debug/cygwin-2.2.1-1/winsup/cygwin/cygthread.cc:292
#8  0x00acce64 in ?? ()
#9  0x00000051 in ?? ()

Here are the version information I have:

$ uname -a
CYGWIN_NT-5.1  2.2.1(0.289/5/3) 2015-08-20 11:40 i686 Cygwin

$ sh --version
GNU bash, version 4.3.42(4)-release (i686-pc-cygwin)

$ wine --version
wine-1.7.53-859-g6ec39dd (Staging)

This bug also happens with latest git version of Cygwin.


It might be either Wine bug or Cygwin bug, but currently I can't go
further, since:
- I can't reproduce this bug with strace.exe
- I also can't reproduce this bug with Wine's relay debug trace, which
records system API calls.

Could you suggest any hints where should I look deeper?

Thanks!

-- 
Regards,
Qian Hong

-
http://www.winehq.org

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Randomly hang when compiling Cygwin on Cygwin on Wine
  2015-10-28 13:28 Randomly hang when compiling Cygwin on Cygwin on Wine Qian Hong
@ 2015-11-01 19:09 ` Qian Hong
  2015-11-02 14:30   ` Corinna Vinschen
  0 siblings, 1 reply; 4+ messages in thread
From: Qian Hong @ 2015-11-01 19:09 UTC (permalink / raw)
  To: cygwin

Update:

I have another case of hang:

?(1)-+-bash(1419290)
     |-bash(1419318)---make(1423770)---sh(1423783)---sh(1423838)
     |-mintty(1249269)---bash(1249277)
     |-mintty(47)---bash(55)---bash(1132827)---make(1139169)---sh(1139182)---make(1139234)---sh(1206695)---sh(1206721)---make(1206764)---sh(1206770)---make(1206783)---perl(1249042)
     |-mintty(596773)---bash(596790)---pstree(1423865)
     `-mintty(1417449)---bash(1417457)


$ ps aux | grep 1249042
  1249042 1206783 1132827    1249050  pty0      197608   Oct 31 /usr/bin/perl
  1249077 1249042 1132827    1249092  pty0      197608   Oct 31
/usr/bin/nm <defunct>



Here the process perl(1249042) is waiting for a pipe, while its child
process nm(1249077) is in the status "defunct". nm(1249077) appears in
output of `ps aux`, but doesn't appear in output of `pstree -p -l`.

Is there any case, where a child process can exit unexpected, but the
parent process still incorrectly wait for the child process forever?

Thanks!

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Randomly hang when compiling Cygwin on Cygwin on Wine
  2015-11-01 19:09 ` Qian Hong
@ 2015-11-02 14:30   ` Corinna Vinschen
  2016-03-15 14:10     ` Qian Hong
  0 siblings, 1 reply; 4+ messages in thread
From: Corinna Vinschen @ 2015-11-02 14:30 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 1669 bytes --]

On Nov  2 03:08, Qian Hong wrote:
> Update:
> 
> I have another case of hang:
> 
> ?(1)-+-bash(1419290)
>      |-bash(1419318)---make(1423770)---sh(1423783)---sh(1423838)
>      |-mintty(1249269)---bash(1249277)
>      |-mintty(47)---bash(55)---bash(1132827)---make(1139169)---sh(1139182)---make(1139234)---sh(1206695)---sh(1206721)---make(1206764)---sh(1206770)---make(1206783)---perl(1249042)
>      |-mintty(596773)---bash(596790)---pstree(1423865)
>      `-mintty(1417449)---bash(1417457)
> 
> 
> $ ps aux | grep 1249042
>   1249042 1206783 1132827    1249050  pty0      197608   Oct 31 /usr/bin/perl
>   1249077 1249042 1132827    1249092  pty0      197608   Oct 31
> /usr/bin/nm <defunct>
> 
> 
> 
> Here the process perl(1249042) is waiting for a pipe, while its child
> process nm(1249077) is in the status "defunct". nm(1249077) appears in
> output of `ps aux`, but doesn't appear in output of `pstree -p -l`.
> 
> Is there any case, where a child process can exit unexpected, but the
> parent process still incorrectly wait for the child process forever?

I don't answer this question conclusively before consulting my lawyer.
It *might* be possible.  See fhandler_base_overlapped::close(),
fhandler_base_overlapped::check_later() and flush_async_io(), all three
in fhandler.cc.

No, I'm not happy with using overlapped IO for pipes.  I'd prefer to use
nonblocking pipes instead, but the most required resource to fix this,
which is least available, is time :(


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Randomly hang when compiling Cygwin on Cygwin on Wine
  2015-11-02 14:30   ` Corinna Vinschen
@ 2016-03-15 14:10     ` Qian Hong
  0 siblings, 0 replies; 4+ messages in thread
From: Qian Hong @ 2016-03-15 14:10 UTC (permalink / raw)
  To: cygwin

On Mon, Nov 2, 2015 at 10:30 PM, Corinna Vinschen
<corinna-cygwin@cygwin.com> wrote:
> I don't answer this question conclusively before consulting my lawyer.

Me too, but my lawyer was much more slower, his name is Mr. Flash [1]. ;)

> It *might* be possible.  See fhandler_base_overlapped::close(),
> fhandler_base_overlapped::check_later() and flush_async_io(), all three
> in fhandler.cc.

Thanks a lot, your information helps a lot.

The bug is hard, it takes a lot of time to reproduce, but finally we got it!

It is a race condition in the Wine side:

Firstly, the reader process calls NtReadFile(), in Wine's
implementation, NtReadFile() tries to read some data using
read_unix_fd() first, if there is enough data needed it will returns
STATUS_SUCCESS immediately, if no enough data available it will
register an async read service using register_async() and returns
STATUS_PENDING. Ideally this async service will either get some data
and wake up the reader process if more data coming form writer, or get
nothing but wake up the reader process telling the pipe is broken if
the writer close the pipe.

However, unfortunately, in some very rare case, the writer process
close the writer handle at a very special time point, a little bit
later then read_unix_fd() but a little bit sooner than
register_async(), as a result, the async read service in the reader
side has no chance to be noticed about closing of writer any more
because it is already closed.

That's why the reader process waits forever on the async overlapped event.

The solution we worked out is: If the writer is closed, then the
reader register_async() should fail with STATUS_PIPE_BROKEN
immediately, so the reader won't try to call WaitForSingleObject on
the async overlapped event at all.


Patch from Sebastian Lackner:
Subject: server: Do not allow to queue async operation for broken pipes.
https://github.com/wine-compholio/wine-staging/commit/7b40b6922a0c7a93ff62b69dd2b3c2a6076be851

We've tested this patches for a couple of weeks, and it has been
confirmed working, gcc compiles more than 64 times without any
hanging!

I'm happy to close this thread, next goal will be other Wine+Cygwin bugs... :)

[1] http://disney.wikia.com/wiki/Flash

-- 
Regards,
Qian Hong

-
http://www.winehq.org

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-03-15 14:10 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-28 13:28 Randomly hang when compiling Cygwin on Cygwin on Wine Qian Hong
2015-11-01 19:09 ` Qian Hong
2015-11-02 14:30   ` Corinna Vinschen
2016-03-15 14:10     ` Qian Hong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).