[Please don't CC me, just send mail to the list.  Thank you]

On Nov 21 15:11, Mikulas Patocka wrote:
> > Do you use a DLL built with optimization by any chance?  I wouldn't take
> > the backtraces too seriously in that case.  For debugging it helps a lot
> > to use a Cygwin DLL built without -O2.
>
> I use optimization. The stacktrace may contain some other functions that
> already finished but left the address on the stack.

There may also be functions completely missing on the stack.  I still
suggest building with -g only.

> > Btw., are you testing on 32 or 64 bit?
>
> On 32-bit. The rebuild of cygwin1.dll requires a large number of packages to
> create the documentation (including tex and java) and I haven't bloated

Java?!?

> the 64-bit cygwin installation with them yet. I wish it were possible to
> build the library without documentation and without such big dependencies.

You don't have to build the docs to build the DLL.  The make process
continues even if building the docs fails.

> > I'm testing on 64 bit.  I can't reproduce your backtrace, but I can
> > reproduce another one, which is related to thread_exit.  At one point
> > after a couple thousand runs through your testcase I have a variable
> > number of threads hanging in thread_exit, and a timer thread which is
> > unable to send its signal.  The other threads all hang in thread_exit,
> > waiting for a muto which is taken by a thread which doesn't exist
> > anymore.
>
> So you can - just for debugging - add a counter to thread local storage
> that is incremented when muto is taken and decremented when muto is
> released. If the thread exits, test the counter, if it is non-zero, print
> the backtrace or attach the debugger.

For instance.

> > That's a very serious downside of the muto implementation not
> > being able to recognize being abandoned.  I wonder if that shouldn't be
> > using a real OS mutex.
>
> That would hide the problem that a thread is exiting with locked muto, but
> not fix it.

See exit_thread and cgf-000017 in DevNotes.  This setup deliberately
locks the muto and then calls ExitThread.  The signal handler is supposed
to unlock the muto when the __SIGTHREADEXIT signal comes in, but for some
reason it sometimes doesn't.

It seems the problem here is that the SIGALRM signals are filling up the
signal pipe, so the __SIGTHREADEXIT signal is not actually delivered.

I have a local workaround, but it seems to open a can of worms.  I'm
going to take a step back for now and reevaluate what happens before
trying to apply even more hacks.

Ultimately the problem is that the cygtls area is accessed from other
threads (mainly the signal thread) without locking, and worse, that the
lock for the cygtls area is a member of _cygtls itself.

The latter certainly needs a patch, and I'm contemplating extending
cygheap::threadlist to become a per-thread structure containing the
_cygtls pointer, the thread ID, the main thread HANDLE, and the tls muto.
This should allow serializing access to the cygtls area in a way which
avoids the aforementioned problems without a complete redesign.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
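
Addendum: a rough sketch of the per-thread list entry idea described above,
in portable C++ rather than Cygwin code.  Everything in it is an assumption
made for illustration: FakeTls stands in for _cygtls, std::mutex stands in
for the tls muto, a void pointer stands in for the HANDLE, and ThreadEntry,
ThreadList, add, remove and with_tls are hypothetical names, not existing
Cygwin interfaces.  The only point it tries to show is that the lock lives
in the list entry, outside the area it protects, so another thread can still
take it safely after the owning thread has gone away.

```cpp
#include <algorithm>
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for _cygtls; the real structure is much larger.
struct FakeTls {
    int pending_signals = 0;
};

// Hypothetical per-thread list entry.  The lock is a member of the entry,
// not of the TLS area it guards.
struct ThreadEntry {
    std::mutex lock;                 // stand-in for the tls muto
    FakeTls *tls = nullptr;          // set to nullptr once the thread has exited
    std::uint32_t thread_id = 0;
    void *thread_handle = nullptr;   // stand-in for a Windows HANDLE
};

class ThreadList {
    std::mutex list_lock_;
    std::vector<std::shared_ptr<ThreadEntry>> entries_;

    // Look up an entry; the shared_ptr keeps it alive after removal.
    std::shared_ptr<ThreadEntry> find(std::uint32_t id) {
        std::lock_guard<std::mutex> guard(list_lock_);
        for (auto &e : entries_)
            if (e->thread_id == id)
                return e;
        return nullptr;
    }

public:
    void add(FakeTls *tls, std::uint32_t id, void *handle) {
        auto e = std::make_shared<ThreadEntry>();
        e->tls = tls;
        e->thread_id = id;
        e->thread_handle = handle;
        std::lock_guard<std::mutex> guard(list_lock_);
        entries_.push_back(std::move(e));
    }

    // Called when a thread exits: unlink the entry, then mark it dead
    // under its own lock so concurrent users see a consistent state.
    void remove(std::uint32_t id) {
        std::shared_ptr<ThreadEntry> e;
        {
            std::lock_guard<std::mutex> guard(list_lock_);
            auto it = std::find_if(entries_.begin(), entries_.end(),
                [id](const std::shared_ptr<ThreadEntry> &p) {
                    return p->thread_id == id;
                });
            if (it == entries_.end())
                return;
            e = *it;
            entries_.erase(it);
        }
        std::lock_guard<std::mutex> guard(e->lock);
        e->tls = nullptr;
    }

    // Run f on the thread's TLS area with the entry lock held.
    // Returns false if the thread is unknown or has already exited.
    template <typename F>
    bool with_tls(std::uint32_t id, F f) {
        std::shared_ptr<ThreadEntry> e = find(id);
        if (!e)
            return false;
        std::lock_guard<std::mutex> guard(e->lock);
        if (!e->tls)
            return false;
        f(*e->tls);
        return true;
    }
};
```

In this sketch a signal-delivering thread would call with_tls with the target
thread ID and simply get false back if the target has already exited, instead
of waiting on a lock owned by a thread that no longer exists.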