From: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
To: Corinna Vinschen <corinna-cygwin@cygwin.com>
Cc: cygwin@cygwin.com
Subject: Re: Instability with signals and threads
Date: Fri, 21 Nov 2014 04:24:00 -0000 [thread overview]
Message-ID: <alpine.DEB.2.02.1411202055420.8559@artax.karlin.mff.cuni.cz> (raw)
Message-ID: <20141121042400.KGaROOXvivGQIGoiww6wbsFEJNNF6wLB1CcdCJy9XA8@z> (raw)
> Never mind that. I can fix your testcase by calling _my_tls.remove with
> INFINITE as parameter in both places. If I drop one of them, your
> testcase will invariable fail at one point. With both INFINITE params
> in place, your testcase is now running half an hour without problems.
For me, this change doesn't fix the testcase, it just reduces the
probability that it hangs.
With this change, the testcase still locks up, but with a different
stacktrace:
thread1:
Sleep
_yield
pthread::create
sigdelayed ??
_cygwin_exit_return ??
_cygtls::call2
thread2:
SetEvent
muto::release
init_cygheap::find_tls
_cygtls::init_thread
thread3:
WriteFile
sig_send
timer_thread
cygthread::callfunc
cygthread::stub
_cygtls::call2
thread4:
VirtualFree
thread_wrapper
thread5:
only ntdll stuff
So, apparently, there is another bug, where thread->cygtls isn't being set
and pthread::create loops endlessly calling yield.
> Thinking about it, the fact that _cygtls::remove allows to apply a
> non-INFINITE wait is rather strange, isn't it? Calling remove_tls with
> a 0 wait, it allows to return the function silently, without actually
> having removed the thread from the list. This is bound to go downhill
> at one point and looks like a kludge to me to circumvent some potential
> hang in another situation...
Looking at CVS history, the "wait" argument was added to cygtls.cc version
1.2 with a comment: "Add a 'wait' argument to control how long we wait for
a lock before removing." There is no explanation why is it needed.
> I'm not exactly sure if that works as intended. I will apply this patch
> and create a new Cygwin snapshot on https://cygwin.com/snapshots/ in a
> couple of minutes. I'd appreciate if you and others would give it an
> exhaustive test. New spurious hangs or SEGVs in other situations which
> so far worked fine would be good indicators for another problem in the
> code.
Yes, I think it's correct to remove the wait argument.
> Other than that, there's certainly some room for improvement. Calling
> threadlist[idx]->remove from the find_tls exception handler looks
> extremly hairy to me. I wonder if that should be called at all at this
> point, or if there shouldn't be better some "simplified" removal
> operation which doesn't require the _cygtls pointer. If the thread
> doesn't exist anymore, so does its _cygtls area.
I suggest to remove that exception handler at all. This thing can't ever
work reliably - it could reduce probability of crashes but not eliminate
them. Even if we handled the page fault correctly - what happens if some
other thread allocates a different object at the location that belonged to
the tls before? - then find_tls thinks that this different object is tls
and corrupts it.
I suggest to remove the exception handler and if it results in any
regressions, fix them properly.
Mikulas
> Thanks, Corinna --
> Corinna Vinschen
--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
next reply other threads:[~2014-11-20 20:24 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-20 20:24 Mikulas Patocka [this message]
2014-11-21 4:24 ` Mikulas Patocka
2014-11-21 10:28 ` Corinna Vinschen
2014-11-21 10:38 ` Corinna Vinschen
2014-11-21 14:43 ` Mikulas Patocka
2014-11-21 14:46 ` Mikulas Patocka
2014-11-21 14:47 ` Corinna Vinschen
2014-11-21 16:30 ` Marco Atzeri
2014-11-21 16:37 ` Corinna Vinschen
2014-11-21 20:36 ` Yaakov Selkowitz
2014-11-21 20:44 ` Corinna Vinschen
2014-11-28 21:13 ` Yaakov Selkowitz
2014-11-28 22:35 ` Corinna Vinschen
2014-12-04 9:42 ` Corinna Vinschen
-- strict thread matches above, loose matches on Subject: below --
2014-11-19 16:43 Mikulas Patocka
2014-11-19 17:30 ` Mikulas Patocka
2014-11-20 10:00 ` Corinna Vinschen
2014-11-20 16:22 ` Corinna Vinschen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.DEB.2.02.1411202055420.8559@artax.karlin.mff.cuni.cz \
--to=mikulas@artax.karlin.mff.cuni.cz \
--cc=corinna-cygwin@cygwin.com \
--cc=cygwin@cygwin.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).