Message-ID: <544CE1F7.5050603@cornell.edu>
Date: Sun, 26 Oct 2014 11:58:00 -0000
From: Ken Brown
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: cygwin@cygwin.com
Subject: Re: Threads
References: <54450835.3050602@cornell.edu> <5448E6F9.8040005@dronecode.org.uk> <5448EEBF.3020706@cornell.edu> <20141023153730.GC20607@calimero.vinschen.de> <544A327E.9090006@dronecode.org.uk> <20141024125416.GK20607@calimero.vinschen.de> <20141024135231.GL20607@calimero.vinschen.de>
In-Reply-To: <20141024135231.GL20607@calimero.vinschen.de>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
X-SW-Source: 2014-10/txt/msg00437.txt.bz2

On 10/24/2014 9:52 AM, Corinna Vinschen wrote:
> On Oct 24 14:54, Corinna Vinschen wrote:
>> On Oct 24 12:05, Jon TURNEY wrote:
>>> On 23/10/2014 16:37, Corinna Vinschen wrote:
>>>> On Oct 23 08:04, Ken Brown wrote:
>>>>> Yes, flags register corruption is exactly what Eli suggested in the other
>>>>> bug report I cited.
>>>>
>>>> The aforementioned patch was supposed to fix this problem and it is
>>>> definitely in the current 1.7.32 release...
>>>
>>> I didn't mean to suggest otherwise, just that perhaps a similar problem
>>> exists now.
>>>
>>> So I made the attached test case to explore that.  Maybe I've made an
>>> obvious mistake with it, but on the face of it, it seems to demonstrate
>>> something...
>>>
>>> jon@tambora /
>>> $ gcc signal-stress.c -Wall -O0 -g
>>>
>>> jon@tambora /
>>> $ ./a
>>> failed: 2144210386 isn't equal to 2144210386, apparently
>>
>> So it checks i and j for equality, fails, and then comes up with
>> "42 isn't equal to 42"?  This is weird...
>>
>>> Note there is some odd load dependency.  For me, it works fine when it's the
>>> only thing running, but when I start up something CPU intensive, it often
>>> fails...
>>
>> That's... interesting.  I wonder if that only occurs in multi-core or
>> multi-CPU environments.  The fact that i and j are not the same when
>> testing, but then are the same when printf is called looks like an
>> out-of-order execution problem.
>>
>> Is it possible that we have to add CPU memory barriers to the sigdelayed
>> function to avoid stuff like this?
>
> I discussed this with my colleague Kai Tietz (many thanks to him from
> here), and we came up with a problem in sigdelayed in the 64 bit case:
> pushf is called *after* aligning the stack with andq.  This alignment
> potentially changes the CPU flag values so the restored flags are
> potentially not the flags when entering sigdelayed.
>
> I just applied a patch and created new snapshots on
> https://cygwin.com/snapshots/
>
> I couldn't reproduce the problem locally, so I'd be grateful if you
> could test if that fixes the problem in your testcase, Jon.

I tried Jon's testcase.  With cygwin-1.7.33-0.1, it failed within a few
minutes.  With cygwin-1.7.33-0.2, I ran it for over an hour with no problem,
with the system heavily loaded.  So it looks good so far.

> Ken, can you check if this snapshot helps emacs along, too?

The people who have been reporting frequent crashes are aware of the fix.
Now I just have to wait and hope I don't hear from them for a few days.

Ken

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
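
For readers who don't have the attachment: the kind of check discussed in this thread can be sketched roughly as below. This is not Jon's signal-stress.c, which isn't reproduced in this message; it is a minimal sketch assuming a periodic SIGALRM is enough to exercise signal delivery, and the names count_mismatches, on_alarm and arm_timer are made up for illustration. The idea is that if the flags register is clobbered by signal delivery between the compare and the conditional branch, two equal values can appear unequal.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical sketch, not the test case attached earlier in the thread. */

/* Counts delivered signals; the handler does nothing else. */
static volatile sig_atomic_t signals_seen = 0;

static void on_alarm(int signo)
{
    (void)signo;
    signals_seen++;
}

/* Arm a repeating SIGALRM every `usec` microseconds (0 disarms the timer).
   Returns 0 on success, -1 on failure. */
static int arm_timer(long usec)
{
    struct itimerval tv;

    memset(&tv, 0, sizeof tv);
    tv.it_interval.tv_usec = usec;
    tv.it_value.tv_usec = usec;
    return setitimer(ITIMER_REAL, &tv, NULL);
}

/* Compare two equal values `iterations` times while signals keep arriving.
   Returns how many times they compared unequal, or -1 if setup failed. */
long count_mismatches(long iterations)
{
    struct sigaction sa;
    long mismatches = 0;
    long n;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0 || arm_timer(1000) != 0)
        return -1;

    for (n = 0; n < iterations; n++) {
        /* volatile keeps the compiler from folding the comparison away */
        volatile int i = (int)n;
        volatile int j = (int)n;

        if (i != j)
            mismatches++;
    }

    arm_timer(0);
    return mismatches;
}
```

Calling count_mismatches() from a small main() and printing the result should give 0 wherever signal delivery preserves the flags; any nonzero count would point at the kind of corruption described above. As noted in the thread, extra CPU load made the original failure more likely, so a short run proves little.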