From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48) id E55C03858429; Mon, 19 Sep 2022 03:38:34 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E55C03858429
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1663558714; bh=wzTNYUwiAaJDpd7EZiVYMsYoxA4xfP1TI1FuoA7UFLI=; h=From:To:Subject:Date:In-Reply-To:References:From; b=o8dxcHa9Q80cUK/9rGRsoEA1qSpUhw+ci7tDIc1fKKpWSTAJ8zu1QrazwzHv9YU+J D/jEDxkItWxSTql148rdOdXwb4huGdmk85jrsbEnaXz0X2ytaP6W5mcN5NWnM/NmmA kMoFRRO5X/RNbwWtSqu7vqerHgBytqKYSg2JzdVU=
From: "malteskarupke at fastmail dot fm"
To: glibc-bugs@sourceware.org
Subject: [Bug nptl/25847] pthread_cond_signal failed to wake up pthread_cond_wait due to a bug in undoing stealing
Date: Mon, 19 Sep 2022 03:38:32 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: nptl
X-Bugzilla-Version: 2.27
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: malteskarupke at fastmail dot fm
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: carlos at redhat dot com
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://sourceware.org/bugzilla/show_bug.cgi?id=25847

--- Comment #47 from Malte Skarupke ---
(In reply to Carlos O'Donell from comment #46)
> I'll have a look at this when I'm back from GNU Tools Cauldron. It's
> disappointing that there is still an interleaving that doesn't work. I've
> had a community member testing Frank Barrus' patch and that did seem to be
> correct.

I like Frank Barrus' patch, too.
I think it takes the condition variable in a good direction, that might
allow more cleanups. I was actually thinking of also submitting a patch
that goes in that direction. Where my patch from last year would force
signalers to wait more often, his patch claims to make it unnecessary for
signalers to ever wait. So I like his direction better, I just haven't
taken the time yet to properly understand/review it.

> Could you expand on this a bit? Do you mean to say your patch from last
> September resolves all the issues you have seen, including the new one?

The new issue turned out to be the same as the old issue. The mitigation
patch just makes it much less likely to happen. So yes, I think that my
patch would resolve all the issues. That being said, someone immediately
showed up in my comments to say that they tried my patch and quickly got
some unexplained hangs. I have not yet been able to check why that might
be. It's one of those "I only proved it correct" situations where I could
have made a subtle mistake while translating the change from TLA+ back to
C. Or maybe I need to re-check the patch against a newer version of glibc,
or maybe the person in my comments made a mistake. At this point I know
only this:

- It solves the issue in TLA+ when run with the same number of threads
- It passes the glibc tests
- I have carefully thought about it and have good reasons to think that it
  should solve the issue because it can't happen any more that a waiter
  steals a signal from a future waiter, which was the source of the problem

I'll try to find the time this week to look at the patch again in detail.

-- 
You are receiving this mail because:
You are on the CC list for the bug.