From: "pinskia at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libstdc++/110016] Possible miscodegen when inlining std::condition_variable::wait predicate causes deadlock
Date: Mon, 29 May 2023 06:43:24 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110016

--- Comment #10 from Andrew Pinski ---
(In reply to Andrew Pinski from comment #9)
> So I think this is a bug in your code:
>
> Inside substrate::threadPool_t::finish,
> we have:
>
>   finished = true;
>   haveWork.notify_all();
>
> If I change it to:
>   {
>     std::lock_guard lock{workMutex};
>     finished = true;
>     haveWork.notify_all();
>   }
>
> Then I don't get a deadlock at all.
> As I mentioned, I did think there was a race condition.
> Here is what I think happened:
> Thread26:                               thread 1
> checks finished, still false            sets finished to be true
> calls wait                              calls notify_all
> ...                                     notify_all happens
> finally gets into futex_wait syscall    ....
>
> And then thread26 never got the notification.
>
> With my change the check for finished has to wait till thread1 lets go of
> the mutex (and the other way around).

I should note that adding -fsanitize=thread increases the time between
checking finished and getting into the wait, which increases the chance of
hitting the race: finished changes to true and notify_all happens before the
wait has started.

I can prevent this race condition from showing up by adding:

  for (volatile int i = 0; i < 10000; i++) ;

right before the call to the finish method in main, because it forces a
"sleep" before the finished variable is set, which gives the other thread
enough time to get into the wait state. The race condition might still show
up, though, when work queue entries are added and the finish method is then
called.

It is not even the ABA issue mentioned in PR 98033, as the predicate never
changes from false to true and back to false. In this case it is false and
then true, but the thread never saw the change because it never got a
notification: it was not in the wait when the notifications were sent out.
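
For illustration, here is a minimal sketch of the locked variant described
above. Only workMutex, haveWork and finished are names from this report;
MiniPool, waitForFinish and runOnce are hypothetical stand-ins, and this is
not the actual substrate::threadPool_t code. The point it tries to show is
that the store to finished and the waiter's predicate check are serialized
by the same mutex, so the notification cannot fall into the gap between the
check and the wait.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the pool in the report; only workMutex,
// haveWork and finished are names taken from the thread.
struct MiniPool {
    std::mutex workMutex;
    std::condition_variable haveWork;
    bool finished = false;  // guarded by workMutex
    std::vector<std::thread> workers;

    explicit MiniPool(unsigned count) {
        for (unsigned i = 0; i < count; ++i)
            workers.emplace_back([this] { waitForFinish(); });
    }

    void waitForFinish() {
        std::unique_lock<std::mutex> lock{workMutex};
        // The predicate is evaluated with workMutex held, and wait()
        // releases the mutex atomically with blocking on the condition.
        haveWork.wait(lock, [this] { return finished; });
        assert(finished);
    }

    void finish() {
        {
            // Holding the lock means the store cannot land between a
            // worker's predicate check and its entry into the wait.
            std::lock_guard<std::mutex> lock{workMutex};
            finished = true;
            haveWork.notify_all();
        }
        for (auto &worker : workers)
            worker.join();
        workers.clear();
    }
};

// Returns true once every worker has observed finished and been joined.
bool runOnce(unsigned count) {
    MiniPool pool{count};
    pool.finish();
    return pool.workers.empty();
}
```

Calling runOnce in a loop should terminate every time with this variant. If
finished were instead an atomic written without taking workMutex, the lost
wakeup sketched in the timeline above would still be possible, since the
write could happen after the predicate check but before the thread blocks.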