From: "thiago at kde dot org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libstdc++/54172] New: [4.7 Regression] __cxa_guard_acquire thread-safety issue
Date: Sat, 04 Aug 2012 11:37:00 -0000

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54172

             Bug #: 54172
            Summary: [4.7 Regression] __cxa_guard_acquire thread-safety
                     issue
     Classification: Unclassified
            Product: gcc
            Version: 4.7.0
             Status: UNCONFIRMED
           Severity: major
           Priority: P3
          Component: libstdc++
         AssignedTo: unassigned@gcc.gnu.org
         ReportedBy: thiago@kde.org

Created attachment 27936
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27936
Proposed fix.

In commit 184110, the __cxa_guard_acquire implementation in
libsupc++/guard.cc was updated to use the new __atomic_* intrinsics
instead of the old __sync_* ones. I believe this introduced a
regression due to a race condition.

== Proof ==

While debugging a program, I set a hardware watchpoint on a guard
variable and configured gdb to continue execution each time it
triggered. The output was:

Hardware watchpoint 1: _ZGVZN12_GLOBAL__N_121Q_QGS_textCodecsMutex13innerFunctionEvE6holder

Old value = 0
New value = 256
0x000000381205f101 in __cxxabiv1::__cxa_guard_acquire (g=0x7ffff7dc9a60)
    at ../../../../libstdc++-v3/libsupc++/guard.cc:254
254       if (__atomic_compare_exchange_n(gi, &expected, pending_bit, false,

Hardware watchpoint 1: _ZGVZN12_GLOBAL__N_121Q_QGS_textCodecsMutex13innerFunctionEvE6holder

Old value = 256
New value = 1
__cxxabiv1::__cxa_guard_release (g=0x7ffff7dc9a60)
    at ../../../../libstdc++-v3/libsupc++/guard.cc:376
376       if ((old & waiting_bit) != 0)

[Switching to Thread 0x7fffebfff700 (LWP 113412)]

Hardware watchpoint 1: _ZGVZN12_GLOBAL__N_121Q_QGS_textCodecsMutex13innerFunctionEvE6holder

Old value = 1
New value = 256
0x000000381205f101 in __cxxabiv1::__cxa_guard_acquire (g=0x7ffff7dc9a60)
    at ../../../../libstdc++-v3/libsupc++/guard.cc:254
254       if (__atomic_compare_exchange_n(gi, &expected, pending_bit, false,

[New Thread 0x7ffff0a2d700 (LWP 113413)]

Hardware watchpoint 1: _ZGVZN12_GLOBAL__N_121Q_QGS_textCodecsMutex13innerFunctionEvE6holder

Old value = 256
New value = 1
__cxxabiv1::__cxa_guard_release (g=0x7ffff7dc9a60)
    at ../../../../libstdc++-v3/libsupc++/guard.cc:376
376       if ((old & waiting_bit) != 0)

As the output shows, the guard variable transitioned from
0 -> 256 -> 1 -> 256 -> 1. The second transition to 256 means a thread
re-acquired the guard after initialisation had already completed, so
the static gets initialised twice.
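For reference, any function-local static reached by two threads at
roughly the same time goes through this __cxa_guard_acquire /
__cxa_guard_release pair; the guard in the trace above belongs to such
a static. The following is a minimal, hypothetical reproducer sketch,
not the program from the trace; the sleep in the constructor only
widens the window during which the guard holds the pending value (256):

  // Hypothetical sketch: a function-local static hit by two threads
  // concurrently. The compiler wraps the initialisation of 'holder'
  // in __cxa_guard_acquire/__cxa_guard_release calls.
  #include <chrono>
  #include <thread>

  struct Holder
  {
    Holder()
    {
      // Keep the guard in the pending state (256) for a while.
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
  };

  Holder&
  instance()
  {
    static Holder holder;   // guarded initialisation
    return holder;
  }

  int
  main()
  {
    std::thread t1([]{ instance(); });
    std::thread t2([]{ instance(); });
    t1.join();
    t2.join();
  }

Note that even with the sleep, the bad interleaving described below
requires precise timing between the second thread's CAS and the first
thread's release, so a single run is not guaranteed to exhibit it.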
== Analysis ==

The code in guard.cc is:

  int expected(0);
  const int guard_bit = _GLIBCXX_GUARD_BIT;
  const int pending_bit = _GLIBCXX_GUARD_PENDING_BIT;
  const int waiting_bit = _GLIBCXX_GUARD_WAITING_BIT;

  while (1)
    {
      if (__atomic_compare_exchange_n(gi, &expected, pending_bit, false,
                                      __ATOMIC_ACQ_REL, __ATOMIC_RELAXED))
        {
          // This thread should do the initialization.
          return 1;
        }

      if (expected == guard_bit)
        {
          // Already initialized.
          return 0;
        }

      if (expected == pending_bit)
        {
          int newv = expected | waiting_bit;
          if (!__atomic_compare_exchange_n(gi, &expected, newv, false,
                                           __ATOMIC_ACQ_REL,
                                           __ATOMIC_RELAXED))
            continue;
          expected = newv;
        }

      syscall (SYS_futex, gi, _GLIBCXX_FUTEX_WAIT, expected, 0);
    }

We have two threads that both reach __cxa_guard_acquire at more or
less the same time.

On the first thread, execution takes the expected path: the first CAS
succeeds and transitions the guard variable from 0 to 256. That thread
proceeds to initialise the static.

On the second thread, the first CAS fails, so it proceeds to the
second CAS, trying to replace 256 with 768 (to indicate that it is
going to sleep). In the meantime, the first thread calls
__cxa_guard_release, which exchanges the 256 for a 1. Therefore, on
the second thread, the second CAS fails, and the intrinsic updates
expected to 1. The continue sends it back to the first CAS with
expected == 1, and since the guard variable is also 1, that CAS
succeeds and replaces the 1 with 256. This is wrong: the second thread
now initialises a static that has already been initialised.

== Solution ==

The issue is caused by the combination of the looping and the new
atomic intrinsics updating the expected variable on failure. When the
second CAS fails, the code needs to inspect the value the intrinsic
stored in expected to determine what to do next. The possible values
are:

  0:   the other thread aborted the initialisation; we should try
       again (continue)
  1:   initialisation completed; we should return 0
  256: cannot happen, since the CAS failed precisely because the value
       was no longer 256
  768: yet another thread succeeded in setting the waiting bit; we
       should go to sleep

The attached patch is a proposed solution to the problem, but I have
not been able to test it yet.
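For illustration only (this is not the attached patch, and details
such as memory orders or abort handling may differ from it), here is a
sketch of what the loop could look like with the case analysis above
folded in, reusing the names from guard.cc:

  // Sketch only; not the attached patch. On a failed CAS, act on the
  // value the intrinsic stored in expected instead of looping blindly.
  int expected(0);
  while (1)
    {
      if (__atomic_compare_exchange_n(gi, &expected, pending_bit, false,
                                      __ATOMIC_ACQ_REL, __ATOMIC_RELAXED))
        {
          // Guard was 0: this thread performs the initialisation.
          return 1;
        }

      if (expected == guard_bit)
        {
          // Guard is 1: initialisation has already completed.
          return 0;
        }

      if (expected == pending_bit)
        {
          int newv = expected | waiting_bit;
          if (!__atomic_compare_exchange_n(gi, &expected, newv, false,
                                           __ATOMIC_ACQ_REL,
                                           __ATOMIC_RELAXED))
            {
              // The failed CAS stored the current guard value in
              // expected; decide based on what was actually seen:
              if (expected == guard_bit)
                return 0;       // 1: initialisation completed
              if (expected == 0)
                continue;       // 0: the other thread aborted; retry
              // 768: another thread already set the waiting bit.
            }
          expected = newv;      // in both remaining cases *gi == 768
        }

      syscall (SYS_futex, gi, _GLIBCXX_FUTEX_WAIT, expected, 0);
      expected = 0;             // start afresh after waking up
    }

The essential point is that after a failed second CAS the code must
not fall back to the first CAS with expected left at 1, since that CAS
would then succeed and wrongly re-enter the pending state; and that
expected must be reset before retrying after a futex wake-up, so a
stale value cannot cause a spurious sleep.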