From: "snyder at bnl dot gov"
To: glibc-bugs@sourceware.org
Subject: [Bug nptl/16657] Lock elision breaks pthread_mutex_detroy
Date: Tue, 18 Mar 2014 15:31:00 -0000
X-Bugzilla-Product: glibc
X-Bugzilla-Component: nptl
X-Bugzilla-Version: 2.18
X-Bugzilla-Status: NEW

https://sourceware.org/bugzilla/show_bug.cgi?id=16657

scott snyder changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |snyder at bnl dot gov

--- Comment #9 from scott snyder ---
Just switched to a new machine and promptly ran into this problem...

The boost thread wrappers check for, and abort on, a failing return
from pthread_mutex_destroy:

    ~mutex()
    {
      BOOST_VERIFY(!posix::pthread_mutex_destroy(&m));
    }

Locking multiple times, as in the attached test case, is not necessary;
I can reproduce the problem with a sequence like this:

    pthread_mutex_t m;
    if (pthread_mutex_init(&m, NULL) != 0) abort();
    pthread_mutex_trylock(&m);
    pthread_mutex_unlock (&m);
    if (pthread_mutex_destroy (&m) != 0) abort();

I think I see what the problem is.

In pthread_mutex_lock, we have this sequence:

    if (__builtin_expect (type == PTHREAD_MUTEX_TIMED_NP, 1))
      {
        FORCE_ELISION (mutex, goto elision);

FORCE_ELISION sets the PTHREAD_MUTEX_ELISION_NP flag in the lock's type
field, and then proceeds to do the elided lock.  The owner/users fields
are not updated in this case.

In pthread_mutex_unlock, the first test fails because the elision bit is
set; the second succeeds, and we do the elided unlock, which again does
not change the owner/users fields:

    if (__builtin_expect (type, PTHREAD_MUTEX_TIMED_NP)
        == PTHREAD_MUTEX_TIMED_NP)
      {
        /* Always reset the owner field.  */
        ...
      }
    else if (__builtin_expect (type == PTHREAD_MUTEX_TIMED_ELISION_NP, 1))
      {
        /* Don't reset the owner/users fields for elision.  */
        return lll_unlock_elision (mutex->__data.__lock,
                                   PTHREAD_MUTEX_PSHARED (mutex));

In trylock, however, we use DO_ELISION rather than FORCE_ELISION:

    case PTHREAD_MUTEX_TIMED_ELISION_NP:
    elision:
      if (lll_trylock_elision (mutex->__data.__lock,
                               mutex->__data.__elision) != 0)
        break;
      /* Don't record the ownership.  */
      return 0;

    case PTHREAD_MUTEX_TIMED_NP:
      if (DO_ELISION (mutex))
        goto elision;
      /*FALL THROUGH*/

DO_ELISION does _not_ set the elision bit in the type field.  If the
lock was originally free, the trylock succeeds and does not adjust the
owner/users fields.  But when we then unlock after the trylock, the
elision bit is still clear, so we take the code path starting at the
comment

    /* Always reset the owner field.  */

which proceeds to decrement the users field even though the trylock
never incremented it, corrupting the lock.  pthread_mutex_destroy then
sees the lock as still in use and fails.

This problem might be fixable by changing the DO_ELISION in
pthread_mutex_trylock to FORCE_ELISION (though I haven't yet tried
that).

I have checked that if I do a pthread_mutex_lock/pthread_mutex_unlock
on the lock before the first trylock, then the problem goes away, as
expected from the above analysis (since the lock will set the elision
flag).
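
The bookkeeping described above can be sketched with a small toy model. This is not glibc code: `toy_mutex`, the `toy_*` functions, `run_sequence`, and the constants `KIND_TIMED` and `ELISION_FLAG` are all hypothetical names made up for illustration. The model assumes that elision is always attempted and always succeeds, and that destroy reports EBUSY whenever the users count is nonzero. It tracks only the type field and the users count, which should be enough to show why the unlock path chosen after a trylock matters.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the type values named in the report;
   the numeric values are arbitrary.  */
enum { KIND_TIMED = 0, ELISION_FLAG = 1 };

struct toy_mutex
{
  int kind;             /* models the lock's type field */
  unsigned int nusers;  /* models the users field */
  int held;             /* models "an elided lock is in progress" */
};

static void
toy_init (struct toy_mutex *m)
{
  m->kind = KIND_TIMED;
  m->nusers = 0;
  m->held = 0;
}

/* Models pthread_mutex_lock: the elision flag is forced on, and the
   elided lock does not touch the users count.  */
static void
toy_lock (struct toy_mutex *m)
{
  m->kind |= ELISION_FLAG;
  m->held = 1;
}

/* Models pthread_mutex_trylock.  With force == 0 the type field is left
   alone (the reported behaviour); with force != 0 the flag is set first
   (the change proposed in the report).  */
static int
toy_trylock (struct toy_mutex *m, int force)
{
  if (m->held)
    return EBUSY;
  if (force)
    m->kind |= ELISION_FLAG;
  m->held = 1;          /* elided: the users count is not recorded */
  return 0;
}

/* Models pthread_mutex_unlock: the path is chosen from the type field.  */
static void
toy_unlock (struct toy_mutex *m)
{
  assert (m->held);
  if (m->kind == KIND_TIMED)
    m->nusers--;        /* non-elided path: wraps around if it was 0 */
  /* Elided path: the users count is left untouched.  */
  m->held = 0;
}

/* Assumption: destroy fails when the lock still appears to have users.  */
static int
toy_destroy (const struct toy_mutex *m)
{
  return m->nusers != 0 ? EBUSY : 0;
}

/* Runs trylock/unlock/destroy, optionally preceded by lock/unlock,
   and returns the result of destroy.  */
static int
run_sequence (int lock_first, int force_in_trylock)
{
  struct toy_mutex m;
  toy_init (&m);
  if (lock_first)
    {
      toy_lock (&m);
      toy_unlock (&m);
    }
  toy_trylock (&m, force_in_trylock);
  toy_unlock (&m);
  return toy_destroy (&m);
}
```

In this model, `run_sequence (0, 0)` corresponds to the failing trylock/unlock/destroy sequence, `run_sequence (1, 0)` to the workaround of locking once before the first trylock, and `run_sequence (0, 1)` to the proposed change of forcing the flag in trylock. Whether the real fix behaves the same way would still need to be checked against glibc itself.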