public inbox for libc-alpha@sourceware.org
* FAIL nptl/tst-robustpi4
@ 2017-01-26 15:29 Stefan Liebler
  2017-01-26 16:12 ` Carlos O'Donell
From: Stefan Liebler @ 2017-01-26 15:29 UTC
  To: libc-alpha

Hi,

On s390, I've noticed a FAIL in nptl/tst-robustpi4 in about 16 of 
10000 iterations of this testcase.
Does anyone else see failures here, too?

If the test fails, I get:
tst-robustpi4: ../nptl/pthread_mutex_lock.c:424: 
__pthread_mutex_lock_full: Assertion `INTERNAL_SYSCALL_ERRNO (e, __err) 
!= ESRCH || !robust' failed.
Didn't expect signal from child: got `Aborted'


The mutex is a "robust pi" one and the futex syscall returned ESRCH here, 
so the assertion fires.
However, the comment before the assertion claims:
/* ESRCH can happen only for non-robust PI mutexes where
    the owner of the lock died.  */

The coredumps show that the tf-thread has already finished
and that the do_test-thread called pthread_mutex_lock(&m1), which should 
return EOWNERDEAD (see nptl/tst-robust1.c:202 e = LOCK (&m1);).
Instead, the futex syscall returns ESRCH, and gdb shows:
(gdb) p/x m1->__data.__lock
= 0xc0000000
= FUTEX_OWNER_DIED | FUTEX_WAITERS
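
To make the decoding of the lock word explicit, here is a small sketch 
that splits a PI futex word into its flag bits and owner TID. The helper 
names are made up for illustration; the constants are defined locally and 
are assumed to match the values in <linux/futex.h>.

```c
#include <stdint.h>

/* Local copies of the futex word layout; assumed to match <linux/futex.h>. */
#define SKETCH_FUTEX_WAITERS    0x80000000u  /* someone is blocked on the lock */
#define SKETCH_FUTEX_OWNER_DIED 0x40000000u  /* previous owner exited while holding it */
#define SKETCH_FUTEX_TID_MASK   0x3fffffffu  /* remaining bits hold the owner TID */

/* Hypothetical helper: non-zero if the OWNER_DIED bit is set. */
int lock_owner_died(uint32_t word)
{
    return (word & SKETCH_FUTEX_OWNER_DIED) != 0;
}

/* Hypothetical helper: non-zero if the WAITERS bit is set. */
int lock_has_waiters(uint32_t word)
{
    return (word & SKETCH_FUTEX_WAITERS) != 0;
}

/* Hypothetical helper: owner TID stored in the word, 0 if there is no owner. */
uint32_t lock_owner_tid(uint32_t word)
{
    return word & SKETCH_FUTEX_TID_MASK;
}
```

For the value from the coredump, 0xc0000000, both flag bits are set and 
the TID field is 0, i.e. the word names no owner at all.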

Furthermore, the coredumps always show an even value in the round variable.
Thus tf is not joined (tst-robust1.c:190) before 
pthread_mutex_lock is called (tst-robust1.c:202).
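
For reference, the behaviour the test expects can be sketched roughly as 
below. This is not the code from tst-robust1.c, just a minimal stand-alone 
approximation: the function name lock_after_owner_died is made up, and the 
sketch always joins the thread first, so it shows the EOWNERDEAD path and 
does not exercise the race itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t m1;

/* Thread function: take the mutex and exit without unlocking it. */
static void *tf(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&m1);
    return NULL;
}

/* Hypothetical helper: returns the result of locking a robust PI mutex
   whose owner has already exited, or -1 if the setup failed.  */
int lock_after_owner_died(void)
{
    pthread_mutexattr_t a;
    pthread_t th;

    if (pthread_mutexattr_init(&a) != 0)
        return -1;
    if (pthread_mutexattr_setrobust(&a, PTHREAD_MUTEX_ROBUST) != 0
        || pthread_mutexattr_setprotocol(&a, PTHREAD_PRIO_INHERIT) != 0
        || pthread_mutex_init(&m1, &a) != 0) {
        pthread_mutexattr_destroy(&a);
        return -1;
    }
    pthread_mutexattr_destroy(&a);

    if (pthread_create(&th, NULL, tf, NULL) != 0)
        return -1;
    /* Joining first means the owner is completely gone before we lock.  */
    if (pthread_join(th, NULL) != 0)
        return -1;

    int e = pthread_mutex_lock(&m1);
    if (e == EOWNERDEAD) {
        /* Mark the state consistent again and release the mutex.  */
        pthread_mutex_consistent(&m1);
        pthread_mutex_unlock(&m1);
    }
    pthread_mutex_destroy(&m1);
    return e;
}
```

With the join in place the lock call is expected to return EOWNERDEAD; the 
failing runs correspond to the case where this join is skipped.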

If the do_test-thread waits a bit (e.g. doing something in a loop) 
before locking the mutex, the tf-thread has already called 
__exit_thread(pthread_create.c:478).
Then m1->__data.__lock is already marked with FUTEX_OWNER_DIED | 0 
before calling the futex syscall in pthread_mutex_lock 
(pthread_mutex_lock.c:411).
Then the futex syscall takes over the mutex by setting __lock to 
FUTEX_OWNER_DIED | do_test-TID and returns 0.
If I run the test several times with such a "wait-loop", I see no failures.

If the do_test-thread locks the mutex before __exit_thread() is called 
in tf-thread, the futex syscall sets FUTEX_WAITERS bit and blocks until 
tf-thread has exited.
Afterwards 0 is returned and m1->__data.__lock is 0xc0000000
= FUTEX_OWNER_DIED | FUTEX_WAITERS.
If I run the test several times with a "wait-loop" before 
pthread_testcancel in tf-thread, as described, I see no failures either.

It seems that a race between the futex and exit syscalls causes the 
futex syscall to return ESRCH.

I see these failures with Linux 4.8 / 4.9 running in a z/VM guest
as well as with 4.6 on an LPAR (but less often).

Bye
Stefan


Thread overview: 6+ messages
2017-01-26 15:29 FAIL nptl/tst-robustpi4 Stefan Liebler
2017-01-26 16:12 ` Carlos O'Donell
2017-01-26 16:22   ` Torvald Riegel
2018-06-29  6:55     ` FAIL nptl/tst-robustpi4 [BZ 23183] Stefan Liebler
2018-06-29  7:39       ` Florian Weimer
2018-06-29  8:22         ` Stefan Liebler
