From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pedro Alves
To: gdb-patches@sourceware.org
Subject: [PATCH 3/5] libthread_db: Skip attaching to terminated and joined threads
Date: Tue, 16 Dec 2014 16:54:00 -0000
Message-Id: <1418748834-27545-4-git-send-email-palves@redhat.com>
In-Reply-To: <1418748834-27545-1-git-send-email-palves@redhat.com>
References: <1418748834-27545-1-git-send-email-palves@redhat.com>
X-SW-Source: 2014-12/txt/msg00450.txt.bz2

I wrote a test that attaches to a program that constantly spawns
short-lived threads, which exposed several issues.  This is one of
them.

On GNU/Linux, attaching to a multi-threaded program sometimes prints
out warnings like:

 ...
 [New LWP 20700]
 warning: unable to open /proc file '/proc/-1/status'
 [New LWP 20850]
 [New LWP 21019]
 ...

That happens because when a thread exits, and is joined, glibc does:

 nptl/pthread_join.c:
 pthread_join ()
 {
 ...
   if (__glibc_likely (result == 0))
     {
       /* We mark the thread as terminated and as joined.  */
       pd->tid = -1;
 ...
       /* Free the TCB.  */
       __free_tcb (pd);
     }

So if we attach or interrupt the program (which does an implicit "info
threads") at just the right (or rather, wrong) time, we can find and
return threads in the libthread_db/pthreads thread list with kernel
thread ID -1.

I've filed glibc PR nptl/17707 for this.  You'll find more info there.

This patch handles this as a special case in GDB.

This is actually more than just a cosmetic issue.  lin_lwp_attach_lwp
will think that this -1 is an LWP we're not attached to yet, and after
failing to attach will try to check whether we were already attached
to the process, using a waitpid call.  In this case that call ends up
being "waitpid (-1, ...", which waits for any child rather than for
one specific LWP, so GDB can end up discarding an event when it
shouldn't...

Tested on x86_64 Fedora 20, native and gdbserver.

gdb/gdbserver/
2014-12-16  Pedro Alves

	* thread-db.c (find_new_threads_callback): Ignore thread if the
	kernel thread ID is -1.

gdb/
2014-12-16  Pedro Alves

	* linux-nat.c (lin_lwp_attach_lwp): Assert that the lwp id we're
	about to wait for is > 0.
	* linux-thread-db.c (find_new_threads_callback): Ignore thread if
	the kernel thread ID is -1.
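
The waitpid hazard described above can be illustrated outside of GDB.
What follows is only a minimal standalone sketch, not code from this
patch: the helper name checked_waitpid is hypothetical, and it returns
an error where the patch itself uses gdb_assert.  It relies only on the
documented waitpid behavior that a pid of -1 means "any child" and a
pid of 0 means "any child in the caller's process group".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper, not part of GDB: wait for one specific LWP.
   waitpid treats pid == -1 as "any child" and pid == 0 as "any child
   in the caller's process group", so a bogus id has to be rejected
   before the call instead of being passed through.  */
static pid_t
checked_waitpid (pid_t lwpid, int *status, int options)
{
  assert (status != NULL);

  if (lwpid <= 0)
    {
      /* Refuse to turn a per-LWP wait into a wait for any child.  */
      errno = EINVAL;
      return -1;
    }

  return waitpid (lwpid, status, options);
}
```

With a guard like this, a thread list entry carrying kernel thread ID
-1 produces an explicit error instead of silently consuming an event
that belongs to some other child.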
---
 gdb/gdbserver/thread-db.c | 11 +++++++++++
 gdb/linux-nat.c           |  1 +
 gdb/linux-thread-db.c     | 11 +++++++++++
 3 files changed, 23 insertions(+)

diff --git a/gdb/gdbserver/thread-db.c b/gdb/gdbserver/thread-db.c
index ac94892..2d9980d 100644
--- a/gdb/gdbserver/thread-db.c
+++ b/gdb/gdbserver/thread-db.c
@@ -396,6 +396,17 @@ find_new_threads_callback (const td_thrhandle_t *th_p, void *data)
   if (err != TD_OK)
     error ("Cannot get thread info: %s", thread_db_err_str (err));
 
+  if (ti.ti_lid == -1)
+    {
+      /* A thread with kernel thread ID -1 is either a thread that
+         exited and was joined, or a thread that is being created but
+         hasn't started yet, and that is reusing the tcb/stack of a
+         thread that previously exited and was joined.  (glibc marks
+         terminated and joined threads with kernel thread ID -1.  See
+         glibc PR17707.)  */
+      return 0;
+    }
+
   /* Check for zombies.  */
   if (ti.ti_state == TD_THR_UNKNOWN || ti.ti_state == TD_THR_ZOMBIE)
     return 0;
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index c6b5280..828064f 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1023,6 +1023,7 @@ lin_lwp_attach_lwp (ptid_t ptid)
 
           /* See if we've got a stop for this new child pending.
              If so, we're already attached.  */
+          gdb_assert (lwpid > 0);
           new_pid = my_waitpid (lwpid, &status, WNOHANG);
           if (new_pid == -1 && errno == ECHILD)
             new_pid = my_waitpid (lwpid, &status, __WCLONE | WNOHANG);
diff --git a/gdb/linux-thread-db.c b/gdb/linux-thread-db.c
index a405603..4b26984 100644
--- a/gdb/linux-thread-db.c
+++ b/gdb/linux-thread-db.c
@@ -1606,6 +1606,17 @@ find_new_threads_callback (const td_thrhandle_t *th_p, void *data)
     error (_("find_new_threads_callback: cannot get thread info: %s"),
            thread_db_err_str (err));
 
+  if (ti.ti_lid == -1)
+    {
+      /* A thread with kernel thread ID -1 is either a thread that
+         exited and was joined, or a thread that is being created but
+         hasn't started yet, and that is reusing the tcb/stack of a
+         thread that previously exited and was joined.  (glibc marks
+         terminated and joined threads with kernel thread ID -1.  See
+         glibc PR17707.)  */
+      return 0;
+    }
+
   if (ti.ti_tid == 0)
     {
       /* A thread ID of zero means that this is the main thread, but
-- 
1.9.3
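
For readers without a GDB tree at hand, the filter that this patch adds
to both find_new_threads_callback functions can be sketched standalone.
This is an illustration under stated assumptions, not the patch itself:
struct thread_entry, thread_is_attachable and count_attachable are
hypothetical names, and the struct is a simplified stand-in for the
td_thrinfo_t fields (ti_lid, ti_state) that the real callbacks read.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, simplified stand-in for the td_thrinfo_t fields the
   callbacks look at; the real structure comes from <thread_db.h>.  */
struct thread_entry
{
  pid_t lid;      /* Kernel thread ID, playing the role of ti_lid.  */
  int is_zombie;  /* Nonzero for states the callbacks already skip.  */
};

/* Return nonzero if the entry names a live LWP worth attaching to.  */
static int
thread_is_attachable (const struct thread_entry *te)
{
  /* glibc sets the tid to -1 once a thread has been joined, and the
     same TCB may be reused by a thread that has not started yet.  */
  if (te->lid == -1)
    return 0;

  if (te->is_zombie)
    return 0;

  return 1;
}

/* Count the attachable entries in LIST, which has N elements.  */
static size_t
count_attachable (const struct thread_entry *list, size_t n)
{
  size_t count = 0;

  assert (list != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (thread_is_attachable (&list[i]))
      count++;

  return count;
}
```

Given a list with LWP ids 20700, -1 and 21019, only the first and last
entries pass the check, which matches what the warning in the commit
message shows: the -1 entry is the one that must never reach the
attach path.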