From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tom Tromey
To: Thiago Jung Bauermann
Cc: gdb@sourceware.org
Subject: Re: GDB hangs with simple multi-threaded program on linux
References: <1279208729.14577.21.camel@hactar>
Date: Thu, 15 Jul 2010 18:44:00 -0000
In-Reply-To: <1279208729.14577.21.camel@hactar> (Thiago Jung Bauermann's message of "Thu, 15 Jul 2010 12:45:29 -0300")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: gdb-owner@sourceware.org
X-SW-Source: 2010-07/txt/msg00048.txt.bz2

--=-=-=
Content-length: 1032

>>>>> "Thiago" == Thiago Jung Bauermann writes:

Thiago> I'm struggling with an issue which perhaps you already faced or
Thiago> thought about...

I asked around about this, and it turns out that we have a patch in the
Fedora SRPM for it.

The approach in this patch seems to be racy.  Roland says we can do
better if we enable exit tracing.  I see this in linux-nat.c:

  /* Do not enable PTRACE_O_TRACEEXIT until GDB is more prepared to support
     read-only process state.  */

I wonder what that means :-)

Thiago> 1. Is it true that when the main thread exits but there are other
Thiago> threads in the thread group, then no SIGCHLD is generated to notify GDB
Thiago> that it exited (perhaps because such a SIGCHLD could be ambiguous and
Thiago> mean that the whole process exited)?

Yes, Roland said that no SIGCHLD is generated.

Thiago> 2. Is there a way for GDB to wait on just the main thread instead of on
Thiago> the whole process when it waits on a TID which is also the PID?

I guess not.

Tom

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=gdb-6.6-bz247354-leader-exit-fix.patch
Content-Description: leader-exit-fix.patch
Content-length: 4755

2007-07-08  Jan Kratochvil

        * linux-nat.c (linux_lwp_is_zombie): New function.
        (wait_lwp): Fix lockup on exit of the thread group leader.
        (linux_xfer_partial): Renamed to ...
        (linux_xfer_partial_lwp): ... here.
        (linux_xfer_partial): New function wrapping LINUX_XFER_PARTIAL_LWP.

2008-02-24  Jan Kratochvil

        Port to GDB-6.8pre.

Index: gdb-6.8.50.20081209/gdb/linux-nat.c
===================================================================
--- gdb-6.8.50.20081209.orig/gdb/linux-nat.c 2008-12-10 01:27:34.000000000 +0100
+++ gdb-6.8.50.20081209/gdb/linux-nat.c 2008-12-10 01:28:14.000000000 +0100
@@ -1981,6 +1981,31 @@ linux_handle_extended_wait (struct lwp_i
                     _("unknown ptrace event %d"), event);
 }
 
+static int
+linux_lwp_is_zombie (long lwp)
+{
+  char buffer[MAXPATHLEN];
+  FILE *procfile;
+  int retval = 0;
+
+  sprintf (buffer, "/proc/%ld/status", lwp);
+  procfile = fopen (buffer, "r");
+  if (procfile == NULL)
+    {
+      warning (_("unable to open /proc file '%s'"), buffer);
+      return 0;
+    }
+  while (fgets (buffer, sizeof (buffer), procfile) != NULL)
+    if (strcmp (buffer, "State:\tZ (zombie)\n") == 0)
+      {
+        retval = 1;
+        break;
+      }
+  fclose (procfile);
+
+  return retval;
+}
+
 /* Wait for LP to stop.  Returns the wait status, or 0 if the LWP has
    exited.  */
 
@@ -1988,16 +2013,31 @@ static int
 wait_lwp (struct lwp_info *lp)
 {
   pid_t pid;
-  int status;
+  int status = 0;
   int thread_dead = 0;
 
   gdb_assert (!lp->stopped);
   gdb_assert (lp->status == 0);
 
-  pid = my_waitpid (GET_LWP (lp->ptid), &status, 0);
-  if (pid == -1 && errno == ECHILD)
+  /* Thread group leader may have exited but we would lock up by WAITPID as it
+     waits on all its threads; __WCLONE is not applicable for the leader.
+     The thread leader restrictions is only a performance optimization here.
+     LINUX_NAT_THREAD_ALIVE cannot be used here as it requires a STOPPED
+     process; it gets ESRCH both for the zombie and for running processes.  */
+  if (is_lwp (lp->ptid) && GET_PID (lp->ptid) == GET_LWP (lp->ptid)
+      && linux_lwp_is_zombie (GET_LWP (lp->ptid)))
+    {
+      thread_dead = 1;
+      if (debug_linux_nat)
+        fprintf_unfiltered (gdb_stdlog, "WL: Threads leader %s vanished.\n",
+                            target_pid_to_str (lp->ptid));
+    }
+
+  if (!thread_dead)
     {
-      pid = my_waitpid (GET_LWP (lp->ptid), &status, __WCLONE);
+      pid = my_waitpid (GET_LWP (lp->ptid), &status, 0);
+      if (pid == -1 && errno == ECHILD)
+        pid = my_waitpid (GET_LWP (lp->ptid), &status, __WCLONE);
       if (pid == -1 && errno == ECHILD)
         {
           /* The thread has previously exited.  We need to delete it
@@ -4153,8 +4193,10 @@ linux_nat_xfer_osdata (struct target_ops
   return len;
 }
 
+/* Transfer from the specific LWP currently set by PID of INFERIOR_PTID.  */
+
 static LONGEST
-linux_xfer_partial (struct target_ops *ops, enum target_object object,
+linux_xfer_partial_lwp (struct target_ops *ops, enum target_object object,
                     const char *annex, gdb_byte *readbuf,
                     const gdb_byte *writebuf, ULONGEST offset, LONGEST len)
 {
@@ -4201,6 +4243,45 @@ linux_xfer_partial (struct target_ops *o
                              offset, len);
 }
 
+/* nptl_db expects being able to transfer memory just by specifying PID.
+   After the thread group leader exists the Linux kernel turns the task
+   into zombie no longer permitting accesses to its memory.
+   Transfer the memory from an arbitrary LWP_LIST entry in such case.  */
+
+static LONGEST
+linux_xfer_partial (struct target_ops *ops, enum target_object object,
+                    const char *annex, gdb_byte *readbuf,
+                    const gdb_byte *writebuf, ULONGEST offset, LONGEST len)
+{
+  LONGEST xfer;
+  struct lwp_info *lp;
+  /* Not using SAVE_INFERIOR_PTID already here for better performance.  */
+  struct cleanup *old_chain = NULL;
+  ptid_t inferior_ptid_orig = inferior_ptid;
+
+  errno = 0;
+  xfer = linux_xfer_partial_lwp (ops, object, annex, readbuf, writebuf,
+                                 offset, len);
+
+  for (lp = lwp_list; xfer == 0 && (errno == EACCES || errno == ESRCH)
+                      && lp != NULL; lp = lp->next)
+    {
+      if (!is_lwp (lp->ptid) || ptid_equal (lp->ptid, inferior_ptid_orig))
+        continue;
+
+      if (old_chain == NULL)
+        old_chain = save_inferior_ptid ();
+      inferior_ptid = BUILD_LWP (GET_LWP (lp->ptid), GET_LWP (lp->ptid));
+      errno = 0;
+      xfer = linux_xfer_partial_lwp (ops, object, annex, readbuf, writebuf,
+                                     offset, len);
+    }
+
+  if (old_chain != NULL)
+    do_cleanups (old_chain);
+  return xfer;
+}
+
 /* Create a prototype generic GNU/Linux target.  The client can
    override it with local methods.  */
 

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=gdb-6.6-bz247354-leader-exit-test.patch
Content-Description: leader-exit-test.patch
Content-length: 3546

2007-07-07  Jan Kratochvil

        * gdb.threads/leader-exit.c, gdb.threads/leader-exit.exp: New files.

--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ ./gdb/testsuite/gdb.threads/leader-exit.c 7 Jul 2007 15:21:57 -0000
@@ -0,0 +1,47 @@
+/* Clean exit of the thread group leader should not break GDB.
+
+   Copyright 2007 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 59 Temple Place - Suite 330,
+   Boston, MA 02111-1307, USA.  */
+
+#include <pthread.h>
+#include <assert.h>
+#include <unistd.h>
+
+static void *start (void *arg)
+{
+  for (;;)
+    pause ();
+  /* NOTREACHED */
+  assert (0);
+  return arg;
+}
+
+int main (void)
+{
+  pthread_t thread;
+  int i;
+
+  i = pthread_create (&thread, NULL, start, NULL); /* create1 */
+  assert (i == 0);
+
+  pthread_exit (NULL);
+  /* NOTREACHED */
+  assert (0);
+  return 0;
+}
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ ./gdb/testsuite/gdb.threads/leader-exit.exp 7 Jul 2007 15:21:57 -0000
@@ -0,0 +1,64 @@
+# Copyright (C) 2007 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+
+# Exit of the thread group leader should not break GDB.
+
+# This file was written by Jan Kratochvil.
+
+if $tracelevel then {
+    strace $tracelevel
+}
+
+set testfile "leader-exit"
+set srcfile ${testfile}.c
+set binfile ${objdir}/${subdir}/${testfile}
+
+if {[gdb_compile_pthreads "${srcdir}/${subdir}/${srcfile}" "${binfile}" executable {debug}] != "" } {
+    return -1
+}
+
+gdb_exit
+gdb_start
+gdb_reinitialize_dir $srcdir/$subdir
+gdb_load ${binfile}
+gdb_run_cmd
+
+proc stop_process { description } {
+    global gdb_prompt
+
+    # For this to work we must be sure to consume the "Continuing."
+    # message first, or GDB's signal handler may not be in place.
+    after 1000 {send_gdb "\003"}
+    gdb_expect {
+        -re "Program received signal SIGINT.*$gdb_prompt $"
+          {
+            pass $description
+          }
+        timeout
+          {
+            fail "$description (timeout)"
+          }
+    }
+}
+
+# Prevent races.
+sleep 8
+
+stop_process "Threads could be stopped"
+
+gdb_test "info threads" \
+         "\\* 2 Thread \[^\r\n\]* in \[^\r\n\]*" \
+         "Single thread has been left"

--=-=-=--
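
For readers who want to try the zombie check from the first patch outside of GDB, here is a minimal standalone sketch. It is not the patch's code: `lwp_is_zombie` and `exited_child_is_zombie` are hypothetical names, the check looks only at the state letter on the `State:` line of `/proc/<pid>/status` instead of comparing the whole line, and it uses an ordinary exited child process as a stand-in for an exited thread group leader. It assumes Linux with `/proc` mounted.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if /proc/<lwp>/status reports state 'Z'
   (zombie), 0 otherwise, including when the file cannot be opened.  */
static int
lwp_is_zombie (long lwp)
{
  char path[64];
  char line[256];
  FILE *procfile;
  int zombie = 0;

  assert (lwp > 0);
  snprintf (path, sizeof path, "/proc/%ld/status", lwp);
  procfile = fopen (path, "r");
  if (procfile == NULL)
    return 0;

  while (fgets (line, sizeof line, procfile) != NULL)
    {
      char state;

      /* The line looks like "State:\tZ (zombie)"; keep only the letter.  */
      if (sscanf (line, "State: %c", &state) == 1)
        {
          zombie = (state == 'Z');
          break;
        }
    }
  fclose (procfile);
  return zombie;
}

/* Hypothetical demo: fork a child that exits at once, wait for the exit
   without reaping it (WNOWAIT), and ask whether /proc shows it as a
   zombie.  Returns 1 or 0 from the check, or -1 on a system call error.  */
static int
exited_child_is_zombie (void)
{
  siginfo_t info;
  int result;
  pid_t child = fork ();

  if (child == -1)
    return -1;
  if (child == 0)
    _exit (0);

  if (waitid (P_PID, (id_t) child, &info, WEXITED | WNOWAIT) == -1)
    result = -1;
  else
    result = lwp_is_zombie ((long) child);

  waitpid (child, NULL, 0);   /* Reap the child so it does not linger.  */
  return result;
}
```

Calling `lwp_is_zombie ((long) getpid ())` from a running process should give 0, while `exited_child_is_zombie ()` should give 1. Like the check in the patch, this only samples the state at one instant, so it does nothing about the race mentioned in the message above.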