public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Kevin Buettner <kevinb@redhat.com>
To: gdb-patches@sourceware.org
Cc: Kevin Buettner <kevinb@redhat.com>
Subject: [PATCH 1/2] linux-nat.c, linux-fork.c: Fix detach bug when lwp has exited/terminated
Date: Sat, 11 Nov 2023 15:26:28 -0700	[thread overview]
Message-ID: <20231111223046.109727-2-kevinb@redhat.com> (raw)
In-Reply-To: <20231111223046.109727-1-kevinb@redhat.com>

When using GDB on native linux, it can happen that, while attempting
to detach an inferior, the inferior may have been exited or have been
killed, yet still be in the list of lwps.  Should that happen, the
assert in x86_linux_update_debug_registers in
gdb/nat/x86-linux-dregs.c will trigger.  The line in question looks
like this:

  gdb_assert (lwp_is_stopped (lwp));

For this case, the lwp isn't stopped - it's dead.

The bug which brought this problem to my attention is one in which the
pwntools library uses GDB to to debug a process; as the script is
shutting things down, it kills the process that GDB is debugging and
also sends GDB a SIGTERM signal, which causes GDB to detach all
inferiors prior to exiting.  Here's a link to the bug:

https://bugzilla.redhat.com/show_bug.cgi?id=2192169

The following shell command mimics part of what the pwntools
reproducer script does (with regard to shutting things down), but
reproduces the bug much less reliably.  I have found it necessary to
run the command a bunch of times before seeing the bug.  (I usually
see it within 5-10 repetitions.)  If you choose to try this command,
make sure that you have no running "cat" or "gdb" processes first!

  cat </dev/zero >/dev/null & \
  (sleep 5; (kill -KILL `pgrep cat` & kill -TERM `pgrep gdb`)) & \
  sleep 1 ; \
  gdb -q -iex 'set debuginfod enabled off' -ex 'set height 0' \
      -ex c /usr/bin/cat `pgrep cat`

So, basically, the idea here is to kill both gdb and cat at roughly
the same time.  If we happen to attempt the detach before the process
lwp has been deleted from GDB's (linux native) LWP data structures,
then the assert will trigger.  The relevant part of the backtrace
looks like this:

  #8  0x00000000008a83ae in x86_linux_update_debug_registers (lwp=0x1873280)
      at gdb/nat/x86-linux-dregs.c:146
  #9  0x00000000008a862f in x86_linux_prepare_to_resume (lwp=0x1873280)
      at gdb/nat/x86-linux.c:81
  #10 0x000000000048ea42 in x86_linux_nat_target::low_prepare_to_resume (
      this=0x121eee0 <the_amd64_linux_nat_target>, lwp=0x1873280)
      at gdb/x86-linux-nat.h:70
  #11 0x000000000081a452 in detach_one_lwp (lp=0x1873280, signo_p=0x7fff8ca3441c)
      at gdb/linux-nat.c:1374
  #12 0x000000000081a85f in linux_nat_target::detach (
      this=0x121eee0 <the_amd64_linux_nat_target>, inf=0x16e8f70, from_tty=0)
      at gdb/linux-nat.c:1450
  #13 0x000000000083a23b in thread_db_target::detach (
      this=0x1206ae0 <the_thread_db_target>, inf=0x16e8f70, from_tty=0)
      at gdb/linux-thread-db.c:1385
  #14 0x0000000000a66722 in target_detach (inf=0x16e8f70, from_tty=0)
      at gdb/target.c:2526
  #15 0x0000000000a8f0ad in kill_or_detach (inf=0x16e8f70, from_tty=0)
      at gdb/top.c:1659
  #16 0x0000000000a8f4fa in quit_force (exit_arg=0x0, from_tty=0)
      at gdb/top.c:1762
  #17 0x000000000070829c in async_sigterm_handler (arg=0x0)
      at gdb/event-top.c:1141

My colleague, Andrew Burgess, has done some recent work on other
problems with detach.  Upon hearing of this problem, he came up a test
case which reliably reproduces the problem and tests for a few other
problems as well.  In addition to testing detach when the inferior has
terminated due to a signal, it also tests detach when the inferior has
exited normally.  Andrew observed that the linux-native-only
"checkpoint" command would be affected too, so the test also tests
those cases when there's an active checkpoint.

For the LWP exit / termination case with no checkpoint, that's handled
via newly added checks of the waitstatus in detach_one_lwp in
linux-nat.c.

For the checkpoint detach problem, I chose to pass the lwp_info
to linux_fork_detach in linux-fork.c.  With that in place, suitable
tests were added before attempting a PTRACE_DETACH operation.

I added a few asserts at the beginning of linux_fork_detach and
modified the caller code so that the newly added asserts shouldn't
trigger.  (That's what the 'pid == inferior_ptid.pid' check is about
in gdb/linux-nat.c.)

Finally, the #include for linux-fork.h had to be moved after that
of linux-nat.c so that the type of lwp_info would be known.

Lastly, I'll note that the checkpoint code needs some work with
regard to background execution.  This patch doesn't attempt to
fix that problem, but it doesn't make it any worse.  It does
slightly improve the situation with detach because, due to the
check noted above, linux_fork_detach() won't be called for the
wrong inferior when there are multiple inferiors.  (I haven't
checked closely, but I don't think that the rest of the checkpoint
code works correctly when there are multiple inferiors.)
---
 gdb/linux-fork.c | 22 ++++++++++++++++------
 gdb/linux-fork.h |  2 +-
 gdb/linux-nat.c  | 17 +++++++++++++++--
 3 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/gdb/linux-fork.c b/gdb/linux-fork.c
index 52e385411c7..73d086938bd 100644
--- a/gdb/linux-fork.c
+++ b/gdb/linux-fork.c
@@ -25,13 +25,14 @@
 #include "gdbcmd.h"
 #include "infcall.h"
 #include "objfiles.h"
-#include "linux-fork.h"
 #include "linux-nat.h"
+#include "linux-fork.h"
 #include "gdbthread.h"
 #include "source.h"
 
 #include "nat/gdb_ptrace.h"
 #include "gdbsupport/gdb_wait.h"
+#include "target/waitstatus.h"
 #include <dirent.h>
 #include <ctype.h>
 
@@ -361,15 +362,24 @@ linux_fork_mourn_inferior (void)
    the first available.  */
 
 void
-linux_fork_detach (int from_tty)
+linux_fork_detach (int from_tty, lwp_info *lp)
 {
+  gdb_assert (lp != nullptr);
+  gdb_assert (lp->ptid == inferior_ptid);
+
   /* OK, inferior_ptid is the one we are detaching from.  We need to
      delete it from the fork_list, and switch to the next available
-     fork.  */
+     fork.  But before doing the detach, do make sure that the lwp
+     hasn't exited or been terminated first.  */
 
-  if (ptrace (PTRACE_DETACH, inferior_ptid.pid (), 0, 0))
-    error (_("Unable to detach %s"),
-	   target_pid_to_str (inferior_ptid).c_str ());
+  if (lp->waitstatus.kind () != TARGET_WAITKIND_EXITED
+      && lp->waitstatus.kind () != TARGET_WAITKIND_THREAD_EXITED
+      && lp->waitstatus.kind () != TARGET_WAITKIND_SIGNALLED)
+    {
+      if (ptrace (PTRACE_DETACH, inferior_ptid.pid (), 0, 0))
+	error (_("Unable to detach %s"),
+	       target_pid_to_str (inferior_ptid).c_str ());
+    }
 
   delete_fork (inferior_ptid);
 
diff --git a/gdb/linux-fork.h b/gdb/linux-fork.h
index 5a593fca91e..6e8f5ab7d0b 100644
--- a/gdb/linux-fork.h
+++ b/gdb/linux-fork.h
@@ -25,7 +25,7 @@ extern void add_fork (pid_t);
 extern struct fork_info *find_fork_pid (pid_t);
 extern void linux_fork_killall (void);
 extern void linux_fork_mourn_inferior (void);
-extern void linux_fork_detach (int);
+extern void linux_fork_detach (int, lwp_info *);
 extern int forks_exist_p (void);
 extern int linux_fork_checkpointing_p (int);
 
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 1c9756c18bd..bf2da572de0 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1354,6 +1354,19 @@ detach_one_lwp (struct lwp_info *lp, int *signo_p)
       lp->signalled = 0;
     }
 
+  /* If the lwp has exited or was terminated due to a signal, there's
+     nothing left to do.  */
+  if (lp->waitstatus.kind () == TARGET_WAITKIND_EXITED
+      || lp->waitstatus.kind () == TARGET_WAITKIND_THREAD_EXITED
+      || lp->waitstatus.kind () == TARGET_WAITKIND_SIGNALLED)
+    {
+      linux_nat_debug_printf
+        ("Can't detach %s - it has exited or was terminated: %s.",
+	 lp->ptid.to_string ().c_str (),
+	 lp->waitstatus.to_string ().c_str ());
+      return;
+    }
+
   if (signo_p == NULL)
     {
       /* Pass on any pending signal for this LWP.  */
@@ -1427,13 +1440,13 @@ linux_nat_target::detach (inferior *inf, int from_tty)
   gdb_assert (num_lwps (pid) == 1
 	      || (target_is_non_stop_p () && num_lwps (pid) == 0));
 
-  if (forks_exist_p ())
+  if (pid == inferior_ptid.pid () && forks_exist_p ())
     {
       /* Multi-fork case.  The current inferior_ptid is being detached
 	 from, but there are other viable forks to debug.  Detach from
 	 the current fork, and context-switch to the first
 	 available.  */
-      linux_fork_detach (from_tty);
+      linux_fork_detach (from_tty, find_lwp_pid (ptid_t (pid)));
     }
   else
     {
-- 
2.41.0


  reply	other threads:[~2023-11-11 22:31 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-11 22:26 [PATCH 0/2] " Kevin Buettner
2023-11-11 22:26 ` Kevin Buettner [this message]
2023-11-13 10:58   ` [PATCH 1/2] linux-nat.c, linux-fork.c: " Andrew Burgess
2023-11-14  1:36     ` Kevin Buettner
2023-11-14  5:48       ` Kevin Buettner
2023-11-11 22:26 ` [PATCH 2/2] New test: gdb.base/process-dies-while-detaching.exp Kevin Buettner
2023-11-13 11:04   ` Andrew Burgess

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231111223046.109727-2-kevinb@redhat.com \
    --to=kevinb@redhat.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).