public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Pedro Alves <pedro@palves.net>
To: gdb-patches@sourceware.org
Subject: [PATCH v2 22/29] Report thread exit event for leader if reporting thread exit events
Date: Wed, 13 Jul 2022 23:24:26 +0100	[thread overview]
Message-ID: <20220713222433.374898-23-pedro@palves.net> (raw)
In-Reply-To: <20220713222433.374898-1-pedro@palves.net>

If GDB sets the GDB_TO_EXIT option on a thread, then if the thread
disappears from the thread list, GDB expects to shortly see a thread
exit event for it.  See e.g., here, in
remote_target::update_thread_list():

    /* Do not remove the thread if we've requested to be
       notified of its exit.  For example, the thread may be
       displaced stepping, infrun will need to handle the
       exit event, and displaced stepping info is recorded
       in the thread object.  If we deleted the thread now,
       we'd lose that info.  */
    if ((tp->thread_options () & GDB_TO_EXIT) != 0)
      continue;

There's one scenario that is deleting a thread from the
remote/gdbserver thread list without ever reporting a corresponding
thread exit event though -- check_zombie_leaders.  This can lead to
GDB getting confused.  For example, with a following patch that
enables GDB_TO_EXIT whenever schedlock is enabled, we'd see this
regression:

 $ make check RUNTESTFLAGS="--target_board=native-extended-gdbserver" TESTS="gdb.threads/no-unwaited-for-left.exp"
 ...
 Running src/gdb/testsuite/gdb.threads/no-unwaited-for-left.exp ...
 FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)
 ... some more cascading FAILs ...

gdb.log shows:

 (gdb) continue
 Continuing.
 FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)

A passing run would have resulted in:

 (gdb) continue
 Continuing.
 No unwaited-for children left.
 (gdb) PASS: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits

note how the leader thread is not listed in the remote-reported XML
thread list below:

 (gdb) set debug remote 1
 (gdb) set debug infrun 1
 (gdb) info threads
   Id   Target Id                                Frame
 * 1    Thread 1163850.1163850 "no-unwaited-for" main () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/no-unwaited-for-left.c:65
   3    Thread 1163850.1164130 "no-unwaited-for" [remote] Sending packet: $Hgp11c24a.11c362#39
 (gdb) c
 Continuing.
 [infrun] clear_proceed_status_thread: 1163850.1163850.0
 ...
     [infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=1, current thread [1163850.1163850.0] at 0x55555555534f
     [remote] Sending packet: $QPassSignals:#f3
     [remote] Packet received: OK
     [remote] Sending packet: $QThreadOptions;3:p11c24a.11c24a#f3
     [remote] Packet received: OK
 ...
     [infrun] target_set_thread_options: [options for Thread 1163850.1163850 are now 0x3]
 ...
   [infrun] do_target_resume: resume_ptid=1163850.1163850.0, step=0, sig=GDB_SIGNAL_0
   [remote] Sending packet: $vCont;c:p11c24a.11c24a#98
   [infrun] prepare_to_wait: prepare_to_wait
   [infrun] reset: reason=handling event
   [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target extended-remote
   [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target extended-remote
 [infrun] fetch_inferior_event: exit
 [infrun] fetch_inferior_event: enter
   [infrun] scoped_disable_commit_resumed: reason=handling event
   [infrun] random_pending_event_thread: None found.
   [remote] wait: enter
     [remote] Packet received: N
   [remote] wait: exit
   [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
   [infrun] print_target_wait_results:   -1.0.0 [process -1],
   [infrun] print_target_wait_results:   status->kind = NO_RESUMED
   [infrun] handle_inferior_event: status->kind = NO_RESUMED
   [remote] Sending packet: $Hgp0.0#ad
   [remote] Packet received: OK
   [remote] Sending packet: $qXfer:threads:read::0,1000#92
   [remote] Packet received: l<threads>\n<thread id="p11c24a.11c362" core="0" name="no-unwaited-for" handle="0097d8f7ff7f0000"/>\n</threads>\n
   [infrun] handle_no_resumed: TARGET_WAITKIND_NO_RESUMED (ignoring: found resumed)
 ...

... however, infrun decided there was a resumed thread still, so
ignored the TARGET_WAITKIND_NO_RESUMED event.  Debugging GDB, we see
that the "found resumed" thread that GDB finds, is the leader thread.
Even though that thread is not on the remote-reported thread list, it
is still on the GDB thread list, due to the special case in remote.c
mentioned above.

This commit addresses the issue by fixing GDBserver to report a thread
exit event for the zombie leader too, i.e., making GDBserver respect
the "if thread has GDB_TO_EXIT set, report a thread exit" invariant.
To do that, it takes a bit more code than one would imagine off hand,
due to the fact that we currently always report LWP exit pending
events as TARGET_WAITKIND_EXITED, and then decide whether to convert
it to TARGET_WAITKIND_THREAD_EXITED just before reporting the event to
GDBserver core.  For the zombie leader scenario described, we need to
record early on that we want to report a THREAD_EXITED event, and then
make sure that decision isn't lost along the way to reporting the
event to GDBserver core.

Change-Id: I1e68fccdbc9534434dee07163d3fd19744c8403b
---
 gdbserver/linux-low.cc | 75 ++++++++++++++++++++++++++++++++++++------
 gdbserver/linux-low.h  |  5 +--
 2 files changed, 68 insertions(+), 12 deletions(-)

diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
index 9858e228675..b40b50e64d2 100644
--- a/gdbserver/linux-low.cc
+++ b/gdbserver/linux-low.cc
@@ -279,7 +279,8 @@ int using_threads = 1;
 static int stabilizing_threads;
 
 static void unsuspend_all_lwps (struct lwp_info *except);
-static void mark_lwp_dead (struct lwp_info *lwp, int wstat);
+static void mark_lwp_dead (struct lwp_info *lwp, int wstat,
+			   bool thread_event);
 static int lwp_is_marked_dead (struct lwp_info *lwp);
 static int kill_lwp (unsigned long lwpid, int signo);
 static void enqueue_pending_signal (struct lwp_info *lwp, int signal, siginfo_t *info);
@@ -1800,10 +1801,12 @@ iterate_over_lwps (ptid_t filter,
   return get_thread_lwp (thread);
 }
 
-void
+bool
 linux_process_target::check_zombie_leaders ()
 {
-  for_each_process ([this] (process_info *proc)
+  bool new_pending_event = false;
+
+  for_each_process ([&] (process_info *proc)
     {
       pid_t leader_pid = pid_of (proc);
       lwp_info *leader_lp = find_lwp_pid (ptid_t (leader_pid));
@@ -1872,9 +1875,19 @@ linux_process_target::check_zombie_leaders ()
 				"(it exited, or another thread execd), "
 				"deleting it.",
 				leader_pid);
-	  delete_lwp (leader_lp);
+
+	  thread_info *leader_thread = get_lwp_thread (leader_lp);
+	  if (report_exit_events_for (leader_thread))
+	    {
+	      mark_lwp_dead (leader_lp, W_EXITCODE (0, 0), true);
+	      new_pending_event = true;
+	    }
+	  else
+	    delete_lwp (leader_lp);
 	}
     });
+
+  return new_pending_event;
 }
 
 /* Callback for `find_thread'.  Returns the first LWP that is not
@@ -2334,7 +2347,7 @@ linux_process_target::filter_event (int lwpid, int wstat)
 	  /* Since events are serialized to GDB core, and we can't
 	     report this one right now.  Leave the status pending for
 	     the next time we're able to report it.  */
-	  mark_lwp_dead (child, wstat);
+	  mark_lwp_dead (child, wstat, false);
 	  return;
 	}
       else
@@ -2648,7 +2661,8 @@ linux_process_target::wait_for_event_filtered (ptid_t wait_ptid,
 
       /* Check for zombie thread group leaders.  Those can't be reaped
 	 until all other threads in the thread group are.  */
-      check_zombie_leaders ();
+      if (check_zombie_leaders ())
+	goto retry;
 
       auto not_stopped = [&] (thread_info *thread)
 	{
@@ -2895,6 +2909,17 @@ linux_process_target::filter_exit_event (lwp_info *event_child,
   struct thread_info *thread = get_lwp_thread (event_child);
   ptid_t ptid = ptid_of (thread);
 
+  if (ourstatus->kind () == TARGET_WAITKIND_THREAD_EXITED)
+    {
+      /* We're reporting a thread exit for the leader.  The exit was
+	 detected by check_zombie_leaders.  */
+      gdb_assert (is_leader (thread));
+      gdb_assert (report_exit_events_for (thread));
+
+      delete_lwp (event_child);
+      return ptid;
+    }
+
   /* Note we must filter TARGET_WAITKIND_SIGNALLED as well, otherwise
      if a non-leader thread exits with a signal, we'd report it to the
      core which would interpret it as the whole-process exiting.
@@ -3014,7 +3039,20 @@ linux_process_target::wait_1 (ptid_t ptid, target_waitstatus *ourstatus,
     {
       if (WIFEXITED (w))
 	{
-	  ourstatus->set_exited (WEXITSTATUS (w));
+	  /* If we already have the exit recorded in waitstatus, use
+	     it.  This will happen when we detect a zombie leader ,
+	     when we had GDB_TO_EXIT enabled for it.  We want to
+	     report its exit as TARGET_WAITKIND_THREAD_EXITED, as the
+	     whole process hasn't exited yet.  */
+	  const target_waitstatus &ws = event_child->waitstatus;
+	  if (ws.kind () != TARGET_WAITKIND_IGNORE)
+	    {
+	      gdb_assert (ws.kind () == TARGET_WAITKIND_EXITED
+			  || ws.kind () == TARGET_WAITKIND_THREAD_EXITED);
+	      *ourstatus = ws;
+	    }
+	  else
+	    ourstatus->set_exited (WEXITSTATUS (w));
 
 	  threads_debug_printf
 	    ("ret = %s, exited with retcode %d",
@@ -3722,8 +3760,15 @@ suspend_and_send_sigstop (thread_info *thread, lwp_info *except)
   send_sigstop (thread, except);
 }
 
+/* Mark LWP dead, with WSTAT as exit status pending to report later.
+   If THREAD_EVENT is true, interpret WSTAT as a thread exit event
+   instead of a process exit event.  This is meaningful for the leader
+   thread, as we normally report a process-wide exit event when we see
+   the leader exit, and a thread exit event when we see any other
+   thread exit.  */
+
 static void
-mark_lwp_dead (struct lwp_info *lwp, int wstat)
+mark_lwp_dead (struct lwp_info *lwp, int wstat, bool thread_event)
 {
   /* Store the exit status for later.  */
   lwp->status_pending_p = 1;
@@ -3732,9 +3777,19 @@ mark_lwp_dead (struct lwp_info *lwp, int wstat)
   /* Store in waitstatus as well, as there's nothing else to process
      for this event.  */
   if (WIFEXITED (wstat))
-    lwp->waitstatus.set_exited (WEXITSTATUS (wstat));
+    {
+      if (thread_event)
+	lwp->waitstatus.set_thread_exited (WEXITSTATUS (wstat));
+      else
+	lwp->waitstatus.set_exited (WEXITSTATUS (wstat));
+    }
   else if (WIFSIGNALED (wstat))
-    lwp->waitstatus.set_signalled (gdb_signal_from_host (WTERMSIG (wstat)));
+    {
+      gdb_assert (!thread_event);
+      lwp->waitstatus.set_signalled (gdb_signal_from_host (WTERMSIG (wstat)));
+    }
+  else
+    gdb_assert_not_reached ("unknown status kind");
 
   /* Prevent trying to stop it.  */
   lwp->stopped = 1;
diff --git a/gdbserver/linux-low.h b/gdbserver/linux-low.h
index 43c35009030..09a71b795d9 100644
--- a/gdbserver/linux-low.h
+++ b/gdbserver/linux-low.h
@@ -572,8 +572,9 @@ class linux_process_target : public process_stratum_target
 
   /* Detect zombie thread group leaders, and "exit" them.  We can't
      reap their exits until all other threads in the group have
-     exited.  */
-  void check_zombie_leaders ();
+     exited.  Returns true if we left any new event pending, false
+     otherwise.  */
+  bool check_zombie_leaders ();
 
   /* Convenience function that is called when we're about to return an
      event to the core.  If the event is an exit or signalled event,
-- 
2.36.0


  parent reply	other threads:[~2022-07-13 22:25 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-13 22:24 [PATCH v2 00/29] Step over thread clone and thread exit Pedro Alves
2022-07-13 22:24 ` [PATCH v2 01/29] displaced step: pass down target_waitstatus instead of gdb_signal Pedro Alves
2022-07-20 20:21   ` Simon Marchi
2022-07-13 22:24 ` [PATCH v2 02/29] linux-nat: introduce pending_status_str Pedro Alves
2022-07-20 20:38   ` Simon Marchi
2022-07-13 22:24 ` [PATCH v2 03/29] gdb/linux: Delete all other LWPs immediately on ptrace exec event Pedro Alves
2022-07-21  0:45   ` Simon Marchi
2022-07-13 22:24 ` [PATCH v2 04/29] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED Pedro Alves
2022-07-21  1:35   ` Simon Marchi
2022-10-17 18:54     ` Pedro Alves
2022-10-18 12:40     ` [PATCH] Don't explicitly set clone child ptrace options (was: Re: [PATCH v2 04/29] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED) Pedro Alves
2022-10-28 14:50       ` Simon Marchi
2022-12-12 20:13     ` [PATCH v2 04/29] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED Pedro Alves
2022-07-13 22:24 ` [PATCH v2 05/29] Support clone events in the remote protocol Pedro Alves
2022-07-21  2:25   ` Simon Marchi
2022-12-12 20:14     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 06/29] Avoid duplicate QThreadEvents packets Pedro Alves
2022-07-21  2:30   ` Simon Marchi
2022-12-12 20:14     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 07/29] Thread options & clone events (core + remote) Pedro Alves
2022-07-21  3:14   ` Simon Marchi
2022-10-27 19:55     ` [PATCH] enum_flags to_string (was: Re: [PATCH v2 07/29] Thread options & clone events (core + remote)) Pedro Alves
2022-10-28 10:26       ` [PATCH v2] " Pedro Alves
2022-10-28 11:08         ` [PATCH v3] " Pedro Alves
2022-10-28 15:59           ` Simon Marchi
2022-10-28 18:23             ` Pedro Alves
2022-10-31 12:47               ` Simon Marchi
2022-11-07 17:26                 ` [PATCH v5] " Pedro Alves
2022-11-07 18:29                   ` Simon Marchi
2022-11-08 14:56                     ` Pedro Alves
2022-12-12 20:15     ` [PATCH v2 07/29] Thread options & clone events (core + remote) Pedro Alves
2022-07-13 22:24 ` [PATCH v2 08/29] Thread options & clone events (native Linux) Pedro Alves
2022-07-21 12:38   ` Simon Marchi
2022-12-12 20:16     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 09/29] Thread options & clone events (Linux GDBserver) Pedro Alves
2022-07-21 13:11   ` Simon Marchi
2022-12-12 20:16     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 10/29] gdbserver: Hide and don't detach pending clone children Pedro Alves
2022-07-13 22:24 ` [PATCH v2 11/29] Remove gdb/19675 kfails (displaced stepping + clone) Pedro Alves
2022-07-13 22:24 ` [PATCH v2 12/29] Add test for stepping over clone syscall Pedro Alves
2022-07-13 22:24 ` [PATCH v2 13/29] all-stop/synchronous RSP support thread-exit events Pedro Alves
2022-07-13 22:24 ` [PATCH v2 14/29] gdbserver/linux-low.cc: Ignore event_ptid if TARGET_WAITKIND_IGNORE Pedro Alves
2022-07-13 22:24 ` [PATCH v2 15/29] Introduce GDB_TO_EXIT thread option, fix step-over-thread-exit Pedro Alves
2022-07-13 22:24 ` [PATCH v2 16/29] Implement GDB_TO_EXIT support for Linux GDBserver Pedro Alves
2022-07-13 22:24 ` [PATCH v2 17/29] Implement GDB_TO_EXIT support for native Linux Pedro Alves
2022-07-21 15:26   ` Simon Marchi
2022-12-12 20:17     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 18/29] gdb: clear step over information on thread exit (PR gdb/27338) Pedro Alves
2022-07-21 18:12   ` Simon Marchi
2022-12-12 20:18     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 19/29] stop_all_threads: (re-)enable async before waiting for stops Pedro Alves
2022-07-21 18:21   ` Simon Marchi
2022-12-12 20:18     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 20/29] gdbserver: Queue no-resumed event after thread exit Pedro Alves
2022-07-13 22:24 ` [PATCH v2 21/29] Don't resume new threads if scheduler-locking is in effect Pedro Alves
2022-07-14  5:28   ` Eli Zaretskii
2022-07-21 18:49   ` Simon Marchi
2022-12-12 20:19     ` Pedro Alves
2022-07-13 22:24 ` Pedro Alves [this message]
2022-07-13 22:24 ` [PATCH v2 23/29] Ignore failure to read PC when resuming Pedro Alves
2022-07-13 22:24 ` [PATCH v2 24/29] gdb/testsuite/lib/my-syscalls.S: Refactor new SYSCALL macro Pedro Alves
2022-07-13 22:24 ` [PATCH v2 25/29] Testcases for stepping over thread exit syscall (PR gdb/27338) Pedro Alves
2022-07-13 22:24 ` [PATCH v2 26/29] Document remote clone events, and QThreadOptions packet Pedro Alves
2022-07-14  5:27   ` Eli Zaretskii
2022-07-13 22:24 ` [PATCH v2 27/29] inferior::clear_thread_list always silent Pedro Alves
2022-07-13 22:24 ` [PATCH v2 28/29] Centralize "[Thread ...exited]" notifications Pedro Alves
2022-07-13 22:24 ` [PATCH v2 29/29] Cancel execution command on thread exit, when stepping, nexting, etc Pedro Alves
2022-07-21 19:28 ` [PATCH v2 00/29] Step over thread clone and thread exit Simon Marchi
2022-10-03 13:46 ` Tom de Vries
2022-10-03 18:37   ` Tom de Vries
2022-12-12 20:20     ` Pedro Alves
2022-12-12 20:19   ` Pedro Alves

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220713222433.374898-23-pedro@palves.net \
    --to=pedro@palves.net \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).