public inbox for gdb-patches@sourceware.org
From: Andrew Burgess <aburgess@redhat.com>
To: Pedro Alves <pedro@palves.net>, gdb-patches@sourceware.org
Subject: Re: [PATCH 24/31] Report thread exit event for leader if reporting thread exit events
Date: Fri, 09 Jun 2023 14:11:01 +0100
Message-ID: <87r0qkye56.fsf@redhat.com>
In-Reply-To: <20221212203101.1034916-25-pedro@palves.net>

Pedro Alves <pedro@palves.net> writes:

> If GDB sets the GDB_THREAD_OPTION_EXIT option on a thread, then if the
> thread disappears from the thread list, GDB expects to shortly see a
> thread exit event for it.  See e.g., here, in
> remote_target::update_thread_list():
>
>     /* Do not remove the thread if we've requested to be
>        notified of its exit.  For example, the thread may be
>        displaced stepping, infrun will need to handle the
>        exit event, and displaced stepping info is recorded
>        in the thread object.  If we deleted the thread now,
>        we'd lose that info.  */
>     if ((tp->thread_options () & GDB_THREAD_OPTION_EXIT) != 0)
>       continue;
>
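
For readers following along, the pruning rule quoted above can be
modeled in a few lines.  This is only a sketch under stated
assumptions: model_thread, OPTION_EXIT and prune_thread_list are
hypothetical names invented for illustration, not GDB's real types or
functions, and the bit value picked for the option is an assumption of
the model.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-ins for illustration; not GDB's real declarations.
constexpr unsigned OPTION_EXIT = 1u << 1;  // assumed bit value

struct model_thread
{
  long tid;
  unsigned options;
};

// Drop threads the target no longer lists, but keep any thread for
// which an exit event was requested: its exit event is still expected.
std::vector<model_thread>
prune_thread_list (const std::vector<model_thread> &known,
                   const std::set<long> &reported)
{
  std::vector<model_thread> kept;
  for (const model_thread &tp : known)
    {
      bool listed = reported.count (tp.tid) != 0;
      bool wants_exit = (tp.options & OPTION_EXIT) != 0;
      if (listed || wants_exit)
        kept.push_back (tp);
    }
  return kept;
}
```

In this model, a thread with OPTION_EXIT set stays in the list even
when the target stops reporting it, so it only goes away once
something else (the exit event) removes it.
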
> There's one scenario that is deleting a thread from the
> remote/gdbserver thread list without ever reporting a corresponding
> thread exit event though -- check_zombie_leaders.  This can lead to
> GDB getting confused.  For example, with a following patch that
> enables GDB_THREAD_OPTION_EXIT whenever schedlock is enabled, we'd see
> this regression:
>
>  $ make check RUNTESTFLAGS="--target_board=native-extended-gdbserver" TESTS="gdb.threads/no-unwaited-for-left.exp"
>  ...
>  Running src/gdb/testsuite/gdb.threads/no-unwaited-for-left.exp ...
>  FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)
>  ... some more cascading FAILs ...
>
> gdb.log shows:
>
>  (gdb) continue
>  Continuing.
>  FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)
>
> A passing run would have resulted in:
>
>  (gdb) continue
>  Continuing.
>  No unwaited-for children left.
>  (gdb) PASS: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits
>
> Note how the leader thread is not listed in the remote-reported XML
> thread list below:
>
>  (gdb) set debug remote 1
>  (gdb) set debug infrun 1
>  (gdb) info threads
>    Id   Target Id                                Frame
>  * 1    Thread 1163850.1163850 "no-unwaited-for" main () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/no-unwaited-for-left.c:65
>    3    Thread 1163850.1164130 "no-unwaited-for" [remote] Sending packet: $Hgp11c24a.11c362#39
>  (gdb) c
>  Continuing.
>  [infrun] clear_proceed_status_thread: 1163850.1163850.0
>  ...
>      [infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=1, current thread [1163850.1163850.0] at 0x55555555534f
>      [remote] Sending packet: $QPassSignals:#f3
>      [remote] Packet received: OK
>      [remote] Sending packet: $QThreadOptions;3:p11c24a.11c24a#f3
>      [remote] Packet received: OK
>  ...
>      [infrun] target_set_thread_options: [options for Thread 1163850.1163850 are now 0x3]
>  ...
>    [infrun] do_target_resume: resume_ptid=1163850.1163850.0, step=0, sig=GDB_SIGNAL_0
>    [remote] Sending packet: $vCont;c:p11c24a.11c24a#98
>    [infrun] prepare_to_wait: prepare_to_wait
>    [infrun] reset: reason=handling event
>    [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target extended-remote
>    [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target extended-remote
>  [infrun] fetch_inferior_event: exit
>  [infrun] fetch_inferior_event: enter
>    [infrun] scoped_disable_commit_resumed: reason=handling event
>    [infrun] random_pending_event_thread: None found.
>    [remote] wait: enter
>      [remote] Packet received: N
>    [remote] wait: exit
>    [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
>    [infrun] print_target_wait_results:   -1.0.0 [process -1],
>    [infrun] print_target_wait_results:   status->kind = NO_RESUMED
>    [infrun] handle_inferior_event: status->kind = NO_RESUMED
>    [remote] Sending packet: $Hgp0.0#ad
>    [remote] Packet received: OK
>    [remote] Sending packet: $qXfer:threads:read::0,1000#92
>    [remote] Packet received: l<threads>\n<thread id="p11c24a.11c362" core="0" name="no-unwaited-for" handle="0097d8f7ff7f0000"/>\n</threads>\n
>    [infrun] handle_no_resumed: TARGET_WAITKIND_NO_RESUMED (ignoring: found resumed)
>  ...
>
> ... however, infrun decided there was a resumed thread still, so
> ignored the TARGET_WAITKIND_NO_RESUMED event.  Debugging GDB, we see
> that the "found resumed" thread that GDB finds, is the leader thread.
> Even though that thread is not on the remote-reported thread list, it
> is still on the GDB thread list, due to the special case in remote.c
> mentioned above.
>
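
The packets in the log above name threads as "pPID.TID" with both
fields in hexadecimal, while "info threads" prints them in decimal.
The small helper below converts one form to the other, which makes it
easier to see that p11c24a.11c362 is the non-leader thread.  It is a
minimal sketch: parse_thread_id is a hypothetical helper, not a GDB
function, and it handles only the plain "pPID.TID" form seen in this
log.

```cpp
#include <cassert>
#include <exception>
#include <optional>
#include <string>
#include <utility>

// Parse "pPID.TID" (both fields hexadecimal) into decimal (pid, tid).
// Returns std::nullopt if the string does not have that shape.
std::optional<std::pair<long, long>>
parse_thread_id (const std::string &s)
{
  if (s.size () < 4 || s[0] != 'p')
    return std::nullopt;

  std::size_t dot = s.find ('.');
  if (dot == std::string::npos || dot == 1 || dot + 1 == s.size ())
    return std::nullopt;

  std::string pid_s = s.substr (1, dot - 1);
  std::string tid_s = s.substr (dot + 1);
  try
    {
      std::size_t used_pid = 0, used_tid = 0;
      long pid = std::stol (pid_s, &used_pid, 16);
      long tid = std::stol (tid_s, &used_tid, 16);
      if (used_pid != pid_s.size () || used_tid != tid_s.size ())
        return std::nullopt;
      return std::make_pair (pid, tid);
    }
  catch (const std::exception &)
    {
      // Not a number, or out of range for long.
      return std::nullopt;
    }
}
```

Feeding it "p11c24a.11c362" gives pid 1163850 and tid 1164130, i.e.
thread 3 in the "info threads" output, so the XML thread list indeed
contains only the non-leader.
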
> This commit addresses the issue by fixing GDBserver to report a thread
> exit event for the zombie leader too, i.e., making GDBserver respect
> the "if thread has GDB_THREAD_OPTION_EXIT set, report a thread exit"
> invariant.  To do that, it takes a bit more code than one would
> imagine offhand, due to the fact that we currently always report LWP
> exit pending events as TARGET_WAITKIND_EXITED, and then decide whether
> to convert it to TARGET_WAITKIND_THREAD_EXITED just before reporting
> the event to GDBserver core.  For the zombie leader scenario
> described, we need to record early on that we want to report a
> THREAD_EXITED event, and then make sure that decision isn't lost along
> the way to reporting the event to GDBserver core.
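
The "record the decision early, and don't lose it" idea described
above can be sketched as follows.  This is a simplified model under
stated assumptions, not the gdbserver code: kind, model_status,
model_lwp, model_mark_dead and model_report are hypothetical names,
and only the exited/signalled cases are modeled.  A raw wait status
of 0 is used in the notes below because on Linux it encodes a normal
exit with code 0.

```cpp
#include <cassert>
#include <sys/wait.h>

// Hypothetical model types; not gdbserver's real declarations.
enum class kind { ignore, exited, thread_exited, signalled };

struct model_status
{
  kind k;
  int value;
};

struct model_lwp
{
  bool status_pending = false;
  int raw_wstat = 0;
  model_status recorded {kind::ignore, 0};
};

// Record the exit now, including whether it is to be reported as a
// thread exit rather than a whole-process exit.
void
model_mark_dead (model_lwp &lwp, int wstat, bool thread_event)
{
  lwp.status_pending = true;
  lwp.raw_wstat = wstat;
  if (WIFEXITED (wstat))
    lwp.recorded = {thread_event ? kind::thread_exited : kind::exited,
                    WEXITSTATUS (wstat)};
  else if (WIFSIGNALED (wstat))
    {
      assert (!thread_event);
      lwp.recorded = {kind::signalled, WTERMSIG (wstat)};
    }
}

// When reporting, prefer the decision recorded earlier; fall back to
// interpreting the raw status only if nothing was recorded.
model_status
model_report (const model_lwp &lwp)
{
  if (lwp.recorded.k != kind::ignore)
    return lwp.recorded;
  if (WIFEXITED (lwp.raw_wstat))
    return {kind::exited, WEXITSTATUS (lwp.raw_wstat)};
  return {kind::signalled, WTERMSIG (lwp.raw_wstat)};
}
```

In this model, marking a leader dead with thread_event set and then
reporting it yields thread_exited, while the same raw status without
the flag yields exited, which mirrors the distinction the patch needs
to preserve.
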

LGTM.

Reviewed-By: Andrew Burgess <aburgess@redhat.com>

Thanks,
Andrew

>
> Change-Id: I1e68fccdbc9534434dee07163d3fd19744c8403b
> ---
>  gdbserver/linux-low.cc | 75 ++++++++++++++++++++++++++++++++++++------
>  gdbserver/linux-low.h  |  5 +--
>  2 files changed, 68 insertions(+), 12 deletions(-)
>
> diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
> index bf6dc1d995a..c197846810c 100644
> --- a/gdbserver/linux-low.cc
> +++ b/gdbserver/linux-low.cc
> @@ -279,7 +279,8 @@ int using_threads = 1;
>  static int stabilizing_threads;
>  
>  static void unsuspend_all_lwps (struct lwp_info *except);
> -static void mark_lwp_dead (struct lwp_info *lwp, int wstat);
> +static void mark_lwp_dead (struct lwp_info *lwp, int wstat,
> +			   bool thread_event);
>  static int lwp_is_marked_dead (struct lwp_info *lwp);
>  static int kill_lwp (unsigned long lwpid, int signo);
>  static void enqueue_pending_signal (struct lwp_info *lwp, int signal, siginfo_t *info);
> @@ -1800,10 +1801,12 @@ iterate_over_lwps (ptid_t filter,
>    return get_thread_lwp (thread);
>  }
>  
> -void
> +bool
>  linux_process_target::check_zombie_leaders ()
>  {
> -  for_each_process ([this] (process_info *proc)
> +  bool new_pending_event = false;
> +
> +  for_each_process ([&] (process_info *proc)
>      {
>        pid_t leader_pid = pid_of (proc);
>        lwp_info *leader_lp = find_lwp_pid (ptid_t (leader_pid));
> @@ -1872,9 +1875,19 @@ linux_process_target::check_zombie_leaders ()
>  				"(it exited, or another thread execd), "
>  				"deleting it.",
>  				leader_pid);
> -	  delete_lwp (leader_lp);
> +
> +	  thread_info *leader_thread = get_lwp_thread (leader_lp);
> +	  if (report_exit_events_for (leader_thread))
> +	    {
> +	      mark_lwp_dead (leader_lp, W_EXITCODE (0, 0), true);
> +	      new_pending_event = true;
> +	    }
> +	  else
> +	    delete_lwp (leader_lp);
>  	}
>      });
> +
> +  return new_pending_event;
>  }
>  
>  /* Callback for `find_thread'.  Returns the first LWP that is not
> @@ -2333,7 +2346,7 @@ linux_process_target::filter_event (int lwpid, int wstat)
>  	  /* Since events are serialized to GDB core, and we can't
>  	     report this one right now.  Leave the status pending for
>  	     the next time we're able to report it.  */
> -	  mark_lwp_dead (child, wstat);
> +	  mark_lwp_dead (child, wstat, false);
>  	  return;
>  	}
>        else
> @@ -2646,7 +2659,8 @@ linux_process_target::wait_for_event_filtered (ptid_t wait_ptid,
>  
>        /* Check for zombie thread group leaders.  Those can't be reaped
>  	 until all other threads in the thread group are.  */
> -      check_zombie_leaders ();
> +      if (check_zombie_leaders ())
> +	goto retry;
>  
>        auto not_stopped = [&] (thread_info *thread)
>  	{
> @@ -2893,6 +2907,17 @@ linux_process_target::filter_exit_event (lwp_info *event_child,
>    struct thread_info *thread = get_lwp_thread (event_child);
>    ptid_t ptid = ptid_of (thread);
>  
> +  if (ourstatus->kind () == TARGET_WAITKIND_THREAD_EXITED)
> +    {
> +      /* We're reporting a thread exit for the leader.  The exit was
> +	 detected by check_zombie_leaders.  */
> +      gdb_assert (is_leader (thread));
> +      gdb_assert (report_exit_events_for (thread));
> +
> +      delete_lwp (event_child);
> +      return ptid;
> +    }
> +
>    /* Note we must filter TARGET_WAITKIND_SIGNALLED as well, otherwise
>       if a non-leader thread exits with a signal, we'd report it to the
>       core which would interpret it as the whole-process exiting.
> @@ -3012,7 +3037,20 @@ linux_process_target::wait_1 (ptid_t ptid, target_waitstatus *ourstatus,
>      {
>        if (WIFEXITED (w))
>  	{
> -	  ourstatus->set_exited (WEXITSTATUS (w));
> +	  /* If we already have the exit recorded in waitstatus, use
> +	     it.  This will happen when we detect a zombie leader,
> +	     when we had GDB_THREAD_OPTION_EXIT enabled for it.  We
> +	     want to report its exit as TARGET_WAITKIND_THREAD_EXITED,
> +	     as the whole process hasn't exited yet.  */
> +	  const target_waitstatus &ws = event_child->waitstatus;
> +	  if (ws.kind () != TARGET_WAITKIND_IGNORE)
> +	    {
> +	      gdb_assert (ws.kind () == TARGET_WAITKIND_EXITED
> +			  || ws.kind () == TARGET_WAITKIND_THREAD_EXITED);
> +	      *ourstatus = ws;
> +	    }
> +	  else
> +	    ourstatus->set_exited (WEXITSTATUS (w));
>  
>  	  threads_debug_printf
>  	    ("ret = %s, exited with retcode %d",
> @@ -3720,8 +3758,15 @@ suspend_and_send_sigstop (thread_info *thread, lwp_info *except)
>    send_sigstop (thread, except);
>  }
>  
> +/* Mark LWP dead, with WSTAT as exit status pending to report later.
> +   If THREAD_EVENT is true, interpret WSTAT as a thread exit event
> +   instead of a process exit event.  This is meaningful for the leader
> +   thread, as we normally report a process-wide exit event when we see
> +   the leader exit, and a thread exit event when we see any other
> +   thread exit.  */
> +
>  static void
> -mark_lwp_dead (struct lwp_info *lwp, int wstat)
> +mark_lwp_dead (struct lwp_info *lwp, int wstat, bool thread_event)
>  {
>    /* Store the exit status for later.  */
>    lwp->status_pending_p = 1;
> @@ -3730,9 +3775,19 @@ mark_lwp_dead (struct lwp_info *lwp, int wstat)
>    /* Store in waitstatus as well, as there's nothing else to process
>       for this event.  */
>    if (WIFEXITED (wstat))
> -    lwp->waitstatus.set_exited (WEXITSTATUS (wstat));
> +    {
> +      if (thread_event)
> +	lwp->waitstatus.set_thread_exited (WEXITSTATUS (wstat));
> +      else
> +	lwp->waitstatus.set_exited (WEXITSTATUS (wstat));
> +    }
>    else if (WIFSIGNALED (wstat))
> -    lwp->waitstatus.set_signalled (gdb_signal_from_host (WTERMSIG (wstat)));
> +    {
> +      gdb_assert (!thread_event);
> +      lwp->waitstatus.set_signalled (gdb_signal_from_host (WTERMSIG (wstat)));
> +    }
> +  else
> +    gdb_assert_not_reached ("unknown status kind");
>  
>    /* Prevent trying to stop it.  */
>    lwp->stopped = 1;
> diff --git a/gdbserver/linux-low.h b/gdbserver/linux-low.h
> index 33a14e15173..ffbc3c6f095 100644
> --- a/gdbserver/linux-low.h
> +++ b/gdbserver/linux-low.h
> @@ -574,8 +574,9 @@ class linux_process_target : public process_stratum_target
>  
>    /* Detect zombie thread group leaders, and "exit" them.  We can't
>       reap their exits until all other threads in the group have
> -     exited.  */
> -  void check_zombie_leaders ();
> +     exited.  Returns true if we left any new event pending, false
> +     otherwise.  */
> +  bool check_zombie_leaders ();
>  
>    /* Convenience function that is called when we're about to return an
>       event to the core.  If the event is an exit or signalled event,
> -- 
> 2.36.0
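
As a reading aid for the wait_for_event_filtered hunk, the control
flow "if the zombie-leader check left a new pending event, go back
and rescan" can be modeled like this.  It is a rough sketch under
stated assumptions: zl_lwp, model_check_zombie_leaders and model_wait
are hypothetical names, and the real function does considerably more
(waitpid, stop handling, filtering) than this loop.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of an LWP; not gdbserver's lwp_info.
struct zl_lwp
{
  int id;
  bool zombie_leader;   // leader that exited while other threads live
  bool report_exit;     // exit event was requested for this thread
  bool pending;         // an event is waiting to be reported
};

// Returns true if a new pending event was left behind.
bool
model_check_zombie_leaders (std::vector<zl_lwp> &lwps)
{
  bool new_pending = false;
  for (auto it = lwps.begin (); it != lwps.end ();)
    {
      if (it->zombie_leader && !it->pending)
        {
          if (it->report_exit)
            {
              it->pending = true;       // keep it until reported
              new_pending = true;
              ++it;
            }
          else
            it = lwps.erase (it);       // nobody asked; delete now
        }
      else
        ++it;
    }
  return new_pending;
}

// Returns the id of the LWP whose exit is reported, or -1 if there is
// nothing to report.
int
model_wait (std::vector<zl_lwp> &lwps)
{
  for (;;)
    {
      for (auto it = lwps.begin (); it != lwps.end (); ++it)
        if (it->pending)
          {
            int id = it->id;
            lwps.erase (it);            // deleted once reported
            return id;
          }

      if (!model_check_zombie_leaders (lwps))
        return -1;
      // A new event is pending: loop around and pick it up.
    }
}
```

With a zombie leader that has report_exit set, model_wait reports the
leader's id and only then removes it; without report_exit the leader
is silently deleted and nothing is reported, matching the behavior
before the patch.
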


