From: Andrew Burgess <aburgess@redhat.com>
To: Pedro Alves <pedro@palves.net>, gdb-patches@sourceware.org
Subject: Re: [PATCH 24/31] Report thread exit event for leader if reporting thread exit events
Date: Fri, 09 Jun 2023 14:11:01 +0100
Message-ID: <87r0qkye56.fsf@redhat.com>
In-Reply-To: <20221212203101.1034916-25-pedro@palves.net>
Pedro Alves <pedro@palves.net> writes:
> If GDB sets the GDB_THREAD_OPTION_EXIT option on a thread, then if the
> thread disappears from the thread list, GDB expects to shortly see a
> thread exit event for it. See e.g., here, in
> remote_target::update_thread_list():
>
> /* Do not remove the thread if we've requested to be
> notified of its exit. For example, the thread may be
> displaced stepping, infrun will need to handle the
> exit event, and displaced stepping info is recorded
> in the thread object. If we deleted the thread now,
> we'd lose that info. */
> if ((tp->thread_options () & GDB_THREAD_OPTION_EXIT) != 0)
> continue;
>
> There's one scenario, though, that deletes a thread from the
> remote/gdbserver thread list without ever reporting a corresponding
> thread exit event: check_zombie_leaders. This can lead to
> GDB getting confused. For example, with a following patch that
> enables GDB_THREAD_OPTION_EXIT whenever schedlock is enabled, we'd see
> this regression:
>
> $ make check RUNTESTFLAGS="--target_board=native-extended-gdbserver" TESTS="gdb.threads/no-unwaited-for-left.exp"
> ...
> Running src/gdb/testsuite/gdb.threads/no-unwaited-for-left.exp ...
> FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)
> ... some more cascading FAILs ...
>
> gdb.log shows:
>
> (gdb) continue
> Continuing.
> FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)
>
> A passing run would have resulted in:
>
> (gdb) continue
> Continuing.
> No unwaited-for children left.
> (gdb) PASS: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits
>
> Note how the leader thread is not listed in the remote-reported XML
> thread list below:
>
> (gdb) set debug remote 1
> (gdb) set debug infrun 1
> (gdb) info threads
> Id Target Id Frame
> * 1 Thread 1163850.1163850 "no-unwaited-for" main () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/no-unwaited-for-left.c:65
> 3 Thread 1163850.1164130 "no-unwaited-for" [remote] Sending packet: $Hgp11c24a.11c362#39
> (gdb) c
> Continuing.
> [infrun] clear_proceed_status_thread: 1163850.1163850.0
> ...
> [infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=1, current thread [1163850.1163850.0] at 0x55555555534f
> [remote] Sending packet: $QPassSignals:#f3
> [remote] Packet received: OK
> [remote] Sending packet: $QThreadOptions;3:p11c24a.11c24a#f3
> [remote] Packet received: OK
> ...
> [infrun] target_set_thread_options: [options for Thread 1163850.1163850 are now 0x3]
> ...
> [infrun] do_target_resume: resume_ptid=1163850.1163850.0, step=0, sig=GDB_SIGNAL_0
> [remote] Sending packet: $vCont;c:p11c24a.11c24a#98
> [infrun] prepare_to_wait: prepare_to_wait
> [infrun] reset: reason=handling event
> [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target extended-remote
> [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target extended-remote
> [infrun] fetch_inferior_event: exit
> [infrun] fetch_inferior_event: enter
> [infrun] scoped_disable_commit_resumed: reason=handling event
> [infrun] random_pending_event_thread: None found.
> [remote] wait: enter
> [remote] Packet received: N
> [remote] wait: exit
> [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
> [infrun] print_target_wait_results: -1.0.0 [process -1],
> [infrun] print_target_wait_results: status->kind = NO_RESUMED
> [infrun] handle_inferior_event: status->kind = NO_RESUMED
> [remote] Sending packet: $Hgp0.0#ad
> [remote] Packet received: OK
> [remote] Sending packet: $qXfer:threads:read::0,1000#92
> [remote] Packet received: l<threads>\n<thread id="p11c24a.11c362" core="0" name="no-unwaited-for" handle="0097d8f7ff7f0000"/>\n</threads>\n
> [infrun] handle_no_resumed: TARGET_WAITKIND_NO_RESUMED (ignoring: found resumed)
> ...
>
> ... however, infrun decided there was still a resumed thread, so it
> ignored the TARGET_WAITKIND_NO_RESUMED event. Debugging GDB, we see
> that the "found resumed" thread that GDB finds is the leader thread.
> Even though that thread is not on the remote-reported thread list, it
> is still on the GDB thread list, due to the special case in remote.c
> mentioned above.
>
> This commit addresses the issue by fixing GDBserver to report a thread
> exit event for the zombie leader too, i.e., making GDBserver respect
> the "if thread has GDB_THREAD_OPTION_EXIT set, report a thread exit"
> invariant. To do that, it takes a bit more code than one would
> imagine offhand, due to the fact that we currently always report LWP
> exit pending events as TARGET_WAITKIND_EXITED, and then decide whether
> to convert it to TARGET_WAITKIND_THREAD_EXITED just before reporting
> the event to GDBserver core. For the zombie leader scenario
> described, we need to record early on that we want to report a
> THREAD_EXITED event, and then make sure that decision isn't lost along
> the way to reporting the event to GDBserver core.
LGTM.
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Thanks,
Andrew
>
> Change-Id: I1e68fccdbc9534434dee07163d3fd19744c8403b
> ---
> gdbserver/linux-low.cc | 75 ++++++++++++++++++++++++++++++++++++------
> gdbserver/linux-low.h | 5 +--
> 2 files changed, 68 insertions(+), 12 deletions(-)
>
> diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
> index bf6dc1d995a..c197846810c 100644
> --- a/gdbserver/linux-low.cc
> +++ b/gdbserver/linux-low.cc
> @@ -279,7 +279,8 @@ int using_threads = 1;
> static int stabilizing_threads;
>
> static void unsuspend_all_lwps (struct lwp_info *except);
> -static void mark_lwp_dead (struct lwp_info *lwp, int wstat);
> +static void mark_lwp_dead (struct lwp_info *lwp, int wstat,
> + bool thread_event);
> static int lwp_is_marked_dead (struct lwp_info *lwp);
> static int kill_lwp (unsigned long lwpid, int signo);
> static void enqueue_pending_signal (struct lwp_info *lwp, int signal, siginfo_t *info);
> @@ -1800,10 +1801,12 @@ iterate_over_lwps (ptid_t filter,
> return get_thread_lwp (thread);
> }
>
> -void
> +bool
> linux_process_target::check_zombie_leaders ()
> {
> - for_each_process ([this] (process_info *proc)
> + bool new_pending_event = false;
> +
> + for_each_process ([&] (process_info *proc)
> {
> pid_t leader_pid = pid_of (proc);
> lwp_info *leader_lp = find_lwp_pid (ptid_t (leader_pid));
> @@ -1872,9 +1875,19 @@ linux_process_target::check_zombie_leaders ()
> "(it exited, or another thread execd), "
> "deleting it.",
> leader_pid);
> - delete_lwp (leader_lp);
> +
> + thread_info *leader_thread = get_lwp_thread (leader_lp);
> + if (report_exit_events_for (leader_thread))
> + {
> + mark_lwp_dead (leader_lp, W_EXITCODE (0, 0), true);
> + new_pending_event = true;
> + }
> + else
> + delete_lwp (leader_lp);
> }
> });
> +
> + return new_pending_event;
> }
>
> /* Callback for `find_thread'. Returns the first LWP that is not
> @@ -2333,7 +2346,7 @@ linux_process_target::filter_event (int lwpid, int wstat)
> /* Since events are serialized to GDB core, and we can't
> report this one right now. Leave the status pending for
> the next time we're able to report it. */
> - mark_lwp_dead (child, wstat);
> + mark_lwp_dead (child, wstat, false);
> return;
> }
> else
> @@ -2646,7 +2659,8 @@ linux_process_target::wait_for_event_filtered (ptid_t wait_ptid,
>
> /* Check for zombie thread group leaders. Those can't be reaped
> until all other threads in the thread group are. */
> - check_zombie_leaders ();
> + if (check_zombie_leaders ())
> + goto retry;
>
> auto not_stopped = [&] (thread_info *thread)
> {
> @@ -2893,6 +2907,17 @@ linux_process_target::filter_exit_event (lwp_info *event_child,
> struct thread_info *thread = get_lwp_thread (event_child);
> ptid_t ptid = ptid_of (thread);
>
> + if (ourstatus->kind () == TARGET_WAITKIND_THREAD_EXITED)
> + {
> + /* We're reporting a thread exit for the leader. The exit was
> + detected by check_zombie_leaders. */
> + gdb_assert (is_leader (thread));
> + gdb_assert (report_exit_events_for (thread));
> +
> + delete_lwp (event_child);
> + return ptid;
> + }
> +
> /* Note we must filter TARGET_WAITKIND_SIGNALLED as well, otherwise
> if a non-leader thread exits with a signal, we'd report it to the
> core which would interpret it as the whole-process exiting.
> @@ -3012,7 +3037,20 @@ linux_process_target::wait_1 (ptid_t ptid, target_waitstatus *ourstatus,
> {
> if (WIFEXITED (w))
> {
> - ourstatus->set_exited (WEXITSTATUS (w));
> + /* If we already have the exit recorded in waitstatus, use
> + it. This will happen when we detect a zombie leader,
> + when we had GDB_THREAD_OPTION_EXIT enabled for it. We
> + want to report its exit as TARGET_WAITKIND_THREAD_EXITED,
> + as the whole process hasn't exited yet. */
> + const target_waitstatus &ws = event_child->waitstatus;
> + if (ws.kind () != TARGET_WAITKIND_IGNORE)
> + {
> + gdb_assert (ws.kind () == TARGET_WAITKIND_EXITED
> + || ws.kind () == TARGET_WAITKIND_THREAD_EXITED);
> + *ourstatus = ws;
> + }
> + else
> + ourstatus->set_exited (WEXITSTATUS (w));
>
> threads_debug_printf
> ("ret = %s, exited with retcode %d",
> @@ -3720,8 +3758,15 @@ suspend_and_send_sigstop (thread_info *thread, lwp_info *except)
> send_sigstop (thread, except);
> }
>
> +/* Mark LWP dead, with WSTAT as exit status pending to report later.
> + If THREAD_EVENT is true, interpret WSTAT as a thread exit event
> + instead of a process exit event. This is meaningful for the leader
> + thread, as we normally report a process-wide exit event when we see
> + the leader exit, and a thread exit event when we see any other
> + thread exit. */
> +
> static void
> -mark_lwp_dead (struct lwp_info *lwp, int wstat)
> +mark_lwp_dead (struct lwp_info *lwp, int wstat, bool thread_event)
> {
> /* Store the exit status for later. */
> lwp->status_pending_p = 1;
> @@ -3730,9 +3775,19 @@ mark_lwp_dead (struct lwp_info *lwp, int wstat)
> /* Store in waitstatus as well, as there's nothing else to process
> for this event. */
> if (WIFEXITED (wstat))
> - lwp->waitstatus.set_exited (WEXITSTATUS (wstat));
> + {
> + if (thread_event)
> + lwp->waitstatus.set_thread_exited (WEXITSTATUS (wstat));
> + else
> + lwp->waitstatus.set_exited (WEXITSTATUS (wstat));
> + }
> else if (WIFSIGNALED (wstat))
> - lwp->waitstatus.set_signalled (gdb_signal_from_host (WTERMSIG (wstat)));
> + {
> + gdb_assert (!thread_event);
> + lwp->waitstatus.set_signalled (gdb_signal_from_host (WTERMSIG (wstat)));
> + }
> + else
> + gdb_assert_not_reached ("unknown status kind");
>
> /* Prevent trying to stop it. */
> lwp->stopped = 1;
> diff --git a/gdbserver/linux-low.h b/gdbserver/linux-low.h
> index 33a14e15173..ffbc3c6f095 100644
> --- a/gdbserver/linux-low.h
> +++ b/gdbserver/linux-low.h
> @@ -574,8 +574,9 @@ class linux_process_target : public process_stratum_target
>
> /* Detect zombie thread group leaders, and "exit" them. We can't
> reap their exits until all other threads in the group have
> - exited. */
> - void check_zombie_leaders ();
> + exited. Returns true if we left any new event pending, false
> + otherwise. */
> + bool check_zombie_leaders ();
>
> /* Convenience function that is called when we're about to return an
> event to the core. If the event is an exit or signalled event,
> --
> 2.36.0
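
For readers following the thread, the control flow of the patch above can be
modelled in isolation. The following is a minimal, self-contained C++ sketch,
not the gdbserver code: lwp_model, wait_kind, mark_dead, handle_zombie_leader
and report_kind are hypothetical stand-ins for lwp_info, target_waitkind,
mark_lwp_dead, the per-leader body of check_zombie_leaders, and the WIFEXITED
branch of wait_1. It only illustrates the idea of recording the
"thread exit" decision early and not losing it before the event is reported.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the target_waitkind values
// that matter here.
enum class wait_kind { ignore, exited, thread_exited };

// Hypothetical, simplified stand-in for lwp_info.
struct lwp_model
{
  bool report_exit_events = false;      // models GDB_THREAD_OPTION_EXIT
  bool status_pending = false;          // models status_pending_p
  wait_kind pending = wait_kind::ignore; // models lwp->waitstatus
  int exit_code = 0;
  bool deleted = false;                 // models delete_lwp having run
};

// Models mark_lwp_dead with the new THREAD_EVENT parameter: the kind
// of exit to report is decided here, when the status is left pending.
void
mark_dead (lwp_model &lwp, int exit_code, bool thread_event)
{
  lwp.status_pending = true;
  lwp.exit_code = exit_code;
  lwp.pending = (thread_event
                 ? wait_kind::thread_exited
                 : wait_kind::exited);
}

// Models what check_zombie_leaders does for one zombie leader.
// Returns true if a new event was left pending, in which case the
// caller should look for pending events again.
bool
handle_zombie_leader (lwp_model &leader)
{
  if (leader.report_exit_events)
    {
      mark_dead (leader, 0, true);
      return true;
    }

  // Nobody asked for an exit event, so the LWP is silently dropped.
  leader.deleted = true;
  return false;
}

// Models the WIFEXITED branch of wait_1: if a kind was already
// recorded, reuse it rather than unconditionally reporting a
// process-wide exit.
wait_kind
report_kind (const lwp_model &lwp)
{
  if (lwp.pending != wait_kind::ignore)
    return lwp.pending;
  return wait_kind::exited;
}
```

In this model, a leader with report_exit_events set ends up with a pending
thread_exited status and is not deleted until that event is reported, while a
leader without the option is deleted immediately, as before the patch.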