From: Andrew Burgess <aburgess@redhat.com>
To: Pedro Alves <pedro@palves.net>, gdb-patches@sourceware.org
Subject: Re: [PATCH 21/31] stop_all_threads: (re-)enable async before waiting for stops
Date: Thu, 08 Jun 2023 16:49:34 +0100
Message-ID: <87a5xaymwh.fsf@redhat.com>
In-Reply-To: <20221212203101.1034916-22-pedro@palves.net>

Pedro Alves <pedro@palves.net> writes:

> Running the
> gdb.threads/step-over-thread-exit-while-stop-all-threads.exp testcase
> added later in the series against gdbserver, after the
> TARGET_WAITKIND_NO_RESUMED fix from the following patch, would run
> into an infinite loop in stop_all_threads, leading to a timeout:
>
> FAIL: gdb.threads/step-over-thread-exit-while-stop-all-threads.exp: displaced-stepping=off: target-non-stop=on: iter 0: continue (timeout)
>
> This is really a latent bug, and it is about the fact that
> stop_all_threads stops listening to events from a target as soon as it
> sees a TARGET_WAITKIND_NO_RESUMED, ignoring that
> TARGET_WAITKIND_NO_RESUMED may be delayed. handle_no_resumed knows
> how to handle delayed no-resumed events, but stop_all_threads was
> never taught to.
>
> In more detail, here's what happens with that testcase:
>
> #1 - Multiple threads report breakpoint hits to gdb.
>
> #2 - gdb picks one event, and it's for thread 1. All other stops are
> left pending. thread 1 needs to move past a breakpoint, so gdb
> stops all threads to start an inline step over for thread 1.
> While stopping threads, some of the threads that were still
> running report events that are also left pending.
>
> #2 - gdb steps thread 1
>
> #3 - Thread 1 exits while stepping (it steps over an exit syscall),
> gdbserver reports thread exit for thread 1
>
> #4 - Thread 1 was the last resumed thread, so gdbserver also reports
> no-resumed:
>
> [remote] Notification received: Stop:w0;p3445d0.3445d3
> [remote] Sending packet: $vStopped#55
> [remote] Packet received: N
> [remote] Sending packet: $vStopped#55
> [remote] Packet received: OK
>
> #5 - gdb processes the thread exit for thread 1, finishes the step
> over and restarts threads.
>
> #6 - gdb picks the next event to process out of one of the resumed
> threads with pending events:
>
> [infrun] random_resumed_with_pending_wait_status: Found 32 events, selecting #11
>
> #7 - This is again a breakpoint hit and the breakpoint needs to be
> stepped over too, so gdb starts a step-over dance again.
>
> #8 - We reach stop_all_threads, which finds that some threads need to
> be stopped.
>
> #9 - wait_one finally consumes the no-resumed event queued by #4.
> Seeing this, wait_one disables target async, to stop listening for
> events out of the remote target.
>
> #10 - We still haven't seen all the stops expected, so
> stop_all_threads tries another iteration.
>
> #11 - Because the remote target is no longer async, and there are no
> other targets, wait_one returns no-resumed immediately without
> polling the remote target.
>
> #12 - We still haven't seen all the stops expected, so
> stop_all_threads tries another iteration. goto #11, looping
> forever.
>
> Fix this by explicitly enabling/re-enabling target async on targets
> that can async, before waiting for stops.
>
> Change-Id: Ie3ffb0df89635585a6631aa842689cecc989e33f
> ---
> gdb/infrun.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 81 insertions(+)
>
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 2866962d2dc..31321d758da 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -5011,6 +5011,8 @@ wait_one ()
> if (nfds == 0)
> {
> /* No waitable targets left. All must be stopped. */
> + infrun_debug_printf ("no waitable targets left");
> +
> target_waitstatus ws;
> ws.set_no_resumed ();
> return {nullptr, minus_one_ptid, std::move (ws)};
> @@ -5269,6 +5271,83 @@ handle_one (const wait_one_event &event)
> return false;
> }
>
> +/* Helper for stop_all_threads. wait_one waits for events until it
> + sees a TARGET_WAITKIND_NO_RESUMED event. When it sees one, it
> + disables target_async for the target to stop waiting for events
> + from it. TARGET_WAITKIND_NO_RESUMED can be delayed though,
> + consider, debugging against gdbserver:
> +
> + #1 - Threads 1-5 are running, and thread 1 hits a breakpoint.
> +
> + #2 - gdb processes the breakpoint hit for thread 1, stops all
> + threads, and steps thread 1 over the breakpoint. While
> + stopping threads, some other threads reported interesting
> + events, which were left pending in the thread's objects
> + (infrun's queue).
> +
> + #2 - Thread 1 exits (it stepped an exit syscall), and gdbserver
> + reports the thread exit for thread 1. The event ends up in
> + remote's stop reply queue.
> +
> + #3 - That was the last resumed thread, so gdbserver reports
> + no-resumed, and that event also ends up in remote's stop
> + reply queue, queued after the thread exit from #2.
> +
> + #4 - gdb processes the thread exit event, which finishes the
> + step-over, and so gdb restarts all threads (threads with
> + pending events are left marked resumed, but aren't set
> + executing). The no-resumed event is still left pending in
> + the remote stop reply queue.
> +
> + #5 - Since there are now resumed threads with pending breakpoint
> + hits, gdb picks one at random to process next.
> +
> + #5 - gdb picks the breakpoint hit for thread 2 this time, and that
> + breakpoint also needs to be stepped over, so gdb stops all
> + threads again.
> +
> + #6 - stop_all_threads counts number of expected stops and calls
> + wait_one once for each.
> +
> + #7 - The first wait_one call collects the no-resumed event from #3
> + above.
> +
> + #9 - Seeing the no-resumed event, wait_one disables target async
> + for the remote target, to stop waiting for events from it.
> + wait_one from here on always returns no-resumed directly
> + without reaching the target.
> +
> + #10 - stop_all_threads still hasn't seen all the stops it expects,
> + so it does another pass.
> +
> + #11 - Since the remote target is not async (disabled in #9),
> + wait_one doesn't wait on it, so it won't see the expected
> + stops, and instead returns no-resumed directly.
> +
> + #12 - stop_all_threads still hasn't seen all the stops, so it
> + does another pass. goto #b, looping forever.
s/#b/#11/

Otherwise, LGTM.

Reviewed-By: Andrew Burgess <aburgess@redhat.com>

Thanks,
Andrew

> +
> + To handle this, we explicitly (re-)enable target async on all
> + targets that can async every time stop_all_threads goes to wait for
> + the expected stops. */
> +
> +static void
> +reenable_target_async ()
> +{
> + for (inferior *inf : all_inferiors ())
> + {
> + process_stratum_target *target = inf->process_target ();
> + if (target != nullptr
> + && target->threads_executing
> + && target->can_async_p ()
> + && !target->is_async_p ())
> + {
> + switch_to_inferior_no_thread (inf);
> + target_async (1);
> + }
> + }
> +}
> +
> /* See infrun.h. */
>
> void
> @@ -5395,6 +5474,8 @@ stop_all_threads (const char *reason, inferior *inf)
> if (pass > 0)
> pass = -1;
>
> + reenable_target_async ();
> +
> for (int i = 0; i < waits_needed; i++)
> {
> wait_one_event event = wait_one ();
> --
> 2.36.0
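
For anyone following the thread, the loop described in the commit
message can be modelled with a small standalone toy. This is only a
sketch under stated assumptions: toy_event, toy_target, toy_wait_one,
toy_stop_all_threads and toy_delayed_no_resumed_scenario are all
hypothetical names, not GDB's real types or functions. The toy also
ignores the threads_executing and can_async_p checks that
reenable_target_async performs, and it adds a max_passes bound so the
buggy case terminates instead of hanging.

```cpp
#include <cassert>
#include <deque>

/* Toy model only: none of these types are GDB's real ones.  */
enum class toy_event { stopped, no_resumed };

struct toy_target
{
  bool async = true;            /* Are we listening to this target?  */
  std::deque<toy_event> queue;  /* Events the target already queued.  */
};

/* Mimics the wait_one behaviour described in the commit message: a
   non-async target is never polled, and consuming a no-resumed event
   turns async off.  An empty queue is also reported as no-resumed,
   which is a simplification (the real code would block).  */
static toy_event
toy_wait_one (toy_target &t)
{
  if (!t.async || t.queue.empty ())
    return toy_event::no_resumed;

  toy_event ev = t.queue.front ();
  t.queue.pop_front ();
  if (ev == toy_event::no_resumed)
    t.async = false;
  return ev;
}

/* Returns the number of passes needed to see EXPECTED_STOPS stops, or
   -1 if MAX_PASSES passes were not enough (the "looping forever"
   case).  REENABLE selects whether async is turned back on before
   each round of waits, standing in for reenable_target_async.  */
static int
toy_stop_all_threads (toy_target &t, int expected_stops, bool reenable,
		      int max_passes)
{
  assert (max_passes > 0);

  int seen = 0;
  for (int pass = 1; pass <= max_passes; pass++)
    {
      if (reenable)
	t.async = true;

      int waits_needed = expected_stops - seen;
      for (int i = 0; i < waits_needed; i++)
	{
	  if (toy_wait_one (t) == toy_event::no_resumed)
	    break;
	  seen++;
	}

      if (seen == expected_stops)
	return pass;
    }
  return -1;
}

/* A stale no-resumed event queued ahead of the two stops that
   stop_all_threads is actually waiting for.  */
static toy_target
toy_delayed_no_resumed_scenario ()
{
  toy_target t;
  t.queue = {toy_event::no_resumed, toy_event::stopped, toy_event::stopped};
  return t;
}
```

Calling toy_stop_all_threads on that scenario with reenable set to
false never sees the two stops, because the first wait consumes the
stale no-resumed event and every later wait skips the target. With
reenable set to true, the first pass is still wasted on the stale
event, but the second pass polls the target again and collects both
stops.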