public inbox for gdb-patches@sourceware.org
From: Andrew Burgess <aburgess@redhat.com>
To: Pedro Alves <pedro@palves.net>, gdb-patches@sourceware.org
Subject: Re: [PATCH 21/31] stop_all_threads: (re-)enable async before waiting for stops
Date: Thu, 08 Jun 2023 16:49:34 +0100
Message-ID: <87a5xaymwh.fsf@redhat.com>
In-Reply-To: <20221212203101.1034916-22-pedro@palves.net>

Pedro Alves <pedro@palves.net> writes:

> Running the
> gdb.threads/step-over-thread-exit-while-stop-all-threads.exp testcase
> added later in the series against gdbserver, after the
> TARGET_WAITKIND_NO_RESUMED fix from the following patch, would run
> into an infinite loop in stop_all_threads, leading to a timeout:
>
>   FAIL: gdb.threads/step-over-thread-exit-while-stop-all-threads.exp: displaced-stepping=off: target-non-stop=on: iter 0: continue (timeout)
>
> This is really a latent bug, and it is about the fact that
> stop_all_threads stops listening to events from a target as soon as it
> sees a TARGET_WAITKIND_NO_RESUMED, ignoring that
> TARGET_WAITKIND_NO_RESUMED may be delayed.  handle_no_resumed knows
> how to handle delayed no-resumed events, but stop_all_threads was
> never taught to.
>
> In more detail, here's what happens with that testcase:
>
> #1 - Multiple threads report breakpoint hits to gdb.
>
> #2 - gdb picks one event, and it's for thread 1.  All other stops are
>      left pending.  thread 1 needs to move past a breakpoint, so gdb
>      stops all threads to start an inline step over for thread 1.
>      While stopping threads, some of the threads that were still
>      running report events that are also left pending.
>
> #2 - gdb steps thread 1
>
> #3 - Thread 1 exits while stepping (it steps over an exit syscall), and
>      gdbserver reports a thread exit for thread 1
>
> #4 - Thread 1 was the last resumed thread, so gdbserver also reports
>      no-resumed:
>
>     [remote]   Notification received: Stop:w0;p3445d0.3445d3
>     [remote] Sending packet: $vStopped#55
>     [remote] Packet received: N
>     [remote] Sending packet: $vStopped#55
>     [remote] Packet received: OK
>
> #5 - gdb processes the thread exit for thread 1, finishes the step
>      over and restarts threads.
>
> #6 - gdb picks the next event to process out of one of the resumed
>      threads with pending events:
>
>     [infrun] random_resumed_with_pending_wait_status: Found 32 events, selecting #11
>
> #7 - This is again a breakpoint hit and the breakpoint needs to be
>      stepped over too, so gdb starts a step-over dance again.
>
> #8 - We reach stop_all_threads, which finds that some threads need to
>      be stopped.
>
> #9 - wait_one finally consumes the no-resumed event queued by #4.
>      Seeing this, wait_one disables target async, to stop listening for
>      events out of the remote target.
>
> #10 - We still haven't seen all the stops expected, so
>       stop_all_threads tries another iteration.
>
> #11 - Because the remote target is no longer async, and there are no
>       other targets, wait_one returns no-resumed immediately without
>       polling the remote target.
>
> #12 - We still haven't seen all the stops expected, so
>       stop_all_threads tries another iteration.  goto #11, looping
>       forever.
>
> Fix this by explicitly enabling/re-enabling target async on targets
> that can async, before waiting for stops.
>
> Change-Id: Ie3ffb0df89635585a6631aa842689cecc989e33f
> ---
>  gdb/infrun.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 81 insertions(+)
>
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 2866962d2dc..31321d758da 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -5011,6 +5011,8 @@ wait_one ()
>        if (nfds == 0)
>  	{
>  	  /* No waitable targets left.  All must be stopped.  */
> +	  infrun_debug_printf ("no waitable targets left");
> +
>  	  target_waitstatus ws;
>  	  ws.set_no_resumed ();
>  	  return {nullptr, minus_one_ptid, std::move (ws)};
> @@ -5269,6 +5271,83 @@ handle_one (const wait_one_event &event)
>    return false;
>  }
>  
> +/* Helper for stop_all_threads.  wait_one waits for events until it
> +   sees a TARGET_WAITKIND_NO_RESUMED event.  When it sees one, it
> +   disables target_async for the target to stop waiting for events
> +   from it.  TARGET_WAITKIND_NO_RESUMED can be delayed though,
> +   consider, debugging against gdbserver:
> +
> +    #1 - Threads 1-5 are running, and thread 1 hits a breakpoint.
> +
> +    #2 - gdb processes the breakpoint hit for thread 1, stops all
> +	 threads, and steps thread 1 over the breakpoint.  while
> +	 stopping threads, some other threads reported interesting
> +	 events, which were left pending in the thread's objects
> +	 (infrun's queue).
> +
> +    #2 - Thread 1 exits (it stepped an exit syscall), and gdbserver
> +	 reports the thread exit for thread 1.	The event ends up in
> +	 remote's stop reply queue.
> +
> +    #3 - That was the last resumed thread, so gdbserver reports
> +	 no-resumed, and that event also ends up in remote's stop
> +	 reply queue, queued after the thread exit from #2.
> +
> +    #4 - gdb processes the thread exit event, which finishes the
> +	 step-over, and so gdb restarts all threads (threads with
> +	 pending events are left marked resumed, but aren't set
> +	 executing).  The no-resumed event is still left pending in
> +	 the remote stop reply queue.
> +
> +    #5 - Since there are now resumed threads with pending breakpoint
> +	 hits, gdb picks one at random to process next.
> +
> +    #5 - gdb picks the breakpoint hit for thread 2 this time, and that
> +	 breakpoint also needs to be stepped over, so gdb stops all
> +	 threads again.
> +
> +    #6 - stop_all_threads counts number of expected stops and calls
> +	 wait_one once for each.
> +
> +    #7 - The first wait_one call collects the no-resumed event from #3
> +	 above.
> +
> +    #9 - Seeing the no-resumed event, wait_one disables target async
> +	 for the remote target, to stop waiting for events from it.
> +	 wait_one from here on always returns no-resumed directly
> +	 without reaching the target.
> +
> +    #10 - stop_all_threads still hasn't seen all the stops it expects,
> +	  so it does another pass.
> +
> +    #11 - Since the remote target is not async (disabled in #9),
> +	  wait_one doesn't wait on it, so it won't see the expected
> +	  stops, and instead returns no-resumed directly.
> +
> +    #12 - stop_all_threads still hasn't seen all the stops, so it
> +	  does another pass.  goto #b, looping forever.

s/#b/#11/

Otherwise, LGTM.

Reviewed-By: Andrew Burgess <aburgess@redhat.com>

Thanks,
Andrew

> +
> +   To handle this, we explicitly (re-)enable target async on all
> +   targets that can async every time stop_all_threads goes to wait for
> +   the expected stops.  */
> +
> +static void
> +reenable_target_async ()
> +{
> +  for (inferior *inf : all_inferiors ())
> +    {
> +      process_stratum_target *target = inf->process_target ();
> +      if (target != nullptr
> +	  && target->threads_executing
> +	  && target->can_async_p ()
> +	  && !target->is_async_p ())
> +	{
> +	  switch_to_inferior_no_thread (inf);
> +	  target_async (1);
> +	}
> +    }
> +}
> +
>  /* See infrun.h.  */
>  
>  void
> @@ -5395,6 +5474,8 @@ stop_all_threads (const char *reason, inferior *inf)
>  	  if (pass > 0)
>  	    pass = -1;
>  
> +	  reenable_target_async ();
> +
>  	  for (int i = 0; i < waits_needed; i++)
>  	    {
>  	      wait_one_event event = wait_one ();
> -- 
> 2.36.0
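
For anyone reading this in the archive, here is a toy model of the
stop_all_threads / wait_one interaction described in the commit message.
It is only a sketch, not GDB code: toy_target, toy_wait_one and
toy_stop_all_threads are hypothetical stand-ins, the stop reply queue is
pre-filled rather than fed by a remote, and the pass cap exists only so
the unfixed case terminates instead of spinning.

```cpp
#include <cassert>
#include <deque>

// Toy model only; none of these names exist in GDB.
enum class event_kind { stopped, no_resumed };

struct toy_target
{
  bool async = true;              // stands in for target->is_async_p ()
  std::deque<event_kind> queue;   // stands in for remote's stop reply queue
};

// Stand-in for wait_one.  The target is polled only while it is async.
// An empty queue is treated like no-resumed to keep the toy non-blocking.
event_kind
toy_wait_one (toy_target &t)
{
  if (!t.async || t.queue.empty ())
    return event_kind::no_resumed;    // "no waitable targets left"

  event_kind ev = t.queue.front ();
  t.queue.pop_front ();
  if (ev == event_kind::no_resumed)
    t.async = false;                  // stop listening to this target
  return ev;
}

// Stand-in for the stop_all_threads loop.  Returns the number of passes
// needed to see EXPECTED stops, or -1 once MAX_PASSES is exhausted (the
// loop being modelled has no cap and would keep iterating).
int
toy_stop_all_threads (toy_target &t, int expected, bool reenable,
                      int max_passes)
{
  int seen = 0;
  for (int pass = 1; pass <= max_passes; pass++)
    {
      if (reenable)
        t.async = true;               // the idea behind reenable_target_async

      int waits_needed = expected - seen;
      for (int i = 0; i < waits_needed; i++)
        if (toy_wait_one (t) == event_kind::stopped)
          seen++;

      if (seen == expected)
        return pass;
    }
  return -1;
}
```

Feeding it a queue holding a stale no-resumed followed by two stops, the
variant without re-enabling never collects the stops, while the variant
that re-enables async at the top of each pass collects both on the second
pass.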

