From: Andrew Burgess <aburgess@redhat.com>
To: Pedro Alves <pedro@palves.net>, gdb-patches@sourceware.org
Subject: Re: [PATCH 25/31] Ignore failure to read PC when resuming
Date: Sat, 10 Jun 2023 11:33:14 +0100
Message-ID: <87ilbvy5cl.fsf@redhat.com>
In-Reply-To: <20221212203101.1034916-26-pedro@palves.net>

Pedro Alves <pedro@palves.net> writes:

> If GDB sets a GDB_THREAD_OPTION_EXIT option on a thread, and the
> thread exits, the server reports the corresponding thread exit event,
> and forgets about the thread, i.e., removes the exited thread from its
> thread list.
>
> On the GDB side, if GDB set the GDB_THREAD_OPTION_EXIT option on a
> thread, GDB delays deleting the thread from its thread list until it
> sees the corresponding thread exit event, as that event needs special
> handling in infrun.
>
> When a thread disappears from the target, but it still exists on GDB's
> thread list, in all-stop RSP mode, it can happen that GDB ends up
> trying to resume such an already-exited-thread that GDB doesn't yet
> know is gone.  When that happens, against GDBserver, typically the
> ongoing execution command fails with this error:

I'm slightly confused here.  If GDB doesn't know the thread has exited,
doesn't that mean the server hasn't yet reported the exit, and so should
still be holding onto the thread?

I wanted to investigate this a bit more to try and understand more about
what's going on, but I couldn't find a test that was triggering the code
added in this patch.  Do you know if there's a test I can run to see
this issue?

>
>  ...
>  PC register is not available
>  (gdb)
>
> At the remote protocol level, we may see e.g., this:
>
>       [remote] Packet received: w0;p97479.978d2
>     [remote] wait: exit
>     [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
>     [infrun] print_target_wait_results:   619641.620754.0 [Thread 619641.620754],
>     [infrun] print_target_wait_results:   status->kind = THREAD_EXITED, exit_status = 0
>     [infrun] handle_inferior_event: status->kind = THREAD_EXITED, exit_status = 0
>     [infrun] context_switch: Switching context from 0.0.0 to 619641.620754.0
>     [infrun] clear_proceed_status_thread: 619641.620754.0
>
> GDB saw an exit event for thread 619641.620754.  After processing it,
> infrun decides to re-resume the target again.  To do that, infrun
> picks some other thread that isn't exited yet from GDB's perspective,
> switches to it, and calls keep_going.  Below, infrun happens to pick
> thread p97479.97479, the leader, which also exited, but GDB doesn't
> know yet:
>
> ...
>     [remote] Sending packet: $Hgp97479.97479#75
>     [remote] Packet received: OK
>     [remote] Sending packet: $g#67
>     [remote] Packet received: xxxxxxxxxxxxxxxxx (...snip...) [1120 bytes omitted]
>     [infrun] reset: reason=handling event
>     [infrun] maybe_set_commit_resumed_all_targets: not requesting commit-resumed for target remote, no resumed threads
>   [infrun] fetch_inferior_event: exit
>   PC register is not available
>   (gdb)
>
> The Linux backends, both in GDB and in GDBserver, already silently
> ignore failures to resume, with the understanding that we'll see an
> exit event soon.  Core of GDB doesn't do that yet, though.
>
> This patch is a small step in that direction.  It swallows the error
> when thrown from within resume_1.  There are likely other spots where we
> will need similar treatment, but we can tackle them as we find them.
>
> After this patch, we'll see something like this instead:
>
>     [infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [640478.640478.0] at 0x0
>     [infrun] do_target_resume: resume_ptid=640478.0.0, step=0, sig=GDB_SIGNAL_0
>     [remote] Sending packet: $vCont;c:p9c5de.-1#78

I'm confused by this example.  I would have expected it to start off with
the same intro as the above, that is, send the '$g#67' packet, get back
the xxxx...etc... but then do things differently.

>     [infrun] prepare_to_wait: prepare_to_wait
>     [infrun] reset: reason=handling event
>     [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target remote
>     [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target remote
>   [infrun] fetch_inferior_event: exit
>   [infrun] fetch_inferior_event: enter
>     [infrun] scoped_disable_commit_resumed: reason=handling event
>     [infrun] random_pending_event_thread: None found.
>     [remote] wait: enter
>       [remote] Packet received: W0;process:9c5de
>     [remote] wait: exit
>     [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
>     [infrun] print_target_wait_results:   640478.0.0 [process 640478],
>     [infrun] print_target_wait_results:   status->kind = EXITED, exit_status = 0
>     [infrun] handle_inferior_event: status->kind = EXITED, exit_status = 0
>   [Inferior 1 (process 640478) exited normally]
>     [infrun] stop_waiting: stop_waiting
>     [infrun] reset: reason=handling event
>   (gdb) [infrun] fetch_inferior_event: exit
>
> Change-Id: I7f1c7610923435c4e98e70acc5ebe5ebbac581e2
> ---
>  gdb/infrun.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 09391d85256..21e5aa0f50e 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -2595,7 +2595,28 @@ resume_1 (enum gdb_signal sig)
>        step = false;
>      }
>  
> -  CORE_ADDR pc = regcache_read_pc (regcache);
> +  CORE_ADDR pc = 0;

I don't think we should be picking some arbitrary $pc value (0 in this
case) and just using that as a default.  Instead, I think it would be
better to change the type of pc to gdb::optional<CORE_ADDR>, and then
update the rest of this function to only do the $pc relevant parts if we
have a $pc value.
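
As a rough sketch of the shape I have in mind (this is not GDB code:
std::optional stands in for gdb::optional, and fake_regcache,
fake_read_pc, try_read_pc and describe_resume are names I made up for
illustration), the idea is that "no $pc" becomes a distinct state
rather than address 0:

```cpp
#include <cstdint>
#include <cstdio>
#include <optional>
#include <stdexcept>
#include <string>

/* Stand-ins for GDB types; everything below is hypothetical scaffolding.  */
using CORE_ADDR = std::uint64_t;

struct fake_regcache
{
  bool thread_gone;	/* Pretend the thread has already exited.  */
  CORE_ADDR pc;
};

/* Mimic a PC read that throws when the register is unavailable.  */
CORE_ADDR
fake_read_pc (const fake_regcache &regcache)
{
  if (regcache.thread_gone)
    throw std::runtime_error ("PC register is not available");
  return regcache.pc;
}

/* Return the PC if it can be read, or an empty optional otherwise.  */
std::optional<CORE_ADDR>
try_read_pc (const fake_regcache &regcache)
{
  try
    {
      return fake_read_pc (regcache);
    }
  catch (const std::runtime_error &)
    {
      return std::nullopt;
    }
}

/* Build the debug text, printing the PC only when we actually have one.  */
std::string
describe_resume (const fake_regcache &regcache)
{
  std::optional<CORE_ADDR> pc = try_read_pc (regcache);
  if (!pc.has_value ())
    return "current thread at <unavailable>";

  char buf[32];
  std::snprintf (buf, sizeof (buf), "0x%llx", (unsigned long long) *pc);
  return std::string ("current thread at ") + buf;
}
```

Any later code that needs the $pc would then be guarded by a
has_value () check in the same way as describe_resume.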

> +  try
> +    {
> +      pc = regcache_read_pc (regcache);
> +    }
> +  catch (const gdb_exception_error &err)
> +    {
> +      /* Swallow errors as it may be that the current thread exited
> +	 and we've haven't seen its exit status yet.  Let the
> +	 resumption continue and we'll collect the exit event
> +	 shortly.  */
> +      if (err.error == TARGET_CLOSE_ERROR)
> +	throw;
> +
> +      if (debug_infrun)
> +	{
> +	  string_file buf;
> +	  exception_print (&buf, err);
> +	  infrun_debug_printf ("resume: swallowing error: %s",
> +			       buf.string ().c_str ());
> +	}

I guess this is probably the best we can do without changing the remote
protocol.  My worry would be that there could be other reasons that the
read of $pc fails, which we are now just ignoring.  It looks like you
already ran into one such case with TARGET_CLOSE_ERROR, but maybe
there are others?

It almost feels like the ideal solution would invert the logic, so we
could write:

  catch (const gdb_exception_error &err)
    {
      /* I just invented a new error type here...  */
      if (err.error != INFERIOR_EXITED_ERROR)
        throw;

      /* ... etc ...  */
    }

To use something like this we could have the H packet send back
something other than "OK" when GDB asks to switch to a thread that has
already exited; maybe sending back the stop reply could be made to work?
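
To make the inverted check concrete, here is a self-contained sketch
under loud assumptions: fake_errors, fake_exception_error and
read_pc_swallowing_exit are invented for illustration, and
INFERIOR_EXITED_ERROR does not exist in GDB today.  The point is only
that exactly one known error is swallowed and everything else
propagates:

```cpp
#include <stdexcept>
#include <string>

/* Hypothetical error codes.  INFERIOR_EXITED_ERROR is invented.  */
enum fake_errors
{
  GENERIC_ERROR,
  TARGET_CLOSE_ERROR,
  INFERIOR_EXITED_ERROR
};

/* Stand-in for gdb_exception_error, carrying an error code.  */
struct fake_exception_error : std::runtime_error
{
  fake_exception_error (fake_errors e, const std::string &msg)
    : std::runtime_error (msg), error (e)
  {}

  fake_errors error;
};

/* Call READ_PC.  Return true if it succeeded and false if it failed
   because the inferior thread had exited.  Any other error is
   rethrown to the caller.  */
template<typename Fn>
bool
read_pc_swallowing_exit (Fn read_pc)
{
  try
    {
      read_pc ();
      return true;
    }
  catch (const fake_exception_error &err)
    {
      /* Allow-list: only the "thread exited" error is ignored.  */
      if (err.error != INFERIOR_EXITED_ERROR)
	throw;
      return false;
    }
}
```

With this shape a new failure mode is reported by default, instead of
being silently ignored until someone adds it to a deny-list.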

I say all that really just to check if you agree or not.  For now I'd
be happy to go with what you present here; I think the gains this
series brings to GDB are worth some rough edges that we might want to
address in the future.

Would love to hear your thoughts,

Thanks,
Andrew

> +    }
>  
>    infrun_debug_printf ("step=%d, signal=%s, trap_expected=%d, "
>  		       "current thread [%s] at %s",
> -- 
> 2.36.0

