From: Andrew Burgess <aburgess@redhat.com>
To: Pedro Alves <pedro@palves.net>, gdb-patches@sourceware.org
Subject: Re: [PATCH 25/31] Ignore failure to read PC when resuming
Date: Sat, 10 Jun 2023 11:33:14 +0100
Message-ID: <87ilbvy5cl.fsf@redhat.com>
In-Reply-To: <20221212203101.1034916-26-pedro@palves.net>
Pedro Alves <pedro@palves.net> writes:
> If GDB sets a GDB_THREAD_OPTION_EXIT option on a thread, and the
> thread exits, the server reports the corresponding thread exit event,
> and forgets about the thread, i.e., removes the exited thread from its
> thread list.
>
> On the GDB side, if GDB sets the GDB_THREAD_OPTION_EXIT option on a
> thread, GDB delays deleting the thread from its thread list until it
> sees the corresponding thread exit event, as that event needs special
> handling in infrun.
>
> When a thread disappears from the target, but it still exists on GDB's
> thread list, in all-stop RSP mode, it can happen that GDB ends up
> trying to resume such an already-exited-thread that GDB doesn't yet
> know is gone. When that happens, against GDBserver, typically the
> ongoing execution command fails with this error:

I'm slightly confused here. If GDB doesn't know the thread has exited,
doesn't that mean the server hasn't yet reported the exit, and so should
still be holding onto the thread?

I wanted to investigate this to understand more about what's going on,
but I couldn't find a test that triggers the code added in this patch.
Do you know of a test I can run to see this issue?

>
> ...
> PC register is not available
> (gdb)
>
> At the remote protocol level, we may see e.g., this:
>
> [remote] Packet received: w0;p97479.978d2
> [remote] wait: exit
> [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
> [infrun] print_target_wait_results: 619641.620754.0 [Thread 619641.620754],
> [infrun] print_target_wait_results: status->kind = THREAD_EXITED, exit_status = 0
> [infrun] handle_inferior_event: status->kind = THREAD_EXITED, exit_status = 0
> [infrun] context_switch: Switching context from 0.0.0 to 619641.620754.0
> [infrun] clear_proceed_status_thread: 619641.620754.0
>
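
As an aside for anyone matching the packet log against the infrun log:
the remote protocol writes multiprocess thread-ids as pPID.TID in hex,
while infrun prints them in decimal, so p97479.978d2 and 619641.620754
name the same thread. A minimal sketch of that conversion follows. The
names decoded_thread_id and decode_thread_id are hypothetical, not GDB
functions, and the sketch does not handle special forms such as "-1"
(all threads).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper type, not part of GDB: a decoded "pPID.TID" pair.
struct decoded_thread_id
{
  std::uint64_t pid;
  std::uint64_t tid;
};

// Decode a multiprocess thread-id of the form "pPID.TID", where PID and
// TID are hexadecimal.  Assumption: the input is well formed; special
// forms such as "-1" are not handled in this sketch.
decoded_thread_id
decode_thread_id (const std::string &id)
{
  assert (!id.empty () && id[0] == 'p');
  std::size_t dot = id.find ('.');
  assert (dot != std::string::npos);

  // Skip the leading 'p', then parse each half as base 16.
  std::uint64_t pid = std::stoull (id.substr (1, dot - 1), nullptr, 16);
  std::uint64_t tid = std::stoull (id.substr (dot + 1), nullptr, 16);
  return { pid, tid };
}
```

With this, decode_thread_id ("p97479.978d2") gives pid 619641 and tid
620754, which is the thread shown in the infrun lines above.
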
> GDB saw an exit event for thread 619641.620754. After processing it,
> infrun decides to re-resume the target again. To do that, infrun
> picks some other thread that isn't exited yet from GDB's perspective,
> switches to it, and calls keep_going. Below, infrun happens to pick
> thread p97479.97479, the leader, which also exited, but GDB doesn't
> know yet:
>
> ...
> [remote] Sending packet: $Hgp97479.97479#75
> [remote] Packet received: OK
> [remote] Sending packet: $g#67
> [remote] Packet received: xxxxxxxxxxxxxxxxx (...snip...) [1120 bytes omitted]
> [infrun] reset: reason=handling event
> [infrun] maybe_set_commit_resumed_all_targets: not requesting commit-resumed for target remote, no resumed threads
> [infrun] fetch_inferior_event: exit
> PC register is not available
> (gdb)
>
> The Linux backends, both in GDB and in GDBserver, already silently
> ignore failures to resume, with the understanding that we'll see an
> exit event soon. Core of GDB doesn't do that yet, though.
>
> This patch is a small step in that direction. It swallows the error
> when thrown from within resume_1. There are likely other spots where we
> will need similar treatment, but we can tackle them as we find them.
>
> After this patch, we'll see something like this instead:
>
> [infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [640478.640478.0] at 0x0
> [infrun] do_target_resume: resume_ptid=640478.0.0, step=0, sig=GDB_SIGNAL_0
> [remote] Sending packet: $vCont;c:p9c5de.-1#78

I'm confused by this example. I would have expected it to start off
with the same intro as the one above, that is, send the '$g#67' packet
and get back the xxxx...etc..., but then do things differently.

> [infrun] prepare_to_wait: prepare_to_wait
> [infrun] reset: reason=handling event
> [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target remote
> [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target remote
> [infrun] fetch_inferior_event: exit
> [infrun] fetch_inferior_event: enter
> [infrun] scoped_disable_commit_resumed: reason=handling event
> [infrun] random_pending_event_thread: None found.
> [remote] wait: enter
> [remote] Packet received: W0;process:9c5de
> [remote] wait: exit
> [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
> [infrun] print_target_wait_results: 640478.0.0 [process 640478],
> [infrun] print_target_wait_results: status->kind = EXITED, exit_status = 0
> [infrun] handle_inferior_event: status->kind = EXITED, exit_status = 0
> [Inferior 1 (process 640478) exited normally]
> [infrun] stop_waiting: stop_waiting
> [infrun] reset: reason=handling event
> (gdb) [infrun] fetch_inferior_event: exit
>
> Change-Id: I7f1c7610923435c4e98e70acc5ebe5ebbac581e2
> ---
> gdb/infrun.c | 23 ++++++++++++++++++++++-
> 1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 09391d85256..21e5aa0f50e 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -2595,7 +2595,28 @@ resume_1 (enum gdb_signal sig)
> step = false;
> }
>
> - CORE_ADDR pc = regcache_read_pc (regcache);
> + CORE_ADDR pc = 0;

I don't think we should pick an arbitrary $pc value (0 in this case) and
use it as a default. Instead, I think it would be better to change the
type of pc to gdb::optional<CORE_ADDR>, and then update the rest of this
function to do the $pc-relevant parts only if we have a $pc value.
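
To make that concrete, here is a rough, self-contained sketch of the
shape I have in mind. It is not a patch against infrun.c: it uses
std::optional as a stand-in for gdb::optional, and core_addr,
read_pc_or_throw, try_read_pc and needs_pc_dependent_step are all
hypothetical names invented for the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>

// Stand-in for GDB's CORE_ADDR (assumption for this sketch).
using core_addr = std::uint64_t;

// Hypothetical stand-in for regcache_read_pc: throws when the thread
// has already exited and its registers can no longer be read.
core_addr
read_pc_or_throw (bool thread_alive)
{
  if (!thread_alive)
    throw std::runtime_error ("PC register is not available");
  return 0x401000;
}

// Return the PC if it could be read, or an empty optional otherwise,
// instead of falling back to an arbitrary default such as 0.
std::optional<core_addr>
try_read_pc (bool thread_alive)
{
  try
    {
      return read_pc_or_throw (thread_alive);
    }
  catch (const std::runtime_error &)
    {
      return std::nullopt;
    }
}

// PC-dependent logic runs only when a PC value is actually present.
bool
needs_pc_dependent_step (bool thread_alive, core_addr breakpoint_addr)
{
  std::optional<core_addr> pc = try_read_pc (thread_alive);
  return pc.has_value () && *pc == breakpoint_addr;
}
```

The point of the optional is the last function: with a default of 0, a
failed read would compare equal to an address of 0, whereas an empty
optional can never match any address.
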
> + try
> + {
> + pc = regcache_read_pc (regcache);
> + }
> + catch (const gdb_exception_error &err)
> + {
> + /* Swallow errors as it may be that the current thread exited
> + and we haven't seen its exit status yet. Let the
> + resumption continue and we'll collect the exit event
> + shortly. */
> + if (err.error == TARGET_CLOSE_ERROR)
> + throw;
> +
> + if (debug_infrun)
> + {
> + string_file buf;
> + exception_print (&buf, err);
> + infrun_debug_printf ("resume: swallowing error: %s",
> + buf.string ().c_str ());
> + }

I guess this is probably the best we can do without changing the remote
protocol. My worry is that there could be other reasons why the read of
$pc fails, which we are now just ignoring. It looks like you already
ran into one such case with TARGET_CLOSE_ERROR, but maybe there are
others?

It almost feels like the ideal solution would invert the logic, so we
could write:

  catch (const gdb_exception_error &err)
    {
      /* I just invent a new error type here... */
      if (err.error != INFERIOR_EXITED_ERROR)
        throw;

      // ... etc ...
    }

To use something like this we could have the H packet send back
something other than "OK" when GDB asks to switch to a thread that has
already exited; maybe sending back the stop reply could be made to work?

I say all that really just to check whether you agree. For now I'd be
happy to go with what you present here; I think the gains
this series brings to GDB are worth some rough edges that we might want
to address in the future.
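
For what it's worth, the inverted check suggested above can be tried out
in isolation. The sketch below is only an illustration under made-up
names: error_code, target_error, read_pc and read_pc_for_resume are
hypothetical, and inferior_exited_error mirrors the invented
INFERIOR_EXITED_ERROR, which does not exist in GDB today.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical error codes for this sketch only.
enum class error_code
{
  generic_error,
  target_close_error,
  inferior_exited_error   // the invented "thread already exited" error
};

// Hypothetical exception type carrying one of the codes above.
struct target_error
{
  error_code code;
};

using core_addr = std::uint64_t;   // stand-in for CORE_ADDR

// Hypothetical PC read: fails with the given error, if one is supplied.
core_addr
read_pc (std::optional<error_code> failure)
{
  if (failure.has_value ())
    throw target_error { *failure };
  return 0x401000;
}

// Allow-list: swallow only the "thread already exited" error and
// rethrow everything else, rather than swallowing everything except
// one known-bad error.
std::optional<core_addr>
read_pc_for_resume (std::optional<error_code> failure)
{
  try
    {
      return read_pc (failure);
    }
  catch (const target_error &err)
    {
      if (err.code != error_code::inferior_exited_error)
        throw;
      return std::nullopt;
    }
}
```

With this shape, any unexpected failure still propagates to the caller,
and only the one error we know how to handle is ignored.
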
Would love to hear your thoughts,
Thanks,
Andrew
> + }
>
> infrun_debug_printf ("step=%d, signal=%s, trap_expected=%d, "
> "current thread [%s] at %s",
> --
> 2.36.0