From: Andrew Burgess <andrew.burgess@embecosm.com>
To: Pedro Alves <pedro@palves.net>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH 1/7] Fix spurious unhandled remote %Stop notifications
Date: Sat, 12 Dec 2020 22:13:50 +0000
Message-ID: <20201212221350.GE2945@embecosm.com>
In-Reply-To: <20200706190252.22552-2-pedro@palves.net>
* Pedro Alves <pedro@palves.net> [2020-07-06 20:02:46 +0100]:
> In non-stop mode, remote targets mark an async event source whose
> callback is supposed to result in calling remote_target::wait_ns to
> either process the event queue, or acknowledge an incoming %Stop
> notification.
>
> The callback in question is remote_async_inferior_event_handler, where
> we call inferior_event_handler, to end up in fetch_inferior_event ->
> target_wait -> remote_target::wait -> remote_target::wait_ns.
>
> A problem here however is that when debugging multiple targets,
> fetch_inferior_event can pull events out of any target picked at
> random, for event fairness. This means that when
> remote_async_inferior_event_handler returns, remote_target::wait may
> have not been called at all, and thus pending notifications may have
> not been acked. Because async event sources auto-clear, when
> remote_async_inferior_event_handler returns the async event handler is
> no longer marked, so the event loop won't automatically call
> remote_async_inferior_event_handler again to try to process the
> pending remote notifications/queue. The result is that stop events
> may end up not processed, e.g., "interrupt -a" seemingly not managing
> to stop all threads.
>
> Fix this by making remote_async_inferior_event_handler mark the event
> handler again before returning, if necessary.
>
> Maybe a better fix would be to make async event handlers not
> auto-clear themselves, make that the responsibility of the callback,
> so that the event loop would keep calling the callback automatically.
> Or, we could try making so that fetch_inferior_event would optionally
> handle events only for the target that it got passed down via
> parameter. However, I don't think now just before branching is the
> time to try to do any such change.
>
> gdb/ChangeLog:
>
> PR gdb/26199
> * remote.c (remote_target::open_1): Pass remote target pointer as
> data to create_async_event_handler.
> (remote_async_inferior_event_handler): Mark async event handler
> before returning if the remote target still has either pending
> events or unacknowledged notifications.
> ---
> gdb/remote.c | 15 ++++++++++++++-
> 1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/gdb/remote.c b/gdb/remote.c
> index f7f99dc24f..59075cb09f 100644
> --- a/gdb/remote.c
> +++ b/gdb/remote.c
> @@ -5605,7 +5605,7 @@ remote_target::open_1 (const char *name, int from_tty, int extended_p)
>
> /* Register extra event sources in the event loop. */
> rs->remote_async_inferior_event_token
> - = create_async_event_handler (remote_async_inferior_event_handler, NULL);
> + = create_async_event_handler (remote_async_inferior_event_handler, remote);
> rs->notif_state = remote_notif_state_allocate (remote);
>
> /* Reset the target state; these things will be queried either by
> @@ -14164,6 +14164,19 @@ static void
> remote_async_inferior_event_handler (gdb_client_data data)
> {
> inferior_event_handler (INF_REG_EVENT);
> +
> + remote_target *remote = (remote_target *) data;
> + remote_state *rs = remote->get_remote_state ();
> +
> + /* inferior_event_handler may have consumed an event pending on the
> + infrun side without calling target_wait on the REMOTE target, or
> + may have pulled an event out of a different target. Keep trying
> + for this remote target as long it still has either pending events
> + or unacknowledged notifications. */
> +
> + if (rs->notif_state->pending_event[notif_client_stop.id] != NULL
> + || !rs->stop_reply_queue.empty ())
> + mark_async_event_handler (rs->remote_async_inferior_event_token);
> }

Pedro,

This patch introduced a use-after-free issue here. It can be seen
by running the test:

  make check-gdb RUNTESTFLAGS="--target_board=native-gdbserver gdb.base/inferior-died.exp"

For me this fails maybe 1 in 5 times. I've done some initial
investigation, and the problem is obvious once you see the following
stack trace:

#0 remote_state::~remote_state (this=0x338d548, __in_chrg=<optimized out>) at ../../src.dev-3/gdb/remote.c:1097
#1 0x0000000000acf3b3 in remote_target::~remote_target (this=0x338d530, __in_chrg=<optimized out>) at ../../src.dev-3/gdb/remote.c:4078
#2 0x0000000000acf3f6 in remote_target::~remote_target (this=0x338d530, __in_chrg=<optimized out>) at ../../src.dev-3/gdb/remote.c:4097
#3 0x0000000000acf2fa in remote_target::close (this=0x338d530) at ../../src.dev-3/gdb/remote.c:4075
#4 0x0000000000c75bfd in target_close (targ=0x338d530) at ../../src.dev-3/gdb/target.c:3126
#5 0x0000000000c62ca4 in decref_target (t=0x338d530) at ../../src.dev-3/gdb/target.c:545
#6 0x0000000000c62ec7 in target_stack::unpush (this=0x3666d50, t=0x338d530) at ../../src.dev-3/gdb/target.c:633
#7 0x0000000000c7796c in inferior::unpush_target (this=0x3666ba0, t=0x338d530) at ../../src.dev-3/gdb/inferior.h:357
#8 0x0000000000c62de1 in unpush_target (t=0x338d530) at ../../src.dev-3/gdb/target.c:595
#9 0x0000000000c62ee7 in unpush_target_and_assert (target=0x338d530) at ../../src.dev-3/gdb/target.c:643
#10 0x0000000000c62fb6 in pop_all_targets_at_and_above (stratum=process_stratum) at ../../src.dev-3/gdb/target.c:666
#11 0x0000000000ad21a4 in remote_unpush_target (target=0x338d530) at ../../src.dev-3/gdb/remote.c:5524
#12 0x0000000000adc619 in remote_target::mourn_inferior (this=0x338d530) at ../../src.dev-3/gdb/remote.c:9962
#13 0x0000000000c65e79 in target_mourn_inferior (ptid=...) at ../../src.dev-3/gdb/target.c:2136
#14 0x000000000086d0a5 in handle_inferior_event (ecs=0x7fffffffb3b0) at ../../src.dev-3/gdb/infrun.c:5234
#15 0x0000000000869beb in fetch_inferior_event () at ../../src.dev-3/gdb/infrun.c:3863
#16 0x000000000084a922 in inferior_event_handler (event_type=INF_REG_EVENT) at ../../src.dev-3/gdb/inf-loop.c:42
#17 0x0000000000ae73a9 in remote_async_inferior_event_handler (data=0x338d530) at ../../src.dev-3/gdb/remote.c:14177
#18 0x00000000004ea759 in check_async_event_handlers () at ../../src.dev-3/gdb/async-event.c:328
#19 0x0000000001449e7a in gdb_do_one_event () at ../../src.dev-3/gdbsupport/event-loop.cc:216
#20 0x00000000009102d0 in start_event_loop () at ../../src.dev-3/gdb/main.c:347
#21 0x00000000009103f0 in captured_command_loop () at ../../src.dev-3/gdb/main.c:407
#22 0x0000000000911be6 in captured_main (data=0x7fffffffb640) at ../../src.dev-3/gdb/main.c:1239
#23 0x0000000000911c4c in gdb_main (args=0x7fffffffb640) at ../../src.dev-3/gdb/main.c:1254
#24 0x000000000041755d in main (argc=5, argv=0x7fffffffb748) at ../../src.dev-3/gdb/gdb.c:32

The inferior event being processed is the inferior-exited event. This
is the last remote inferior, so the remote target is unpushed and, as
frames #0 to #3 show, destroyed. GDB then returns to
remote_async_inferior_event_handler, where we hit the code you added
above, which proceeds to make use of the now-freed remote target :-/
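
To make the sequence concrete, here is a minimal standalone sketch of
the lifetime problem. It is not GDB code: fake_remote,
fake_target_stack and target_died_during_event are hypothetical names,
and std::shared_ptr/std::weak_ptr stand in for GDB's own target
reference counting purely so that the sketch can detect the dead
target without dereferencing it.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for remote_target; not GDB's real type.
struct fake_remote
{
  bool has_pending_events = false;
};

// Hypothetical owner, standing in for the reference held by the
// target stack.
struct fake_target_stack
{
  std::shared_ptr<fake_remote> remote = std::make_shared<fake_remote> ();

  // Models handling the "inferior exited" event for the last remote
  // inferior: the target is unpushed and destroyed.
  void handle_exit_event () { remote.reset (); }
};

// Models one run of the async handler.  Returns true if the handler's
// target was destroyed while the event was being processed, which is
// the point where the patched handler would go on to touch freed
// memory.
bool
target_died_during_event (fake_target_stack &stack, bool event_is_exit)
{
  assert (stack.remote != nullptr);

  // Captured before the event is processed, like the DATA pointer.
  std::weak_ptr<fake_remote> data = stack.remote;

  if (event_is_exit)
    stack.handle_exit_event ();

  // The real handler dereferences DATA here; the sketch only asks
  // whether the object still exists.
  return data.expired ();
}
```

Calling target_died_during_event with event_is_exit set to false leaves
the target alive, while passing true reports that the target is gone by
the time the handler resumes.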

Like I say, the problem is now obvious, but the solution less so!
Reading what you originally wrote in the patch, I wondered about the
idea of making the callback responsible for clearing the async event
handler.
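
As a rough illustration of that idea only, the sketch below shows an
event loop that does not auto-clear a marked handler, leaving the
callback to clear it once no work remains. All names
(fake_async_handler, fake_source, mark, clear, run_once,
make_callback) are hypothetical and are not GDB's async-event API.
The sketch also does not model target teardown; it assumes that
destroying a target would unregister its handler.

```cpp
#include <cassert>
#include <functional>

// Hypothetical miniature of an async event handler; not GDB's API.
struct fake_async_handler
{
  bool ready = false;
  std::function<void ()> proc;
};

// Hypothetical event source with a count of unprocessed events.
struct fake_source
{
  int pending = 0;
};

void mark (fake_async_handler &h) { h.ready = true; }
void clear (fake_async_handler &h) { h.ready = false; }

// One event-loop iteration.  Unlike an auto-clearing loop, this leaves
// READY untouched; the callback decides when to clear it.  Returns
// true if the callback was invoked.
bool
run_once (fake_async_handler &h)
{
  if (!h.ready)
    return false;
  assert (h.proc);
  h.proc ();
  return true;
}

// Builds a callback that consumes at most one event per call and
// clears the handler only when nothing is left.  The check happens
// before any event is processed, so nothing is inspected after
// processing returns.
std::function<void ()>
make_callback (fake_async_handler &h, fake_source &src)
{
  return [&h, &src] ()
    {
      if (src.pending == 0)
        {
          clear (h);
          return;
        }
      --src.pending;  // Stands in for processing one event.
    };
}
```

With two pending events and the handler marked once, run_once keeps
invoking the callback on each iteration until the callback itself finds
the queue empty and clears the mark, so no re-marking after event
processing is needed.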
I haven't tried to fix this yet, but thought I'd share my findings so
far with you.
Thanks,
Andrew
Thread overview: 15+ messages
2020-07-06 19:02 [PATCH 0/7] GDB busy loop when interrupting non-stop program (PR 26199) Pedro Alves
2020-07-06 19:02 ` [PATCH 1/7] Fix spurious unhandled remote %Stop notifications Pedro Alves
2020-12-12 22:13 ` Andrew Burgess [this message]
2020-12-13 0:46 ` Simon Marchi
2020-07-06 19:02 ` [PATCH 2/7] Fix latent bug in target_pass_ctrlc Pedro Alves
2020-07-06 19:02 ` [PATCH 3/7] Avoid constant stream of TARGET_WAITKIND_NO_RESUMED Pedro Alves
2020-07-06 19:02 ` [PATCH 4/7] Fix handle_no_resumed w/ multiple targets Pedro Alves
2020-07-06 19:02 ` [PATCH 5/7] Make handle_no_resumed transfer terminal Pedro Alves
2020-07-06 19:02 ` [PATCH 6/7] Testcase for previous handle_no_resumed fixes Pedro Alves
2020-07-06 19:02 ` [PATCH 7/7] Fix GDB busy loop when interrupting non-stop program (PR 26199) Pedro Alves
2020-07-06 21:28 ` [PATCH 0/7] " Simon Marchi
2020-07-07 0:25 ` Pedro Alves
2020-07-07 1:27 ` Pedro Alves
2020-07-07 1:29 ` Pedro Alves
2020-07-10 23:02 ` Pedro Alves