public inbox for gdb-patches@sourceware.org
From: Andrew Burgess <andrew.burgess@embecosm.com>
To: Pedro Alves <pedro@palves.net>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH 1/7] Fix spurious unhandled remote %Stop notifications
Date: Sat, 12 Dec 2020 22:13:50 +0000
Message-ID: <20201212221350.GE2945@embecosm.com>
In-Reply-To: <20200706190252.22552-2-pedro@palves.net>

* Pedro Alves <pedro@palves.net> [2020-07-06 20:02:46 +0100]:

> In non-stop mode, remote targets mark an async event source whose
> callback is supposed to result in calling remote_target::wait_ns to
> either process the event queue, or acknowledge an incoming %Stop
> notification.
> 
> The callback in question is remote_async_inferior_event_handler, where
> we call inferior_event_handler, to end up in fetch_inferior_event ->
> target_wait -> remote_target::wait -> remote_target::wait_ns.
> 
> A problem here however is that when debugging multiple targets,
> fetch_inferior_event can pull events out of any target picked at
> random, for event fairness.  This means that when
> remote_async_inferior_event_handler returns, remote_target::wait may
> have not been called at all, and thus pending notifications may have
> not been acked.  Because async event sources auto-clear, when
> remote_async_inferior_event_handler returns the async event handler is
> no longer marked, so the event loop won't automatically call
> remote_async_inferior_event_handler again to try to process the
> pending remote notifications/queue.  The result is that stop events
> may end up not processed, e.g., "interrupt -a" seemingly not managing
> to stop all threads.
> 
> Fix this by making remote_async_inferior_event_handler mark the event
> handler again before returning, if necessary.
> 
> Maybe a better fix would be to make async event handlers not
> auto-clear themselves, make that the responsibility of the callback,
> so that the event loop would keep calling the callback automatically.
> Or, we could try making so that fetch_inferior_event would optionally
> handle events only for the target that it got passed down via
> parameter.  However, I don't think now just before branching is the
> time to try to do any such change.
> 
> gdb/ChangeLog:
> 
> 	PR gdb/26199
> 	* remote.c (remote_target::open_1): Pass remote target pointer as
> 	data to create_async_event_handler.
> 	(remote_async_inferior_event_handler): Mark async event handler
> 	before returning if the remote target still has either pending
> 	events or unacknowledged notifications.
> ---
>  gdb/remote.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/remote.c b/gdb/remote.c
> index f7f99dc24f..59075cb09f 100644
> --- a/gdb/remote.c
> +++ b/gdb/remote.c
> @@ -5605,7 +5605,7 @@ remote_target::open_1 (const char *name, int from_tty, int extended_p)
>  
>    /* Register extra event sources in the event loop.  */
>    rs->remote_async_inferior_event_token
> -    = create_async_event_handler (remote_async_inferior_event_handler, NULL);
> +    = create_async_event_handler (remote_async_inferior_event_handler, remote);
>    rs->notif_state = remote_notif_state_allocate (remote);
>  
>    /* Reset the target state; these things will be queried either by
> @@ -14164,6 +14164,19 @@ static void
>  remote_async_inferior_event_handler (gdb_client_data data)
>  {
>    inferior_event_handler (INF_REG_EVENT);
> +
> +  remote_target *remote = (remote_target *) data;
> +  remote_state *rs = remote->get_remote_state ();
> +
> +  /* inferior_event_handler may have consumed an event pending on the
> +     infrun side without calling target_wait on the REMOTE target, or
> +     may have pulled an event out of a different target.  Keep trying
> +     for this remote target as long it still has either pending events
> +     or unacknowledged notifications.  */
> +
> +  if (rs->notif_state->pending_event[notif_client_stop.id] != NULL
> +      || !rs->stop_reply_queue.empty ())
> +    mark_async_event_handler (rs->remote_async_inferior_event_token);
>  }

Pedro,

This patch introduced a use-after-free here.  It can be seen by
running the test:

  make check-gdb RUNTESTFLAGS="--target_board=native-gdbserver gdb.base/inferior-died.exp"

For me this fails maybe 1 in 5 times.  I've done some initial
investigation, and the problem is obvious once you see the following
stack trace:

  #0  remote_state::~remote_state (this=0x338d548, __in_chrg=<optimized out>) at ../../src.dev-3/gdb/remote.c:1097
  #1  0x0000000000acf3b3 in remote_target::~remote_target (this=0x338d530, __in_chrg=<optimized out>) at ../../src.dev-3/gdb/remote.c:4078
  #2  0x0000000000acf3f6 in remote_target::~remote_target (this=0x338d530, __in_chrg=<optimized out>) at ../../src.dev-3/gdb/remote.c:4097
  #3  0x0000000000acf2fa in remote_target::close (this=0x338d530) at ../../src.dev-3/gdb/remote.c:4075
  #4  0x0000000000c75bfd in target_close (targ=0x338d530) at ../../src.dev-3/gdb/target.c:3126
  #5  0x0000000000c62ca4 in decref_target (t=0x338d530) at ../../src.dev-3/gdb/target.c:545
  #6  0x0000000000c62ec7 in target_stack::unpush (this=0x3666d50, t=0x338d530) at ../../src.dev-3/gdb/target.c:633
  #7  0x0000000000c7796c in inferior::unpush_target (this=0x3666ba0, t=0x338d530) at ../../src.dev-3/gdb/inferior.h:357
  #8  0x0000000000c62de1 in unpush_target (t=0x338d530) at ../../src.dev-3/gdb/target.c:595
  #9  0x0000000000c62ee7 in unpush_target_and_assert (target=0x338d530) at ../../src.dev-3/gdb/target.c:643
  #10 0x0000000000c62fb6 in pop_all_targets_at_and_above (stratum=process_stratum) at ../../src.dev-3/gdb/target.c:666
  #11 0x0000000000ad21a4 in remote_unpush_target (target=0x338d530) at ../../src.dev-3/gdb/remote.c:5524
  #12 0x0000000000adc619 in remote_target::mourn_inferior (this=0x338d530) at ../../src.dev-3/gdb/remote.c:9962
  #13 0x0000000000c65e79 in target_mourn_inferior (ptid=...) at ../../src.dev-3/gdb/target.c:2136
  #14 0x000000000086d0a5 in handle_inferior_event (ecs=0x7fffffffb3b0) at ../../src.dev-3/gdb/infrun.c:5234
  #15 0x0000000000869beb in fetch_inferior_event () at ../../src.dev-3/gdb/infrun.c:3863
  #16 0x000000000084a922 in inferior_event_handler (event_type=INF_REG_EVENT) at ../../src.dev-3/gdb/inf-loop.c:42
  #17 0x0000000000ae73a9 in remote_async_inferior_event_handler (data=0x338d530) at ../../src.dev-3/gdb/remote.c:14177
  #18 0x00000000004ea759 in check_async_event_handlers () at ../../src.dev-3/gdb/async-event.c:328
  #19 0x0000000001449e7a in gdb_do_one_event () at ../../src.dev-3/gdbsupport/event-loop.cc:216
  #20 0x00000000009102d0 in start_event_loop () at ../../src.dev-3/gdb/main.c:347
  #21 0x00000000009103f0 in captured_command_loop () at ../../src.dev-3/gdb/main.c:407
  #22 0x0000000000911be6 in captured_main (data=0x7fffffffb640) at ../../src.dev-3/gdb/main.c:1239
  #23 0x0000000000911c4c in gdb_main (args=0x7fffffffb640) at ../../src.dev-3/gdb/main.c:1254
  #24 0x000000000041755d in main (argc=5, argv=0x7fffffffb748) at ../../src.dev-3/gdb/gdb.c:32

The inferior event being processed is the inferior-exited event.  This
is the last remote inferior, so the remote target is unpushed and, as
frames #0 to #5 show, closed and destroyed.  GDB then returns to
remote_async_inferior_event_handler, where we hit the code you added
above, which proceeds to use the now-destroyed remote target :-/
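
To make the sequence concrete, here is a toy model of it.  This is
not GDB code: toy_target, toy_stack and toy_event_handler are made-up
names, and the "stack" simply owns the only target.  The sketch shows
a callback that looks the target up again after the event has been
processed, instead of trusting a pointer obtained beforehand.  It is
only an illustration of the hazard under those assumptions, not a
proposed fix for remote.c.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for a target with queued work.
struct toy_target
{
  bool has_pending_events = false;
  bool marked = false;
};

// Hypothetical stand-in for a target stack holding the last reference.
struct toy_stack
{
  std::unique_ptr<toy_target> target = std::make_unique<toy_target> ();

  // Dropping the last reference destroys the target.
  void unpush () { target.reset (); }
};

// Process one event, then re-mark the target only if it still exists
// and still has pending work.  Returns true if the target was re-marked.
bool
toy_event_handler (toy_stack &stack,
                   const std::function<void (toy_stack &)> &process_event)
{
  // Processing the event may unpush, and so destroy, the target.
  process_event (stack);

  // Look the target up again; a pointer saved before process_event
  // could be dangling by now.
  toy_target *t = stack.target.get ();
  if (t == nullptr)
    return false;

  if (t->has_pending_events)
    {
      t->marked = true;
      return true;
    }
  return false;
}
```

Calling toy_event_handler with a process_event that calls
stack.unpush () models the inferior-exited case: the handler returns
false without touching the destroyed target.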

Like I say, the problem is now obvious, but the solution less so!

Reading what you originally wrote in the patch, I wondered about the
idea of making the callback responsible for clearing the async event
handler's mark.
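
As a rough sketch of what "the callback clears the mark" could look
like, here is a self-contained toy event loop.  Again, none of this is
GDB's API: toy_async_handler, toy_mark, toy_clear, toy_source and the
other names are hypothetical.  The loop never auto-clears the handler,
so it keeps invoking the callback until the callback itself decides
there is nothing left to do.

```cpp
#include <cassert>
#include <deque>
#include <functional>

// Hypothetical handler: stays ready until someone clears it.
struct toy_async_handler
{
  bool ready = false;
  std::function<void (toy_async_handler &)> callback;
};

void toy_mark (toy_async_handler &h) { h.ready = true; }
void toy_clear (toy_async_handler &h) { h.ready = false; }

// Hypothetical event source with a queue of pending events.
struct toy_source
{
  std::deque<int> pending;
  int handled = 0;
};

// Build a handler whose callback consumes one event per call and
// clears the mark only once the queue is empty.
toy_async_handler
make_toy_handler (toy_source &src)
{
  toy_async_handler h;
  h.callback = [&src] (toy_async_handler &self)
    {
      if (!src.pending.empty ())
        {
          src.pending.pop_front ();
          ++src.handled;
        }
      if (src.pending.empty ())
        toy_clear (self);
    };
  return h;
}

// Invoke the callback while the handler is ready, without ever
// clearing the mark on its behalf.  Returns the number of calls made;
// max_calls guards against a callback that never clears.
int
toy_run_until_idle (toy_async_handler &h, int max_calls)
{
  int calls = 0;
  while (h.ready && calls < max_calls)
    {
      h.callback (h);
      ++calls;
    }
  return calls;
}
```

With three queued events and the handler marked once, the loop calls
the callback three times and then finds the handler cleared; nothing
has to re-mark it after each event.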

I haven't tried to fix this yet, but thought I'd share my findings so
far with you.

Thanks,
Andrew
