public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v4 0/2] Reduce back and forth with target when there are pending statuses
@ 2021-03-21  0:02 Pedro Alves
  2021-03-21  0:02 ` [PATCH v4 1/2] gdb: generalize commit_resume, avoid commit-resuming when threads have " Pedro Alves
  2021-03-21  0:02 ` [PATCH v4 2/2] gdb: defer commit resume until all available events are consumed Pedro Alves
  0 siblings, 2 replies; 16+ messages in thread
From: Pedro Alves @ 2021-03-21  0:02 UTC (permalink / raw)
  To: gdb-patches

I've picked up Simon's commit-resume patches (with his permission),
and ran with them.  Here's a new iteration.  This version tests
regression free against native GNU/Linux, gdbserver and also against
with "maint set target-non-stop on".  "maint set target-non-stop on"
is required to enable the commit_resume paths against gdbserver.
Testing in that mode exposed a number of problems, all addressed in
this revision.

Simon's original cover letter said (and it's still applicable):

 ~~~
 The goal of this series is to make it able for infrun to tell the
 target when it needs it to commit resumed threads to the execution
 device, and when it's ok to defer doing that.  When infrun knows it
 will do more operations (for example, it is resuming/stopping many
 threads, or it knows it has pending events to process), it uses that
 to let the target defer moving resumed threads to the execution device
 until all these operations are done.  This is explained in details in
 the patches.
 ~~~

Given the further reduction of vCont packets compared to current
pristine master, I'm also seeing this as another step toward making
the non-stop variant of the remote protocol being the default.  We
should still probably try to coalesce stops, though...

For review convenience, I've noted my changes compared to Simon's in
each of the patches.  (I'll strip that info before pushing.)

Simon Marchi (2):
  gdb: generalize commit_resume, avoid commit-resuming when threads have
    pending statuses
  gdb: defer commit resume until all available events are consumed

 gdb/async-event.c            |   9 ++
 gdb/async-event.h            |   3 +
 gdb/infcmd.c                 |  16 +++
 gdb/infrun.c                 | 236 ++++++++++++++++++++++++++++++++---
 gdb/infrun.h                 |  98 +++++++++++++++
 gdb/linux-nat.c              |  30 +++++
 gdb/linux-nat.h              |   1 +
 gdb/mi/mi-main.c             |   4 +
 gdb/process-stratum-target.h |  32 +++++
 gdb/record-full.c            |  10 +-
 gdb/remote.c                 | 170 ++++++++++++++++++++-----
 gdb/target-delegates.c       |  45 +++++--
 gdb/target.c                 |  30 ++---
 gdb/target.h                 |  47 ++++---
 gdb/top.c                    |   7 ++
 15 files changed, 637 insertions(+), 101 deletions(-)


base-commit: 219f56b4842f4e72540082af96e47efa0f0f0a82
-- 
2.26.2


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v4 1/2] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-03-21  0:02 [PATCH v4 0/2] Reduce back and forth with target when there are pending statuses Pedro Alves
@ 2021-03-21  0:02 ` Pedro Alves
  2021-03-25 15:16   ` Simon Marchi
  2021-07-11 19:42   ` Jonah Graham
  2021-03-21  0:02 ` [PATCH v4 2/2] gdb: defer commit resume until all available events are consumed Pedro Alves
  1 sibling, 2 replies; 16+ messages in thread
From: Pedro Alves @ 2021-03-21  0:02 UTC (permalink / raw)
  To: gdb-patches

From: Simon Marchi <simon.marchi@efficios.com>

New in v4 (Pedro):

 - Fixed "phony stop" generation -- if a thread had been resumed with
   a signal, we would lose the signal.  This was caught by the
   testsuite against gdbserver + "maint set target-non-stop on".

 - Add reset_and_commit() method to scoped_disable_commit_resumed,
   which gets rid of the need for creating a bunch of scopes and
   intending a bunch of code.  I originally had this idea while
   looking at the optional<scoped_disable_commit_resumed> in infrun.c
   in the previous versions of this patch, and ended up generalizing
   it and using it throughout.

 - Add scoped_enable_commit_resumed, the counterpart of
   scoped_disable_commit_resumed: re-enables commit-resumed, and
   commit-resumes.  Called from wait_sync_command_done, which is a
   "sync" point where we need forward progress.  This fixes
   regressions introduced in previous iterations, with "maint set
   target-non-stop on".

 - Renamed and inverted logic of the defer_enable_commit_resumed
   global.  It's now called 'enable_commit_resumed'.  The negated
   logic got really confusing, particularly with
   scoped_enable_commit_resumed, which led to double negation.

 - No need for
   scoped_disable_commit_resumed::maybe_set_commit_resumed_all_process_targets
   to be a class method.  Now a static free function.

 - replace assert in maybe_set_commit_resumed_all_process_targets with
   continue.  Testing with "maint set target-non-stop on" ran against
   this assertion (e.g., fork tests, which end up with multiple
   inferiors sharing the same process_stratum target).

 - add and use target_commit_resumed

 - all_process_targets removed, replaced with iterating over inferiors
   in the callers.  all_process_targets was iterating over inferiors
   itself anyway, and it was building a std::set, which was iterated
   on by the callers, and then immediately discarded.  Because the
   std::set was never reused, this ends up being a pessimization.
   Also, given the potential for targets on the target stack higher
   than process_stratum stratum wanting to override commit_resumed(),
   it's just better to iterate over inferiors, or IOW, iterate over
   target stacks.

 - Also considering the potential for targets higher in the target
   stack wanting to override target_commit_resumed, this version does
   not depend on the patch to move the target_ops::commit_resume
   method to process_stratum_target anymore.  Note all targets on a
   stack can use the same process_stratum_target::commit_resumed_state
   flag, just like they share the
   process_stratum_target::threads_executing flag.

 - Misc cleanup, renaming, refactoring, etc. throughout.

The rationale for this patch comes from the ROCm port [1], the goal
being to reduce the number of back and forths between GDB and the
target when doing successive operations.  I'll start with explaining
the rationale and then go over the implementation.  In the ROCm / GPU
world, the term "wave" is somewhat equivalent to a "thread" in GDB.
So if you read if from a GPU stand point, just s/thread/wave/.

ROCdbgapi, the library used by GDB [2] to communicate with the GPU
target, gives the illusion that it's possible for the debugger to
control (start and stop) individual threads.  But in reality, this is
not how it works.  Under the hood, all threads of a queue are
controlled as a group.  To stop one thread in a group of running ones,
the state of all threads is retrieved from the GPU, all threads are
destroyed, and all threads but the one we want to stop are re-created
from the saved state.  The net result, from the point of view of GDB,
is that the library stopped one thread.  The same thing goes if we
want to resume one thread while others are running: the state of all
running threads is retrieved from the GPU, they are all destroyed, and
they are all re-created, including the thread we want to resume.

This leads to some inefficiencies when combined with how GDB works,
here are two examples:

 - Stopping all threads: because the target operates in non-stop mode,
   when the user interface mode is all-stop, GDB must stop all threads
   individually when presenting a stop.  Let's suppose we have 1000
   threads and the user does ^C.  GDB asks the target to stop one
   thread.  Behind the scenes, the library retrieves 1000 thread
   states and restores the 999 others still running ones.  GDB asks
   the target to stop another one.  The target retrieves 999 thread
   states and restores the 998 remaining ones.  That means that to
   stop 1000 threads, we did 1000 back and forths with the GPU.  It
   would have been much better to just retrieve the states once and
   stop there.

 - Resuming with pending events: suppose the 1000 threads hit a
   breakpoint at the same time.  The breakpoint is conditional and
   evaluates to true for the first thread, to false for all others.
   GDB pulls one event (for the first thread) from the target, decides
   that it should present a stop, so stops all threads using
   stop_all_threads.  All these other threads have a breakpoint event
   to report, which is saved in `thread_info::suspend::waitstatus` for
   later.  When the user does "continue", GDB resumes that one thread
   that did hit the breakpoint.  It then processes the pending events
   one by one as if they just arrived.  It picks one, evaluates the
   condition to false, and resumes the thread.  It picks another one,
   evaluates the condition to false, and resumes the thread.  And so
   on.  In between each resumption, there is a full state retrieval
   and re-creation.  It would be much nicer if we could wait a little
   bit before sending those threads on the GPU, until it processed all
   those pending events.

To address this kind of performance issue, ROCdbgapi has a concept
called "forward progress required", which is a boolean state that
allows its user (i.e. GDB) to say "I'm doing a bunch of operations,
you can hold off putting the threads on the GPU until I'm done" (the
"forward progress not required" state).  Turning forward progress back
on indicates to the library that all threads that are supposed to be
running should now be really running on the GPU.

It turns out that GDB has a similar concept, though not as general,
commit_resume.  One difference is that commit_resume is not stateful:
the target can't look up "does the core need me to schedule resumed
threads for execution right now".  It is also specifically linked to
the resume method, it is not used in other contexts.  The target
accumulates resumption requests through target_ops::resume calls, and
then commits those resumptions when target_ops::commit_resume is
called.  The target has no way to check if it's ok to leave resumed
threads stopped in other target methods.

To bridge the gap, this patch generalizes the commit_resume concept in
GDB to match the forward progress concept of ROCdbgapi.  The current
name (commit_resume) can be interpreted as "commit the previous resume
calls".  I renamed the concept to "commit_resumed", as in "commit the
threads that are resumed".

In the new version, we have two things in process_stratum_target:

 - the commit_resumed_state field: indicates whether GDB requires this
   target to have resumed threads committed to the execution
   target/device.  If false, the target is allowed to leave resumed
   threads un-committed at the end of whatever method it is executing.

 - the commit_resumed method: called when commit_resumed_state
   transitions from false to true.  While commit_resumed_state was
   false, the target may have left some resumed threads un-committed.
   This method being called tells it that it should commit them back
   to the execution device.

Let's take the "Stopping all threads" scenario from above and see how
it would work with the ROCm target with this change.  Before stopping
all threads, GDB would set the target's commit_resumed_state field to
false.  It would then ask the target to stop the first thread.  The
target would retrieve all threads' state from the GPU and mark that
one as stopped.  Since commit_resumed_state is false, it leaves all
the other threads (still resumed) stopped.  GDB would then proceed to
call target_stop for all the other threads.  Since resumed threads are
not committed, this doesn't do any back and forth with the GPU.

To simplify the implementation of targets, this patch makes it so that
when calling certain target methods, the contract between the core and
the targets guarantees that commit_resumed_state is false.  This way,
the target doesn't need two paths, one commit_resumed_state == true
and one for commit_resumed_state == false.  It can just assert that
commit_resumed_state is false and work with that assumption.  This
also helps catch places where we forgot to disable
commit_resumed_state before calling the method, which represents a
probable optimization opportunity.  I added assertions in the target
method wrappers (target_resume and friends) to have some confidence
that this contract between the core and the targets is respected.

The scoped_disable_commit_resumed type is used to disable the commit
resumed state of all process targets on construction, and selectively
re-enable it on destruction (see below for criteria).  Note that it
only sets the process_stratum_target::commit_resumed_state flag.  A
subsequent call to maybe_call_commit_resumed_all_process_targets is
necessary to call the process_stratum_target::commit_resumed method on
all process targets that got their commit_resumed_state flag turned
back on.  This separation is because we don't want to call the
commit_resumed methods in scoped_disable_commit_resumed's destructor,
as they can throw.

On destruction, commit-resumed is not re-enabled for a given target
if:

 1. this target has no threads resumed, or

 2. this target has at least one resumed thread with a pending status
    known to the core (saved in thread_info::suspend::waitstatus).

The first point is not technically necessary, because a proper
commit_resumed implementation would be a no-op if the target has no
resumed threads.  But since we have a flag do to a quick check, I
think it doesn't hurt.

The second point is more important: together with the
scoped_disable_commit_resumed instance added in fetch_inferior_event,
it makes it so the "Resuming with pending events" described above is
handled efficiently.  Here's what happens in that case:

 1. The user types "continue".

 2. Upon destruction, the scoped_disable_commit_resumed in the
    `proceed` function does not enable commit-resumed, as it sees some
    threads have pending statuses.

 3. fetch_inferior_event is called to handle another event, the
    breakpoint hit evaluates to false, and that thread is resumed.
    Because there are still more threads with pending statuses, the
    destructor of scoped_disable_commit_resumed in
    fetch_inferior_event still doesn't enable commit-resumed.

 4. Rinse and repeat step 3, until the last pending status is handled
    by fetch_inferior_event.  In that case,
    scoped_disable_commit_resumed's destructor sees there are no more
    threads with pending statues, so it asks the target to commit
    resumed threads.

This allows us to avoid all unnecessary back and forths, there is a
single commit_resumed call once all pending statuses are processed.

This change required remote_target::remote_stop_ns to learn how to
handle stopping threads that were resumed but pending vCont.  The
simplest example where that happens is when using the remote target in
all-stop, but with "maint set target-non-stop on", to force it to
operate in non-stop mode under the hood.  If two threads hit a
breakpoint at the same time, GDB will receive two stop replies.  It
will present the stop for one thread and save the other one in
thread_info::suspend::waitstatus.

Before this patch, when doing "continue", GDB first resumes the thread
without a pending status:

    Sending packet: $vCont;c:p172651.172676#f3

It then consumes the pending status in the next fetch_inferior_event
call:

    [infrun] do_target_wait_1: Using pending wait status status->kind = stopped, signal = GDB_SIGNAL_TRAP for Thread 1517137.1517137.
    [infrun] target_wait (-1.0.0, status) =
    [infrun]   1517137.1517137.0 [Thread 1517137.1517137],
    [infrun]   status->kind = stopped, signal = GDB_SIGNAL_TRAP

It then realizes it needs to stop all threads to present the stop, so
stops the thread it just resumed:

    [infrun] stop_all_threads:   Thread 1517137.1517137 not executing
    [infrun] stop_all_threads:   Thread 1517137.1517174 executing, need stop
    remote_stop called
    Sending packet: $vCont;t:p172651.172676#04

This is an unnecessary resume/stop.  With this patch, we don't commit
resumed threads after proceeding, because of the pending status:

    [infrun] maybe_commit_resumed_all_process_targets: not requesting commit-resumed for target extended-remote, a thread has a pending waitstatus

When GDB handles the pending status and stop_all_threads runs, we stop a
resumed but pending vCont thread:

    remote_stop_ns: Enqueueing phony stop reply for thread pending vCont-resume (1520940, 1520976, 0)

That thread was never actually resumed on the remote stub / gdbserver,
so we shouldn't send a packet to the remote side asking to stop the
thread.

Note that there are paths that resume the target and then do a
synchronous blocking wait, in sort of nested event loop, via
wait_sync_command_done.  For example, inferior function calls, or any
run control command issued from a breakpoint command list.  We handle
that making wait_sync_command_one a "sync" point -- force forward
progress, or IOW, force-enable commit-resumed state.

gdb/ChangeLog:

	* infcmd.c (run_command_1, attach_command, detach_command)
	(interrupt_target_1): Use scoped_disable_commit_resumed.
	* infrun.c (do_target_resume): Remove
	target_commit_resume call.
	(commit_resume_all_targets): Remove.
	(maybe_set_commit_resumed_all_targets): New.
	(maybe_call_commit_resumed_all_targets): New.
	(enable_commit_resumed): New.
	(scoped_disable_commit_resumed::scoped_disable_commit_resumed)
	(scoped_disable_commit_resumed::~scoped_disable_commit_resumed)
	(scoped_disable_commit_resumed::reset)
	(scoped_disable_commit_resumed::reset_and_commit)
	(scoped_enable_commit_resumed::scoped_enable_commit_resumed)
	(scoped_enable_commit_resumed::~scoped_enable_commit_resumed):
	New.
	(proceed): Use scoped_disable_commit_resumed and
	maybe_call_commit_resumed_all_targets.
	(fetch_inferior_event): Use scoped_disable_commit_resumed.
	* infrun.h (struct scoped_disable_commit_resumed): New.
	(maybe_call_commit_resumed_all_process_targets): New.
	(struct scoped_enable_commit_resumed): New.
	* mi/mi-main.c (exec_continue): Use scoped_disable_commit_resumed.
	* process-stratum-target.h (class process_stratum_target):
	<commit_resumed_state>: New.
	* record-full.c (record_full_wait_1): Change commit_resumed_state
	around calling commit_resumed.
	* remote.c (class remote_target) <commit_resume>: Rename to...
	<commit_resumed>: ... this.
	(struct stop_reply): Move up.
	(remote_target::commit_resume): Rename to...
	(remote_target::commit_resumed): ... this.  Check if there is any
	thread pending vCont resume.
	(remote_target::remote_stop_ns): Generate stop replies for resumed
	but pending vCont threads.
	(remote_target::wait_ns): Add gdb_assert.
	* target-delegates.c: Regenerate.
	* target.c (target_wait, target_resume): Assert that the current
	process_stratum target isn't in commit-resumed state.
	(defer_target_commit_resume): Remove.
	(target_commit_resume): Remove.
	(make_scoped_defer_target_commit_resume): Remove.
	(target_stop): Assert that the current process_stratum target
	isn't in commit-resumed state.
	* target.h (struct target_ops) <commit_resume>: Rename to ...
	 <commit_resumed>: ... this.
	(target_commit_resume): Remove.
	(target_commit_resumed): New.
	(make_scoped_defer_target_commit_resume): Remove.
	* top.c (wait_sync_command_done): Use
	scoped_enable_commit_resumed.

[1] https://github.com/ROCm-Developer-Tools/ROCgdb/
[2] https://github.com/ROCm-Developer-Tools/ROCdbgapi

Change-Id: I836135531a29214b21695736deb0a81acf8cf566
---
 gdb/infcmd.c                 |  16 +++
 gdb/infrun.c                 | 224 ++++++++++++++++++++++++++++++++---
 gdb/infrun.h                 |  98 +++++++++++++++
 gdb/mi/mi-main.c             |   4 +
 gdb/process-stratum-target.h |  32 +++++
 gdb/record-full.c            |  10 +-
 gdb/remote.c                 | 149 ++++++++++++++++++-----
 gdb/target-delegates.c       |  18 +--
 gdb/target.c                 |  30 ++---
 gdb/target.h                 |  33 +++---
 gdb/top.c                    |   7 ++
 11 files changed, 518 insertions(+), 103 deletions(-)

diff --git a/gdb/infcmd.c b/gdb/infcmd.c
index 80e6ad3048f..1d33b5712a4 100644
--- a/gdb/infcmd.c
+++ b/gdb/infcmd.c
@@ -419,6 +419,8 @@ run_command_1 (const char *args, int from_tty, enum run_how run_how)
 
   dont_repeat ();
 
+  scoped_disable_commit_resumed disable_commit_resumed ("running");
+
   kill_if_already_running (from_tty);
 
   init_wait_for_inferior ();
@@ -538,6 +540,8 @@ run_command_1 (const char *args, int from_tty, enum run_how run_how)
   /* Since there was no error, there's no need to finish the thread
      states here.  */
   finish_state.release ();
+
+  disable_commit_resumed.reset_and_commit ();
 }
 
 static void
@@ -2565,6 +2569,8 @@ attach_command (const char *args, int from_tty)
 
   dont_repeat ();		/* Not for the faint of heart */
 
+  scoped_disable_commit_resumed disable_commit_resumed ("attaching");
+
   if (gdbarch_has_global_solist (target_gdbarch ()))
     /* Don't complain if all processes share the same symbol
        space.  */
@@ -2673,6 +2679,8 @@ attach_command (const char *args, int from_tty)
     }
   else
     attach_post_wait (args, from_tty, mode);
+
+  disable_commit_resumed.reset_and_commit ();
 }
 
 /* We had just found out that the target was already attached to an
@@ -2746,6 +2754,8 @@ detach_command (const char *args, int from_tty)
   if (inferior_ptid == null_ptid)
     error (_("The program is not being run."));
 
+  scoped_disable_commit_resumed disable_commit_resumed ("detaching");
+
   query_if_trace_running (from_tty);
 
   disconnect_tracing ();
@@ -2779,6 +2789,8 @@ detach_command (const char *args, int from_tty)
 
   if (!was_non_stop_p)
     restart_after_all_stop_detach (as_process_stratum_target (target_ref.get ()));
+
+  disable_commit_resumed.reset_and_commit ();
 }
 
 /* Disconnect from the current target without resuming it (leaving it
@@ -2827,6 +2839,8 @@ stop_current_target_threads_ns (ptid_t ptid)
 void
 interrupt_target_1 (bool all_threads)
 {
+  scoped_disable_commit_resumed disable_commit_resumed ("interrupting");
+
   if (non_stop)
     {
       if (all_threads)
@@ -2844,6 +2858,8 @@ interrupt_target_1 (bool all_threads)
     }
   else
     target_interrupt ();
+
+  disable_commit_resumed.reset_and_commit ();
 }
 
 /* interrupt [-a]
diff --git a/gdb/infrun.c b/gdb/infrun.c
index 3b65a6de9fe..0af2f940105 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -2172,8 +2172,6 @@ do_target_resume (ptid_t resume_ptid, bool step, enum gdb_signal sig)
 
   target_resume (resume_ptid, step, sig);
 
-  target_commit_resume ();
-
   if (target_can_async_p ())
     target_async (1);
 }
@@ -2760,28 +2758,208 @@ schedlock_applies (struct thread_info *tp)
 					    execution_direction)));
 }
 
-/* Calls target_commit_resume on all targets.  */
+/* Set process_stratum_target::COMMIT_RESUMED_STATE in all target
+   stacks that have threads executing and don't have threads with
+   pending events.  */
 
 static void
-commit_resume_all_targets ()
+maybe_set_commit_resumed_all_targets ()
+{
+  for (inferior *inf : all_non_exited_inferiors ())
+    {
+      process_stratum_target *proc_target = inf->process_target ();
+
+      if (proc_target->commit_resumed_state)
+	{
+	  /* We already set this in a previous iteration, via another
+	     inferior sharing the process_stratum target.  */
+	  continue;
+	}
+
+      /* If the target has no resumed threads, it would be useless to
+	 ask it to commit the resumed threads.  */
+      if (!proc_target->threads_executing)
+	{
+	  infrun_debug_printf ("not re-enabling forward progress for target "
+			       "%s, no resumed threads",
+			       proc_target->shortname ());
+	  continue;
+	}
+
+      /* As an optimization, if a thread from this target has some
+	 status to report, handle it before requiring the target to
+	 commit its resumed threads: handling the status might lead to
+	 resuming more threads.  */
+      bool has_thread_with_pending_status = false;
+      for (thread_info *thread : all_non_exited_threads (proc_target))
+	if (thread->resumed && thread->suspend.waitstatus_pending_p)
+	  {
+	    has_thread_with_pending_status = true;
+	    break;
+	  }
+
+      if (has_thread_with_pending_status)
+	{
+	  infrun_debug_printf ("not requesting commit-resumed for target %s, a"
+			       " thread has a pending waitstatus",
+			       proc_target->shortname ());
+	  continue;
+	}
+
+      infrun_debug_printf ("enabling commit-resumed for target %s",
+			   proc_target->shortname ());
+
+      proc_target->commit_resumed_state = true;
+    }
+}
+
+/* See infrun.h.  */
+
+void
+maybe_call_commit_resumed_all_targets ()
 {
   scoped_restore_current_thread restore_thread;
 
-  /* Map between process_target and a representative inferior.  This
-     is to avoid committing a resume in the same target more than
-     once.  Resumptions must be idempotent, so this is an
-     optimization.  */
-  std::unordered_map<process_stratum_target *, inferior *> conn_inf;
+  for (inferior *inf : all_non_exited_inferiors ())
+    {
+      process_stratum_target *proc_target = inf->process_target ();
+
+      if (!proc_target->commit_resumed_state)
+	continue;
+
+      switch_to_inferior_no_thread (inf);
+
+      infrun_debug_printf ("calling commit_resumed for target %s",
+			   proc_target->shortname());
+
+      target_commit_resumed ();
+    }
+}
+
+/* To track nesting of scoped_disable_commit_resumed objects, ensuring
+   that only the outermost one attempts to re-enable
+   commit-resumed.  */
+static bool enable_commit_resumed = true;
+
+/* See infrun.h.  */
+
+scoped_disable_commit_resumed::scoped_disable_commit_resumed
+  (const char *reason)
+  : m_reason (reason),
+    m_prev_enable_commit_resumed (enable_commit_resumed)
+{
+  infrun_debug_printf ("reason=%s", m_reason);
+
+  enable_commit_resumed = false;
 
   for (inferior *inf : all_non_exited_inferiors ())
-    if (inf->has_execution ())
-      conn_inf[inf->process_target ()] = inf;
+    {
+      process_stratum_target *proc_target = inf->process_target ();
+
+      if (m_prev_enable_commit_resumed)
+	{
+	  /* This is the outermost instance: force all
+	     COMMIT_RESUMED_STATE to false.  */
+	  proc_target->commit_resumed_state = false;
+	}
+      else
+	{
+	  /* This is not the outermost instance, we expect
+	     COMMIT_RESUMED_STATE to have been cleared by the
+	     outermost instance.  */
+	  gdb_assert (!proc_target->commit_resumed_state);
+	}
+    }
+}
+
+/* See infrun.h.  */
 
-  for (const auto &ci : conn_inf)
+void
+scoped_disable_commit_resumed::reset ()
+{
+  if (m_reset)
+    return;
+  m_reset = true;
+
+  infrun_debug_printf ("reason=%s", m_reason);
+
+  gdb_assert (!enable_commit_resumed);
+
+  enable_commit_resumed = m_prev_enable_commit_resumed;
+
+  if (m_prev_enable_commit_resumed)
     {
-      inferior *inf = ci.second;
-      switch_to_inferior_no_thread (inf);
-      target_commit_resume ();
+      /* This is the outermost instance, re-enable
+         COMMIT_RESUMED_STATE on the targets where it's possible.  */
+      maybe_set_commit_resumed_all_targets ();
+    }
+  else
+    {
+      /* This is not the outermost instance, we expect
+	 COMMIT_RESUMED_STATE to still be false.  */
+      for (inferior *inf : all_non_exited_inferiors ())
+	{
+	  process_stratum_target *proc_target = inf->process_target ();
+	  gdb_assert (!proc_target->commit_resumed_state);
+	}
+    }
+}
+
+/* See infrun.h.  */
+
+scoped_disable_commit_resumed::~scoped_disable_commit_resumed ()
+{
+  reset ();
+}
+
+/* See infrun.h.  */
+
+void
+scoped_disable_commit_resumed::reset_and_commit ()
+{
+  reset ();
+  maybe_call_commit_resumed_all_targets ();
+}
+
+/* See infrun.h.  */
+
+scoped_enable_commit_resumed::scoped_enable_commit_resumed
+  (const char *reason)
+  : m_reason (reason),
+    m_prev_enable_commit_resumed (enable_commit_resumed)
+{
+  infrun_debug_printf ("reason=%s", m_reason);
+
+  if (!enable_commit_resumed)
+    {
+      enable_commit_resumed = true;
+
+      /* Re-enable COMMIT_RESUMED_STATE on the targets where it's
+	 possible.  */
+      maybe_set_commit_resumed_all_targets ();
+
+      maybe_call_commit_resumed_all_targets ();
+    }
+}
+
+/* See infrun.h.  */
+
+scoped_enable_commit_resumed::~scoped_enable_commit_resumed ()
+{
+  infrun_debug_printf ("reason=%s", m_reason);
+
+  gdb_assert (enable_commit_resumed);
+
+  enable_commit_resumed = m_prev_enable_commit_resumed;
+
+  if (!enable_commit_resumed)
+    {
+      /* Force all COMMIT_RESUMED_STATE back to false.  */
+      for (inferior *inf : all_non_exited_inferiors ())
+	{
+	  process_stratum_target *proc_target = inf->process_target ();
+	  proc_target->commit_resumed_state = false;
+	}
     }
 }
 
@@ -3005,7 +3183,7 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
   cur_thr->prev_pc = regcache_read_pc_protected (regcache);
 
   {
-    scoped_restore save_defer_tc = make_scoped_defer_target_commit_resume ();
+    scoped_disable_commit_resumed disable_commit_resumed ("proceeding");
 
     started = start_step_over ();
 
@@ -3075,7 +3253,7 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
       }
   }
 
-  commit_resume_all_targets ();
+  maybe_call_commit_resumed_all_targets ();
 
   finish_state.release ();
 
@@ -3877,8 +4055,16 @@ fetch_inferior_event ()
       = make_scoped_restore (&execution_direction,
 			     target_execution_direction ());
 
+    /* Allow targets to pause their resumed threads while we handle
+       the event.  */
+    scoped_disable_commit_resumed disable_commit_resumed ("handling event");
+
     if (!do_target_wait (minus_one_ptid, ecs, TARGET_WNOHANG))
-      return;
+      {
+	infrun_debug_printf ("do_target_wait returned no event");
+	disable_commit_resumed.reset_and_commit ();
+	return;
+      }
 
     gdb_assert (ecs->ws.kind != TARGET_WAITKIND_IGNORE);
 
@@ -3969,6 +4155,8 @@ fetch_inferior_event ()
     /* No error, don't finish the thread states yet.  */
     finish_state.release ();
 
+    disable_commit_resumed.reset_and_commit ();
+
     /* This scope is used to ensure that readline callbacks are
        reinstalled here.  */
   }
diff --git a/gdb/infrun.h b/gdb/infrun.h
index e643c84dd80..220ccc79e8b 100644
--- a/gdb/infrun.h
+++ b/gdb/infrun.h
@@ -273,4 +273,102 @@ extern void all_uis_on_sync_execution_starting (void);
    detach.  */
 extern void restart_after_all_stop_detach (process_stratum_target *proc_target);
 
+/* RAII object to temporarily disable the requirement for target
+   stacks to commit their resumed threads.
+
+   On construction, set process_stratum_target::commit_resumed_state
+   to false for all process_stratum targets in all target
+   stacks.
+
+   On destruction (or if reset_and_commit() is called), set
+   process_stratum_target::commit_resumed_state to true for all
+   process_stratum targets in all target stacks, except those that:
+
+     - have no resumed threads
+     - have a resumed thread with a pending status
+
+   target_commit_resumed is not called in the destructor, because its
+   implementations could throw, and we don't to swallow that error in
+   a destructor.  Instead, the caller should call the
+   reset_and_commit_resumed() method so that an eventual exception can
+   propagate.  "reset" in the method name refers to the fact that this
+   method has the same effect as the destructor, in addition to
+   committing resumes.
+
+   The creation of nested scoped_disable_commit_resumed objects is
+   tracked, such that only the outermost instance actually does
+   something, for cases like this:
+
+     void
+     inner_func ()
+     {
+       scoped_disable_commit_resumed disable;
+
+       // do stuff
+
+       disable.reset_and_commit ();
+     }
+
+     void
+     outer_func ()
+     {
+       scoped_disable_commit_resumed disable;
+
+       for (... each thread ...)
+	 inner_func ();
+
+       disable.reset_and_commit ();
+     }
+
+   In this case, we don't want the `disable` destructor in
+   `inner_func` to require targets to commit resumed threads, so that
+   the `reset_and_commit()` call in `inner_func` doesn't actually
+   resume threads.  */
+
+struct scoped_disable_commit_resumed
+{
+  explicit scoped_disable_commit_resumed (const char *reason);
+  ~scoped_disable_commit_resumed ();
+
+  DISABLE_COPY_AND_ASSIGN (scoped_disable_commit_resumed);
+
+  /* Undoes the disabling done by the ctor, and calls
+     maybe_call_commit_resumed_all_targets().  */
+  void reset_and_commit ();
+
+private:
+  /* Undoes the disabling done by the ctor.  */
+  void reset ();
+
+  /* Whether this object has been reset.  */
+  bool m_reset = false;
+
+  const char *m_reason;
+  bool m_prev_enable_commit_resumed;
+};
+
+/* Call target_commit_resumed method on all target stacks whose
+   process_stratum target layer has COMMIT_RESUME_STATE set.  */
+
+extern void maybe_call_commit_resumed_all_targets ();
+
+/* RAII object to temporarily enable the requirement for target stacks
+   to commit their resumed threads.  This is the inverse of
+   scoped_disable_commit_resumed.  The constructor calls the
+   maybe_call_commit_resumed_all_targets function itself, since it's
+   OK to throw from a constructor.  */
+
+struct scoped_enable_commit_resumed
+{
+  explicit scoped_enable_commit_resumed (const char *reason);
+  ~scoped_enable_commit_resumed ();
+
+  DISABLE_COPY_AND_ASSIGN (scoped_enable_commit_resumed);
+
+private:
+  const char *m_reason;
+  bool m_prev_enable_commit_resumed;
+};
+
+
 #endif /* INFRUN_H */
diff --git a/gdb/mi/mi-main.c b/gdb/mi/mi-main.c
index 9a14d78e1e2..a10ac839967 100644
--- a/gdb/mi/mi-main.c
+++ b/gdb/mi/mi-main.c
@@ -266,6 +266,8 @@ exec_continue (char **argv, int argc)
 {
   prepare_execution_command (current_top_target (), mi_async_p ());
 
+  scoped_disable_commit_resumed disable_commit_resumed ("mi continue");
+
   if (non_stop)
     {
       /* In non-stop mode, 'resume' always resumes a single thread.
@@ -311,6 +313,8 @@ exec_continue (char **argv, int argc)
 	  continue_1 (1);
 	}
     }
+
+  disable_commit_resumed.reset_and_commit ();
 }
 
 static void
diff --git a/gdb/process-stratum-target.h b/gdb/process-stratum-target.h
index b513c26ffc2..6fddbba90c7 100644
--- a/gdb/process-stratum-target.h
+++ b/gdb/process-stratum-target.h
@@ -72,6 +72,38 @@ class process_stratum_target : public target_ops
 
   /* The connection number.  Visible in "info connections".  */
   int connection_number = 0;
+
+  /* Whether resumed threads must be committed to the target.
+
+     When true, resumed threads must be committed to the execution
+     target.
+
+     When false, the target may leave resumed threads stopped when
+     it's convenient or efficient to do so.  When the core requires
+     resumed threads to be committed again, this is set back to true
+     and calls the `commit_resumed` method to allow the target to do
+     so.
+
+     To simplify the implementation of targets, the following methods
+     are guaranteed to be called with COMMIT_RESUMED_STATE set to
+     false:
+
+       - resume
+       - stop
+       - wait
+
+     Knowing this, the target doesn't need to implement different
+     behaviors depending on the COMMIT_RESUMED_STATE, and can simply
+     assume that it is false.
+
+     Targets can take advantage of this to batch resumption requests,
+     for example.  In that case, the target doesn't actually resume in
+     its `resume` implementation.  Instead, it takes note of the
+     resumption intent in `resume` and defers the actual resumption to
+     `commit_resumed`.  For example, the remote target uses this to
+     coalesce multiple resumption requests in a single vCont
+     packet.  */
+  bool commit_resumed_state = false;
 };
 
 /* Downcast TARGET to process_stratum_target.  */
diff --git a/gdb/record-full.c b/gdb/record-full.c
index 2373741470c..750dcbe66b6 100644
--- a/gdb/record-full.c
+++ b/gdb/record-full.c
@@ -1242,11 +1242,11 @@ record_full_wait_1 (struct target_ops *ops,
 			   break;
   			}
 
+		      process_stratum_target *proc_target
+			= current_inferior ()->process_target ();
+
 		      if (gdbarch_software_single_step_p (gdbarch))
 			{
-			  process_stratum_target *proc_target
-			    = current_inferior ()->process_target ();
-
 			  /* Try to insert the software single step breakpoint.
 			     If insert success, set step to 0.  */
 			  set_executing (proc_target, inferior_ptid, false);
@@ -1263,7 +1263,9 @@ record_full_wait_1 (struct target_ops *ops,
 					    "issuing one more step in the "
 					    "target beneath\n");
 		      ops->beneath ()->resume (ptid, step, GDB_SIGNAL_0);
-		      ops->beneath ()->commit_resume ();
+		      proc_target->commit_resumed_state = true;
+		      proc_target->commit_resumed ();
+		      proc_target->commit_resumed_state = false;
 		      continue;
 		    }
 		}
diff --git a/gdb/remote.c b/gdb/remote.c
index a752bd9a4cc..5b2bc240bcb 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -425,7 +425,7 @@ class remote_target : public process_stratum_target
   void detach (inferior *, int) override;
   void disconnect (const char *, int) override;
 
-  void commit_resume () override;
+  void commit_resumed () override;
   void resume (ptid_t, int, enum gdb_signal) override;
   ptid_t wait (ptid_t, struct target_waitstatus *, target_wait_flags) override;
 
@@ -6473,6 +6473,36 @@ get_remote_inferior (inferior *inf)
   return static_cast<remote_inferior *> (inf->priv.get ());
 }
 
+struct stop_reply : public notif_event
+{
+  ~stop_reply ();
+
+  /* The identifier of the thread about this event  */
+  ptid_t ptid;
+
+  /* The remote state this event is associated with.  When the remote
+     connection, represented by a remote_state object, is closed,
+     all the associated stop_reply events should be released.  */
+  struct remote_state *rs;
+
+  struct target_waitstatus ws;
+
+  /* The architecture associated with the expedited registers.  */
+  gdbarch *arch;
+
+  /* Expedited registers.  This makes remote debugging a bit more
+     efficient for those targets that provide critical registers as
+     part of their normal status mechanism (as another roundtrip to
+     fetch them is avoided).  */
+  std::vector<cached_reg_t> regcache;
+
+  enum target_stop_reason stop_reason;
+
+  CORE_ADDR watch_data_address;
+
+  int core;
+};
+
 /* Class used to track the construction of a vCont packet in the
    outgoing packet buffer.  This is used to send multiple vCont
    packets if we have more actions than would fit a single packet.  */
@@ -6577,7 +6607,7 @@ vcont_builder::push_action (ptid_t ptid, bool step, gdb_signal siggnal)
 /* to_commit_resume implementation.  */
 
 void
-remote_target::commit_resume ()
+remote_target::commit_resumed ()
 {
   int any_process_wildcard;
   int may_global_wildcard_vcont;
@@ -6652,6 +6682,8 @@ remote_target::commit_resume ()
      disable process and global wildcard resumes appropriately.  */
   check_pending_events_prevent_wildcard_vcont (&may_global_wildcard_vcont);
 
+  bool any_pending_vcont_resume = false;
+
   for (thread_info *tp : all_non_exited_threads (this))
     {
       remote_thread_info *priv = get_remote_thread_info (tp);
@@ -6668,6 +6700,9 @@ remote_target::commit_resume ()
 	  continue;
 	}
 
+      if (priv->get_resume_state () == resume_state::RESUMED_PENDING_VCONT)
+	any_pending_vcont_resume = true;
+
       /* If a thread is the parent of an unfollowed fork, then we
 	 can't do a global wildcard, as that would resume the fork
 	 child.  */
@@ -6675,6 +6710,11 @@ remote_target::commit_resume ()
 	may_global_wildcard_vcont = 0;
     }
 
+  /* We didn't have any resumed thread pending a vCont resume, so nothing to
+     do.  */
+  if (!any_pending_vcont_resume)
+    return;
+
   /* Now let's build the vCont packet(s).  Actions must be appended
      from narrower to wider scopes (thread -> process -> global).  If
      we end up with too many actions for a single packet vcont_builder
@@ -6695,6 +6735,13 @@ remote_target::commit_resume ()
 
       gdb_assert (!thread_is_in_step_over_chain (tp));
 
+      /* We should never be commit-resuming a thread that has a stop reply.
+         Otherwise, we would end up reporting a stop event for a thread while
+	 it is running on the remote target.  */
+      remote_state *rs = get_remote_state ();
+      for (const auto &stop_reply : rs->stop_reply_queue)
+	gdb_assert (stop_reply->ptid != tp->ptid);
+
       const resumed_pending_vcont_info &info
 	= remote_thr->resumed_pending_vcont_info ();
 
@@ -6760,6 +6807,74 @@ remote_target::remote_stop_ns (ptid_t ptid)
   char *p = rs->buf.data ();
   char *endp = p + get_remote_packet_size ();
 
+  /* If any thread that needs to stop was resumed but pending a vCont
+     resume, generate a phony stop_reply.  However, first check
+     whether the thread wasn't resumed with a signal.  Generating a
+     phony stop in that case would result in losing the signal.  */
+  bool needs_commit = false;
+  for (thread_info *tp : all_non_exited_threads (this, ptid))
+    {
+      remote_thread_info *remote_thr = get_remote_thread_info (tp);
+
+      if (remote_thr->get_resume_state ()
+	  == resume_state::RESUMED_PENDING_VCONT)
+	{
+	  const resumed_pending_vcont_info &info
+	    = remote_thr->resumed_pending_vcont_info ();
+	  if (info.sig != GDB_SIGNAL_0)
+	    {
+	      /* This signal must be forwarded to the inferior.  We
+		 could commit-resume just this thread, but its simpler
+		 to just commit-resume everything.  */
+	      needs_commit = true;
+	      break;
+	    }
+	}
+    }
+
+  if (needs_commit)
+    commit_resumed ();
+  else
+    for (thread_info *tp : all_non_exited_threads (this, ptid))
+      {
+	remote_thread_info *remote_thr = get_remote_thread_info (tp);
+
+	if (remote_thr->get_resume_state ()
+	    == resume_state::RESUMED_PENDING_VCONT)
+	  {
+	    remote_debug_printf ("Enqueueing phony stop reply for thread pending "
+				 "vCont-resume (%d, %ld, %ld)", tp->ptid.pid(),
+				 tp->ptid.lwp (), tp->ptid.tid ());
+
+	    /* Check that the thread wasn't resumed with a signal.
+	       Generating a phony stop would result in losing the
+	       signal.  */
+	    const resumed_pending_vcont_info &info
+	      = remote_thr->resumed_pending_vcont_info ();
+	    gdb_assert (info.sig == GDB_SIGNAL_0);
+
+	    stop_reply *sr = new stop_reply ();
+	    sr->ptid = tp->ptid;
+	    sr->rs = rs;
+	    sr->ws.kind = TARGET_WAITKIND_STOPPED;
+	    sr->ws.value.sig = GDB_SIGNAL_0;
+	    sr->arch = tp->inf->gdbarch;
+	    sr->stop_reason = TARGET_STOPPED_BY_NO_REASON;
+	    sr->watch_data_address = 0;
+	    sr->core = 0;
+	    this->push_stop_reply (sr);
+
+	    /* Pretend that this thread was actually resumed on the
+	       remote target, then stopped.  If we leave it in the
+	       RESUMED_PENDING_VCONT state and the commit_resumed
+	       method is called while the stop reply is still in the
+	       queue, we'll end up reporting a stop event to the core
+	       for that thread while it is running on the remote
+	       target... that would be bad.  */
+	    remote_thr->set_resumed ();
+	  }
+      }
+
   /* FIXME: This supports_vCont_probed check is a workaround until
      packet_support is per-connection.  */
   if (packet_support (PACKET_vCont) == PACKET_SUPPORT_UNKNOWN
@@ -6964,36 +7079,6 @@ remote_console_output (const char *msg)
   gdb_stdtarg->flush ();
 }
 
-struct stop_reply : public notif_event
-{
-  ~stop_reply ();
-
-  /* The identifier of the thread about this event  */
-  ptid_t ptid;
-
-  /* The remote state this event is associated with.  When the remote
-     connection, represented by a remote_state object, is closed,
-     all the associated stop_reply events should be released.  */
-  struct remote_state *rs;
-
-  struct target_waitstatus ws;
-
-  /* The architecture associated with the expedited registers.  */
-  gdbarch *arch;
-
-  /* Expedited registers.  This makes remote debugging a bit more
-     efficient for those targets that provide critical registers as
-     part of their normal status mechanism (as another roundtrip to
-     fetch them is avoided).  */
-  std::vector<cached_reg_t> regcache;
-
-  enum target_stop_reason stop_reason;
-
-  CORE_ADDR watch_data_address;
-
-  int core;
-};
-
 /* Return the length of the stop reply queue.  */
 
 int
diff --git a/gdb/target-delegates.c b/gdb/target-delegates.c
index 1c7999724c7..c0b3129afad 100644
--- a/gdb/target-delegates.c
+++ b/gdb/target-delegates.c
@@ -14,7 +14,7 @@ struct dummy_target : public target_ops
   void detach (inferior *arg0, int arg1) override;
   void disconnect (const char *arg0, int arg1) override;
   void resume (ptid_t arg0, int arg1, enum gdb_signal arg2) override;
-  void commit_resume () override;
+  void commit_resumed () override;
   ptid_t wait (ptid_t arg0, struct target_waitstatus *arg1, target_wait_flags arg2) override;
   void fetch_registers (struct regcache *arg0, int arg1) override;
   void store_registers (struct regcache *arg0, int arg1) override;
@@ -185,7 +185,7 @@ struct debug_target : public target_ops
   void detach (inferior *arg0, int arg1) override;
   void disconnect (const char *arg0, int arg1) override;
   void resume (ptid_t arg0, int arg1, enum gdb_signal arg2) override;
-  void commit_resume () override;
+  void commit_resumed () override;
   ptid_t wait (ptid_t arg0, struct target_waitstatus *arg1, target_wait_flags arg2) override;
   void fetch_registers (struct regcache *arg0, int arg1) override;
   void store_registers (struct regcache *arg0, int arg1) override;
@@ -441,22 +441,22 @@ debug_target::resume (ptid_t arg0, int arg1, enum gdb_signal arg2)
 }
 
 void
-target_ops::commit_resume ()
+target_ops::commit_resumed ()
 {
-  this->beneath ()->commit_resume ();
+  this->beneath ()->commit_resumed ();
 }
 
 void
-dummy_target::commit_resume ()
+dummy_target::commit_resumed ()
 {
 }
 
 void
-debug_target::commit_resume ()
+debug_target::commit_resumed ()
 {
-  fprintf_unfiltered (gdb_stdlog, "-> %s->commit_resume (...)\n", this->beneath ()->shortname ());
-  this->beneath ()->commit_resume ();
-  fprintf_unfiltered (gdb_stdlog, "<- %s->commit_resume (", this->beneath ()->shortname ());
+  fprintf_unfiltered (gdb_stdlog, "-> %s->commit_resumed (...)\n", this->beneath ()->shortname ());
+  this->beneath ()->commit_resumed ();
+  fprintf_unfiltered (gdb_stdlog, "<- %s->commit_resumed (", this->beneath ()->shortname ());
   fputs_unfiltered (")\n", gdb_stdlog);
 }
 
diff --git a/gdb/target.c b/gdb/target.c
index c3861209d5c..f056e010d09 100644
--- a/gdb/target.c
+++ b/gdb/target.c
@@ -1993,6 +1993,9 @@ target_wait (ptid_t ptid, struct target_waitstatus *status,
 	     target_wait_flags options)
 {
   target_ops *target = current_top_target ();
+  process_stratum_target *proc_target = current_inferior ()->process_target ();
+
+  gdb_assert (!proc_target->commit_resumed_state);
 
   if (!target->can_async_p ())
     gdb_assert ((options & TARGET_WNOHANG) == 0);
@@ -2046,6 +2049,7 @@ void
 target_resume (ptid_t ptid, int step, enum gdb_signal signal)
 {
   process_stratum_target *curr_target = current_inferior ()->process_target ();
+  gdb_assert (!curr_target->commit_resumed_state);
 
   target_dcache_invalidate ();
 
@@ -2059,28 +2063,6 @@ target_resume (ptid_t ptid, int step, enum gdb_signal signal)
   clear_inline_frame_state (curr_target, ptid);
 }
 
-/* If true, target_commit_resume is a nop.  */
-static int defer_target_commit_resume;
-
-/* See target.h.  */
-
-void
-target_commit_resume (void)
-{
-  if (defer_target_commit_resume)
-    return;
-
-  current_top_target ()->commit_resume ();
-}
-
-/* See target.h.  */
-
-scoped_restore_tmpl<int>
-make_scoped_defer_target_commit_resume ()
-{
-  return make_scoped_restore (&defer_target_commit_resume, 1);
-}
-
 void
 target_pass_signals (gdb::array_view<const unsigned char> pass_signals)
 {
@@ -3141,6 +3123,10 @@ target_update_thread_list (void)
 void
 target_stop (ptid_t ptid)
 {
+  process_stratum_target *proc_target = current_inferior ()->process_target ();
+
+  gdb_assert (!proc_target->commit_resumed_state);
+
   if (!may_stop)
     {
       warning (_("May not interrupt or stop the target, ignoring attempt"));
diff --git a/gdb/target.h b/gdb/target.h
index ee93c5cf395..450a6c7d46b 100644
--- a/gdb/target.h
+++ b/gdb/target.h
@@ -485,8 +485,13 @@ struct target_ops
 			 int TARGET_DEBUG_PRINTER (target_debug_print_step),
 			 enum gdb_signal)
       TARGET_DEFAULT_NORETURN (noprocess ());
-    virtual void commit_resume ()
+
+    /* Ensure that all resumed threads are committed to the target.
+
+       See the description of COMMIT_RESUMED_STATE for more details.  */
+    virtual void commit_resumed ()
       TARGET_DEFAULT_IGNORE ();
+
     /* See target_wait's description.  Note that implementations of
        this method must not assume that inferior_ptid on entry is
        pointing at the thread or inferior that ends up reporting an
@@ -1438,23 +1443,15 @@ extern void target_disconnect (const char *, int);
    target_commit_resume below.  */
 extern void target_resume (ptid_t ptid, int step, enum gdb_signal signal);
 
-/* Commit a series of resumption requests previously prepared with
-   target_resume calls.
-
-   GDB always calls target_commit_resume after calling target_resume
-   one or more times.  A target may thus use this method in
-   coordination with the target_resume method to batch target-side
-   resumption requests.  In that case, the target doesn't actually
-   resume in its target_resume implementation.  Instead, it prepares
-   the resumption in target_resume, and defers the actual resumption
-   to target_commit_resume.  E.g., the remote target uses this to
-   coalesce multiple resumption requests in a single vCont packet.  */
-extern void target_commit_resume ();
-
-/* Setup to defer target_commit_resume calls, and reactivate
-   target_commit_resume on destruction, if it was previously
-   active.  */
-extern scoped_restore_tmpl<int> make_scoped_defer_target_commit_resume ();
+/* Ensure that all resumed threads are committed to the target.
+
+   See the description of COMMIT_RESUMED_STATE for more details.  */
+
+static inline void
+target_commit_resumed ()
+{
+  current_top_target ()->commit_resumed ();
+}
 
 /* For target_read_memory see target/target.h.  */
 
diff --git a/gdb/top.c b/gdb/top.c
index 3be95079654..31b751fa262 100644
--- a/gdb/top.c
+++ b/gdb/top.c
@@ -517,6 +517,13 @@ wait_sync_command_done (void)
   scoped_restore save_ui = make_scoped_restore (&current_ui);
   struct ui *ui = current_ui;
 
+  /* We're about to wait until the target stops after having resumed
+     it so must force-commit resumptions, in case we're being called
+     in some context where a scoped_disable_commit_resumed object is
+     active.  I.e., this function is a commit-resumed sync/flush
+     point.  */
+  scoped_enable_commit_resumed enable ("sync wait");
+
   while (gdb_do_one_event () >= 0)
     if (ui->prompt_state != PROMPT_BLOCKED)
       break;
-- 
2.26.2


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v4 2/2] gdb: defer commit resume until all available events are consumed
  2021-03-21  0:02 [PATCH v4 0/2] Reduce back and forth with target when there are pending statuses Pedro Alves
  2021-03-21  0:02 ` [PATCH v4 1/2] gdb: generalize commit_resume, avoid commit-resuming when threads have " Pedro Alves
@ 2021-03-21  0:02 ` Pedro Alves
  2021-03-25 16:00   ` Simon Marchi
  1 sibling, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2021-03-21  0:02 UTC (permalink / raw)
  To: gdb-patches

From: Simon Marchi <simon.marchi@efficios.com>

New in v4 (Pedro):

 - Made target_has_events a regular delegating target method.

 - Rename target_has_events -> target_has_pending_events

 - No longer check target_has_events in do_target_wait (that caused
   regressions).

 - No longer switch linux-nat to use an async event source.  Simply
   make target_has_pending_events consult whether the pipe is
   readable.  I ended up disliking that a SIGCHLD results in two round
   trips via the event loop.  Also results in less code and a "single
   source of truth".  And also some divergence from gdbserver.  We can
   always revisit, but this way it doesn't have to be a dependency, it
   can be done afterwards.

 - Removed the sprinkled gdb_assert implementations throughout,
   letting the default of false kick in.

 - No longer regresses against gdbserver, so updated commit log to no
   longer say it does.

 - Misc cleanups, formattings, etc.

Rationale
---------

Let's say you have multiple threads hitting a conditional breakpoint
at the same time, and all of these are going to evaluate to false.
All these threads will need to be resumed.

Currently, GDB fetches one target event (one SIGTRAP representing the
breakpoint hit) and decides that the thread should be resumed.  It
calls resume and commit_resume immediately.  It then fetches the
second target event, and does the same, until it went through all
threads.

The result is therefore something like:

  - consume event for thread A
  - resume thread A
  - commit resume (affects thread A)
  - consume event for thread B
  - resume thread B
  - commit resume (affects thread B)
  - consume event for thread C
  - resume thread C
  - commit resume (affects thread C)

For targets where it's beneficial to group resumptions requests (most
likely those that implement target_ops::commit_resume), it would be
much better to have:

  - consume event for thread A
  - resume thread A
  - consume event for thread B
  - resume thread B
  - consume event for thread C
  - resume thread C
  - commit resume (affects threads A, B and C)

Implementation details
----------------------

To achieve this, this patch adds another check in
maybe_set_commit_resumed_all_targets to avoid setting the
commit-resumed flag of targets that readily have events to provide to
infrun.

To determine if a target has events readily available to report, this
patch adds an `has_pending_events` target_ops method.  The method
returns a simple bool to say whether or not it has pending events to
report.

Testing
=======

To test this, I start GDBserver with a program that spawns multiple
threads:

 $ ../gdbserver/gdbserver --once :1234 ~/src/many-threads-stepping-over-breakpoints/many-threads-stepping-over-breakpoints

I then connect with GDB and install a conditional breakpoint that always
evaluates to false (and force the evaluation to be done by GDB):

 $ ./gdb -nx --data-directory=data-directory \
     /home/simark/src/many-threads-stepping-over-breakpoints/many-threads-stepping-over-breakpoints \
     -ex "set breakpoint condition-evaluation host" \
     -ex "set pag off" \
     -ex "set confirm off" \
     -ex "maint set target-non-stop on" \
     -ex "tar rem :1234" \
     -ex "tb main" \
     -ex "b 13 if 0" \
     -ex c \
     -ex "set debug infrun" \
     -ex "set debug remote 1" \
     -ex "set debug displaced"

I then do "continue" and look at the log.

The remote target receives a bunch of stop notifications for all
threads that have hit the breakpoint.  infrun consumes and processes
one event, decides it should not cause a stop, prepares a displaced
step, after which we should see:

 [infrun] maybe_set_commit_resumed_all_process_targets: not requesting commit-resumed for target remote, target has pending events

Same for a second thread (since we have 2 displaced step buffers).
For the following threads, their displaced step is deferred since
there are no more buffers available.

After consuming the last event the remote target has to offer, we get:

 [infrun] maybe_set_commit_resumed_all_process_targets: enabling commit-resumed for target remote
 [infrun] maybe_call_commit_resumed_all_process_targets: calling commit_resumed for target remote
 [remote] Sending packet: $vCont;s:p14d16b.14d1b1;s:p14d16b.14d1b2#55
 [remote] Packet received: OK

Without the patch, there would have been one vCont;s just after each
prepared displaced step.

gdb/ChangeLog:

	* async-event.c: Include "infrun.h".
	(async_event_handler_marked): New.
	* async-event.h (async_event_handler_marked): Declare.
	* infrun.c (maybe_set_commit_resumed_all_targets): Switch to
	inferior before calling target method.  Don't commit-resumed if
	target_has_pending_events is true.
	* linux-nat.c (async_file_is_marked): New.
	(linux_nat_target::has_pending_events): New.
	* linux-nat.h (linux_nat_target) <has_pending_events>: Declare.
	* remote.c (remote_target::has_pending_events): New.
	* target-delegates.c: Regenerate.
	* target.h (target_ops::has_pending_events): New target method.
	(target_has_pending_events): New.

Change-Id: I18112ba19a1ff4986530c660f530d847bb4a1f1d
---
 gdb/async-event.c      |  9 +++++++++
 gdb/async-event.h      |  3 +++
 gdb/infrun.c           | 12 ++++++++++++
 gdb/linux-nat.c        | 30 ++++++++++++++++++++++++++++++
 gdb/linux-nat.h        |  1 +
 gdb/remote.c           | 21 +++++++++++++++++++++
 gdb/target-delegates.c | 27 +++++++++++++++++++++++++++
 gdb/target.h           | 18 ++++++++++++++++++
 8 files changed, 121 insertions(+)

diff --git a/gdb/async-event.c b/gdb/async-event.c
index c4ab7318179..5cc518467b0 100644
--- a/gdb/async-event.c
+++ b/gdb/async-event.c
@@ -21,6 +21,7 @@
 
 #include "ser-event.h"
 #include "top.h"
+#include "infrun.h"
 
 /* PROC is a function to be invoked when the READY flag is set.  This
    happens when there has been a signal and the corresponding signal
@@ -308,6 +309,14 @@ clear_async_event_handler (async_event_handler *async_handler_ptr)
   async_handler_ptr->ready = 0;
 }
 
+/* See event-loop.h.  */
+
+bool
+async_event_handler_marked (async_event_handler *handler)
+{
+  return handler->ready;
+}
+
 /* Check if asynchronous event handlers are ready, and call the
    handler function for one that is.  */
 
diff --git a/gdb/async-event.h b/gdb/async-event.h
index 9d96235b38d..47759d5c2d3 100644
--- a/gdb/async-event.h
+++ b/gdb/async-event.h
@@ -78,6 +78,9 @@ extern void
    loop.  */
 extern void mark_async_event_handler (struct async_event_handler *handler);
 
+/* Return true if HANDLER is marked.  */
+extern bool async_event_handler_marked (async_event_handler *handler);
+
 /* Mark the handler (ASYNC_HANDLER_PTR) as NOT ready.  */
 
 extern void clear_async_event_handler (struct async_event_handler *handler);
diff --git a/gdb/infrun.c b/gdb/infrun.c
index 0af2f940105..195cd4176e3 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -2765,6 +2765,8 @@ schedlock_applies (struct thread_info *tp)
 static void
 maybe_set_commit_resumed_all_targets ()
 {
+  scoped_restore_current_thread restore_thread;
+
   for (inferior *inf : all_non_exited_inferiors ())
     {
       process_stratum_target *proc_target = inf->process_target ();
@@ -2806,6 +2808,16 @@ maybe_set_commit_resumed_all_targets ()
 	  continue;
 	}
 
+      switch_to_inferior_no_thread (inf);
+
+      if (target_has_pending_events ())
+	{
+	  infrun_debug_printf ("not requesting commit-resumed for target %s, "
+			       "target has pending events",
+			       proc_target->shortname ());
+	  continue;
+	}
+
       infrun_debug_printf ("enabling commit-resumed for target %s",
 			   proc_target->shortname ());
 
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index ccfd3c1320c..1186e10b925 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -263,6 +263,28 @@ async_file_mark (void)
      be awakened anyway.  */
 }
 
+/* Return true if the event pipe is marked.  */
+
+static bool
+async_file_is_marked ()
+{
+  int fd = linux_nat_event_pipe[0];
+  if (fd != -1)
+    {
+      struct timeval timeout {};
+
+      fd_set set;
+      FD_ZERO (&set);
+      FD_SET (fd, &set);
+
+      int n = select (fd + 1, &set, NULL, NULL, &timeout);
+      if (n > 0)
+       return true;
+    }
+
+  return false;
+}
+
 static int kill_lwp (int lwpid, int signo);
 
 static int stop_callback (struct lwp_info *lp);
@@ -4104,6 +4126,14 @@ linux_nat_target::async_wait_fd ()
   return linux_nat_event_pipe[0];
 }
 
+/* target_has_pending_events implementation.  */
+
+bool
+linux_nat_target::has_pending_events ()
+{
+  return async_file_is_marked ();
+}
+
 /* target_async implementation.  */
 
 void
diff --git a/gdb/linux-nat.h b/gdb/linux-nat.h
index ff4d753422d..fb731c7f889 100644
--- a/gdb/linux-nat.h
+++ b/gdb/linux-nat.h
@@ -90,6 +90,7 @@ class linux_nat_target : public inf_ptrace_target
 
   int async_wait_fd () override;
   void async (int) override;
+  bool has_pending_events () override;
 
   void close () override;
 
diff --git a/gdb/remote.c b/gdb/remote.c
index 5b2bc240bcb..893f0c1399d 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -428,6 +428,7 @@ class remote_target : public process_stratum_target
   void commit_resumed () override;
   void resume (ptid_t, int, enum gdb_signal) override;
   ptid_t wait (ptid_t, struct target_waitstatus *, target_wait_flags) override;
+  bool has_pending_events () override;
 
   void fetch_registers (struct regcache *, int) override;
   void store_registers (struct regcache *, int) override;
@@ -6794,6 +6795,26 @@ remote_target::commit_resumed ()
   vcont_builder.flush ();
 }
 
+/* Implementation of target_has_pending_events.  */
+
+bool
+remote_target::has_pending_events ()
+{
+  if (target_can_async_p ())
+    {
+      remote_state *rs = get_remote_state ();
+
+      if (async_event_handler_marked (rs->remote_async_inferior_event_token))
+	return true;
+
+      /* Note that BUFCNT can be negative, indicating sticky
+	 error.  */
+      if (rs->remote_desc->bufcnt != 0)
+	return true;
+    }
+  return false;
+}
+
 \f
 
 /* Non-stop version of target_stop.  Uses `vCont;t' to stop a remote
diff --git a/gdb/target-delegates.c b/gdb/target-delegates.c
index c0b3129afad..1b2400175de 100644
--- a/gdb/target-delegates.c
+++ b/gdb/target-delegates.c
@@ -84,6 +84,7 @@ struct dummy_target : public target_ops
   bool is_async_p () override;
   void async (int arg0) override;
   int async_wait_fd () override;
+  bool has_pending_events () override;
   void thread_events (int arg0) override;
   bool supports_non_stop () override;
   bool always_non_stop_p () override;
@@ -255,6 +256,7 @@ struct debug_target : public target_ops
   bool is_async_p () override;
   void async (int arg0) override;
   int async_wait_fd () override;
+  bool has_pending_events () override;
   void thread_events (int arg0) override;
   bool supports_non_stop () override;
   bool always_non_stop_p () override;
@@ -2193,6 +2195,31 @@ debug_target::async_wait_fd ()
   return result;
 }
 
+bool
+target_ops::has_pending_events ()
+{
+  return this->beneath ()->has_pending_events ();
+}
+
+bool
+dummy_target::has_pending_events ()
+{
+  return false;
+}
+
+bool
+debug_target::has_pending_events ()
+{
+  bool result;
+  fprintf_unfiltered (gdb_stdlog, "-> %s->has_pending_events (...)\n", this->beneath ()->shortname ());
+  result = this->beneath ()->has_pending_events ();
+  fprintf_unfiltered (gdb_stdlog, "<- %s->has_pending_events (", this->beneath ()->shortname ());
+  fputs_unfiltered (") = ", gdb_stdlog);
+  target_debug_print_bool (result);
+  fputs_unfiltered ("\n", gdb_stdlog);
+  return result;
+}
+
 void
 target_ops::thread_events (int arg0)
 {
diff --git a/gdb/target.h b/gdb/target.h
index 450a6c7d46b..3be49c06d33 100644
--- a/gdb/target.h
+++ b/gdb/target.h
@@ -717,6 +717,15 @@ struct target_ops
       TARGET_DEFAULT_NORETURN (tcomplain ());
     virtual int async_wait_fd ()
       TARGET_DEFAULT_NORETURN (noprocess ());
+    /* Return true if the target has pending events to report to the
+       core.  If true, then GDB avoids resuming the target until all
+       pending events are consumed, so that multiple resumptions can
+       be coalesced as an optimization.  Most targets can't tell
+       whether they have pending events without calling target_wait,
+       so we default to returning false.  The only downside is that a
+       potential optimization is missed.  */
+    virtual bool has_pending_events ()
+      TARGET_DEFAULT_RETURN (false);
     virtual void thread_events (int)
       TARGET_DEFAULT_IGNORE ();
     /* This method must be implemented in some situations.  See the
@@ -1462,6 +1471,15 @@ extern ptid_t default_target_wait (struct target_ops *ops,
 				   struct target_waitstatus *status,
 				   target_wait_flags options);
 
+/* Return true if the target has pending events to report to the core.
+   See target_ops::has_pending_events().  */
+
+static inline bool
+target_has_pending_events ()
+{
+  return current_top_target ()->has_pending_events ();
+}
+
 /* Fetch at least register REGNO, or all regs if regno == -1.  No result.  */
 
 extern void target_fetch_registers (struct regcache *regcache, int regno);
-- 
2.26.2


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 1/2] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-03-21  0:02 ` [PATCH v4 1/2] gdb: generalize commit_resume, avoid commit-resuming when threads have " Pedro Alves
@ 2021-03-25 15:16   ` Simon Marchi
  2021-03-26 16:11     ` Pedro Alves
  2021-07-11 19:42   ` Jonah Graham
  1 sibling, 1 reply; 16+ messages in thread
From: Simon Marchi @ 2021-03-25 15:16 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On 2021-03-20 8:02 p.m., Pedro Alves wrote:
> From: Simon Marchi <simon.marchi@efficios.com>
> 
> New in v4 (Pedro):
> 
>  - Fixed "phony stop" generation -- if a thread had been resumed with
>    a signal, we would lose the signal.  This was caught by the
>    testsuite against gdbserver + "maint set target-non-stop on".
> 
>  - Add reset_and_commit() method to scoped_disable_commit_resumed,
>    which gets rid of the need for creating a bunch of scopes and
>    intending a bunch of code.  I originally had this idea while
>    looking at the optional<scoped_disable_commit_resumed> in infrun.c
>    in the previous versions of this patch, and ended up generalizing
>    it and using it throughout.
> 
>  - Add scoped_enable_commit_resumed, the counterpart of
>    scoped_disable_commit_resumed: re-enables commit-resumed, and
>    commit-resumes.  Called from wait_sync_command_done, which is a
>    "sync" point where we need forward progress.  This fixes
>    regressions introduced in previous iterations, with "maint set
>    target-non-stop on".
> 
>  - Renamed and inverted logic of the defer_enable_commit_resumed
>    global.  It's now called 'enable_commit_resumed'.  The negated
>    logic got really confusing, particularly with
>    scoped_enable_commit_resumed, which led to double negation.
> 
>  - No need for
>    scoped_disable_commit_resumed::maybe_set_commit_resumed_all_process_targets
>    to be a class method.  Now a static free function.
> 
>  - replace assert in maybe_set_commit_resumed_all_process_targets with
>    continue.  Testing with "maint set target-non-stop on" ran against
>    this assertion (e.g., fork tests, which end up with multiple
>    inferiors sharing the same process_stratum target).
> 
>  - add and use target_commit_resumed
> 
>  - all_process_targets removed, replaced with iterating over inferiors
>    in the callers.  all_process_targets was iterating over inferiors
>    itself anyway, and it was building a std::set, which was iterated
>    on by the callers, and then immediately discarded.  Because the
>    std::set was never reused, this ends up being a pessimization.
>    Also, given the potential for targets on the target stack higher
>    than process_stratum stratum wanting to override commit_resumed(),
>    it's just better to iterate over inferiors, or IOW, iterate over
>    target stacks.
> 
>  - Also considering the potential for targets higher in the target
>    stack wanting to override target_commit_resumed, this version does
>    not depend on the patch to move the target_ops::commit_resume
>    method to process_stratum_target anymore.  Note all targets on a
>    stack can use the same process_stratum_target::commit_resumed_state
>    flag, just like they share the
>    process_stratum_target::threads_executing flag.
> 
>  - Misc cleanup, renaming, refactoring, etc. throughout.

I agree with all of the above.

I noted some minor comments below, but otherwise, this LGTM.

> The rationale for this patch comes from the ROCm port [1], the goal
> being to reduce the number of back and forths between GDB and the
> target when doing successive operations.  I'll start with explaining
> the rationale and then go over the implementation.  In the ROCm / GPU
> world, the term "wave" is somewhat equivalent to a "thread" in GDB.
> So if you read if from a GPU stand point, just s/thread/wave/.
> 
> ROCdbgapi, the library used by GDB [2] to communicate with the GPU
> target, gives the illusion that it's possible for the debugger to
> control (start and stop) individual threads.  But in reality, this is
> not how it works.  Under the hood, all threads of a queue are
> controlled as a group.  To stop one thread in a group of running ones,
> the state of all threads is retrieved from the GPU, all threads are
> destroyed, and all threads but the one we want to stop are re-created
> from the saved state.  The net result, from the point of view of GDB,
> is that the library stopped one thread.  The same thing goes if we
> want to resume one thread while others are running: the state of all
> running threads is retrieved from the GPU, they are all destroyed, and
> they are all re-created, including the thread we want to resume.
> 
> This leads to some inefficiencies when combined with how GDB works,
> here are two examples:
> 
>  - Stopping all threads: because the target operates in non-stop mode,
>    when the user interface mode is all-stop, GDB must stop all threads
>    individually when presenting a stop.  Let's suppose we have 1000
>    threads and the user does ^C.  GDB asks the target to stop one
>    thread.  Behind the scenes, the library retrieves 1000 thread
>    states and restores the 999 others still running ones.  GDB asks
>    the target to stop another one.  The target retrieves 999 thread
>    states and restores the 998 remaining ones.  That means that to
>    stop 1000 threads, we did 1000 back and forths with the GPU.  It
>    would have been much better to just retrieve the states once and
>    stop there.
> 
>  - Resuming with pending events: suppose the 1000 threads hit a
>    breakpoint at the same time.  The breakpoint is conditional and
>    evaluates to true for the first thread, to false for all others.
>    GDB pulls one event (for the first thread) from the target, decides
>    that it should present a stop, so stops all threads using
>    stop_all_threads.  All these other threads have a breakpoint event
>    to report, which is saved in `thread_info::suspend::waitstatus` for
>    later.  When the user does "continue", GDB resumes that one thread
>    that did hit the breakpoint.  It then processes the pending events
>    one by one as if they just arrived.  It picks one, evaluates the
>    condition to false, and resumes the thread.  It picks another one,
>    evaluates the condition to false, and resumes the thread.  And so
>    on.  In between each resumption, there is a full state retrieval
>    and re-creation.  It would be much nicer if we could wait a little
>    bit before sending those threads on the GPU, until it processed all
>    those pending events.
> 
> To address this kind of performance issue, ROCdbgapi has a concept
> called "forward progress required", which is a boolean state that
> allows its user (i.e. GDB) to say "I'm doing a bunch of operations,
> you can hold off putting the threads on the GPU until I'm done" (the
> "forward progress not required" state).  Turning forward progress back
> on indicates to the library that all threads that are supposed to be
> running should now be really running on the GPU.
> 
> It turns out that GDB has a similar concept, though not as general,
> commit_resume.  One difference is that commit_resume is not stateful:
> the target can't look up "does the core need me to schedule resumed
> threads for execution right now".  It is also specifically linked to
> the resume method, it is not used in other contexts.  The target
> accumulates resumption requests through target_ops::resume calls, and
> then commits those resumptions when target_ops::commit_resume is
> called.  The target has no way to check if it's ok to leave resumed
> threads stopped in other target methods.
> 
> To bridge the gap, this patch generalizes the commit_resume concept in
> GDB to match the forward progress concept of ROCdbgapi.  The current
> name (commit_resume) can be interpreted as "commit the previous resume
> calls".  I renamed the concept to "commit_resumed", as in "commit the
> threads that are resumed".
> 
> In the new version, we have two things in process_stratum_target:
> 
>  - the commit_resumed_state field: indicates whether GDB requires this
>    target to have resumed threads committed to the execution
>    target/device.  If false, the target is allowed to leave resumed
>    threads un-committed at the end of whatever method it is executing.
> 
>  - the commit_resumed method: called when commit_resumed_state
>    transitions from false to true.  While commit_resumed_state was
>    false, the target may have left some resumed threads un-committed.
>    This method being called tells it that it should commit them back
>    to the execution device.

The above would need to be updated, since it still states that the
commit_resumed method is added to process_stratum_target.

> Let's take the "Stopping all threads" scenario from above and see how
> it would work with the ROCm target with this change.  Before stopping
> all threads, GDB would set the target's commit_resumed_state field to
> false.  It would then ask the target to stop the first thread.  The
> target would retrieve all threads' state from the GPU and mark that
> one as stopped.  Since commit_resumed_state is false, it leaves all
> the other threads (still resumed) stopped.  GDB would then proceed to
> call target_stop for all the other threads.  Since resumed threads are
> not committed, this doesn't do any back and forth with the GPU.
> 
> To simplify the implementation of targets, this patch makes it so that
> when calling certain target methods, the contract between the core and
> the targets guarantees that commit_resumed_state is false.  This way,
> the target doesn't need two paths, one commit_resumed_state == true
> and one for commit_resumed_state == false.  It can just assert that

I think I was missing a "for" between "one" and "commit_resumed_state ==
true".

> commit_resumed_state is false and work with that assumption.  This
> also helps catch places where we forgot to disable
> commit_resumed_state before calling the method, which represents a
> probable optimization opportunity.  I added assertions in the target
> method wrappers (target_resume and friends) to have some confidence
> that this contract between the core and the targets is respected.
> 
> The scoped_disable_commit_resumed type is used to disable the commit
> resumed state of all process targets on construction, and selectively
> re-enable it on destruction (see below for criteria).  Note that it
> only sets the process_stratum_target::commit_resumed_state flag.  A
> subsequent call to maybe_call_commit_resumed_all_process_targets is

Update maybe_call_commit_resumed_all_process_targets to
maybe_call_commit_resumed_all_targets.

> necessary to call the process_stratum_target::commit_resumed method on
> all process targets that got their commit_resumed_state flag turned
> back on.  This separation is because we don't want to call the
> commit_resumed methods in scoped_disable_commit_resumed's destructor,
> as they can throw.

The above should be updated to reflect that the commit_resumed is in the
target_ops class, and we call it on all target stacks (not all process
targets).

> @@ -2760,28 +2758,208 @@ schedlock_applies (struct thread_info *tp)
>  					    execution_direction)));
>  }
>  
> -/* Calls target_commit_resume on all targets.  */
> +/* Set process_stratum_target::COMMIT_RESUMED_STATE in all target
> +   stacks that have threads executing and don't have threads with
> +   pending events.  */
>  
>  static void
> -commit_resume_all_targets ()
> +maybe_set_commit_resumed_all_targets ()
> +{
> +  for (inferior *inf : all_non_exited_inferiors ())
> +    {
> +      process_stratum_target *proc_target = inf->process_target ();
> +
> +      if (proc_target->commit_resumed_state)
> +	{
> +	  /* We already set this in a previous iteration, via another
> +	     inferior sharing the process_stratum target.  */
> +	  continue;
> +	}
> +
> +      /* If the target has no resumed threads, it would be useless to
> +	 ask it to commit the resumed threads.  */
> +      if (!proc_target->threads_executing)
> +	{
> +	  infrun_debug_printf ("not re-enabling forward progress for target "
> +			       "%s, no resumed threads",
> +			       proc_target->shortname ());

"forward progress" is not a term used in GDB itself, so maybe change
this debug message to use "commit resumed" instead?

> @@ -3075,7 +3253,7 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
>        }
>    }
>  
> -  commit_resume_all_targets ();
> +  maybe_call_commit_resumed_all_targets ();

Should this maybe_call_commit_resumed_all_targets call be a

    disable_commit_resumed.reset_and_commit ()

at the end of the scope just above?  In fact, the scope is probably
unnecessary now.

> diff --git a/gdb/target.h b/gdb/target.h
> index ee93c5cf395..450a6c7d46b 100644
> --- a/gdb/target.h
> +++ b/gdb/target.h
> @@ -485,8 +485,13 @@ struct target_ops
>  			 int TARGET_DEBUG_PRINTER (target_debug_print_step),
>  			 enum gdb_signal)
>        TARGET_DEFAULT_NORETURN (noprocess ());
> -    virtual void commit_resume ()
> +
> +    /* Ensure that all resumed threads are committed to the target.
> +
> +       See the description of COMMIT_RESUMED_STATE for more details.  */

Perhaps say "process_stratum_target::commit_resumed_state", since it's
in a different class / file.

> +    virtual void commit_resumed ()
>        TARGET_DEFAULT_IGNORE ();
> +
>      /* See target_wait's description.  Note that implementations of
>         this method must not assume that inferior_ptid on entry is
>         pointing at the thread or inferior that ends up reporting an
> @@ -1438,23 +1443,15 @@ extern void target_disconnect (const char *, int);
>     target_commit_resume below.  */
>  extern void target_resume (ptid_t ptid, int step, enum gdb_signal signal);
>  
> -/* Commit a series of resumption requests previously prepared with
> -   target_resume calls.
> -
> -   GDB always calls target_commit_resume after calling target_resume
> -   one or more times.  A target may thus use this method in
> -   coordination with the target_resume method to batch target-side
> -   resumption requests.  In that case, the target doesn't actually
> -   resume in its target_resume implementation.  Instead, it prepares
> -   the resumption in target_resume, and defers the actual resumption
> -   to target_commit_resume.  E.g., the remote target uses this to
> -   coalesce multiple resumption requests in a single vCont packet.  */
> -extern void target_commit_resume ();
> -
> -/* Setup to defer target_commit_resume calls, and reactivate
> -   target_commit_resume on destruction, if it was previously
> -   active.  */
> -extern scoped_restore_tmpl<int> make_scoped_defer_target_commit_resume ();
> +/* Ensure that all resumed threads are committed to the target.
> +
> +   See the description of COMMIT_RESUMED_STATE for more details.  */

Here too.

> +
> +static inline void
> +target_commit_resumed ()
> +{
> +  current_top_target ()->commit_resumed ();

Following

    328d42d87e97 ("gdb: remove current_top_target function")

This will need to be implemented in target.c.

I suppose it wouldn't hurt to put a

    gdb_assert (current_inferior ()->process_target ()->commit_resumed_state);

in here.

Simon

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 2/2] gdb: defer commit resume until all available events are consumed
  2021-03-21  0:02 ` [PATCH v4 2/2] gdb: defer commit resume until all available events are consumed Pedro Alves
@ 2021-03-25 16:00   ` Simon Marchi
  2021-03-26 16:16     ` Pedro Alves
  0 siblings, 1 reply; 16+ messages in thread
From: Simon Marchi @ 2021-03-25 16:00 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On 2021-03-20 8:02 p.m., Pedro Alves wrote:
> From: Simon Marchi <simon.marchi@efficios.com>
> 
> New in v4 (Pedro):
> 
>  - Made target_has_events a regular delegating target method.
> 
>  - Rename target_has_events -> target_has_pending_events
> 
>  - No longer check target_has_events in do_target_wait (that caused
>    regressions).
> 
>  - No longer switch linux-nat to use an async event source.  Simply
>    make target_has_pending_events consult whether the pipe is
>    readable.  I ended up disliking that a SIGCHLD results in two round
>    trips via the event loop.  Also results in less code and a "single
>    source of truth".  And also some divergence from gdbserver.  We can
>    always revisit, but this way it doesn't have to be a dependency, it
>    can be done afterwards.

Since linux-nat doesn't benefit from commit_resumed, we could also just
not implement has_pending_events in linux_nat_target, and let it default
to "return false".

> 
>  - Removed the sprinkled gdb_assert implementations throughout,
>    letting the default of false kick in.
> 
>  - No longer regresses against gdbserver, so updated commit log to no
>    longer say it does.
> 
>  - Misc cleanups, formattings, etc.

I don't think I previously submitted this patch upstream, I just had it
as a "draft".  But this is fine, the goal was to submit it eventually.
I agree with all of the above.

> Rationale
> ---------
> 
> Let's say you have multiple threads hitting a conditional breakpoint
> at the same time, and all of these are going to evaluate to false.
> All these threads will need to be resumed.
> 
> Currently, GDB fetches one target event (one SIGTRAP representing the
> breakpoint hit) and decides that the thread should be resumed.  It
> calls resume and commit_resume immediately.  It then fetches the

commit_resume -> commit_resumed

> second target event, and does the same, until it went through all
> threads.
> 
> The result is therefore something like:
> 
>   - consume event for thread A
>   - resume thread A
>   - commit resume (affects thread A)
>   - consume event for thread B
>   - resume thread B
>   - commit resume (affects thread B)
>   - consume event for thread C
>   - resume thread C
>   - commit resume (affects thread C)
> 
> For targets where it's beneficial to group resumptions requests (most
> likely those that implement target_ops::commit_resume), it would be
> much better to have:
> 
>   - consume event for thread A
>   - resume thread A
>   - consume event for thread B
>   - resume thread B
>   - consume event for thread C
>   - resume thread C
>   - commit resume (affects threads A, B and C)

Should be "commit resumed" in all of the above.

Otherwise, this all LGTM.

Simon

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 1/2] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-03-25 15:16   ` Simon Marchi
@ 2021-03-26 16:11     ` Pedro Alves
  2021-03-26 16:17       ` Simon Marchi
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2021-03-26 16:11 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

On 25/03/21 15:16, Simon Marchi wrote:

>> In the new version, we have two things in process_stratum_target:
>>
>>  - the commit_resumed_state field: indicates whether GDB requires this
>>    target to have resumed threads committed to the execution
>>    target/device.  If false, the target is allowed to leave resumed
>>    threads un-committed at the end of whatever method it is executing.
>>
>>  - the commit_resumed method: called when commit_resumed_state
>>    transitions from false to true.  While commit_resumed_state was
>>    false, the target may have left some resumed threads un-committed.
>>    This method being called tells it that it should commit them back
>>    to the execution device.
> 
> The above would need to be updated, since it still states that the
> commit_resumed method is added to process_stratum_target.

Done.

> 
>> Let's take the "Stopping all threads" scenario from above and see how
>> it would work with the ROCm target with this change.  Before stopping
>> all threads, GDB would set the target's commit_resumed_state field to
>> false.  It would then ask the target to stop the first thread.  The
>> target would retrieve all threads' state from the GPU and mark that
>> one as stopped.  Since commit_resumed_state is false, it leaves all
>> the other threads (still resumed) stopped.  GDB would then proceed to
>> call target_stop for all the other threads.  Since resumed threads are
>> not committed, this doesn't do any back and forth with the GPU.
>>
>> To simplify the implementation of targets, this patch makes it so that
>> when calling certain target methods, the contract between the core and
>> the targets guarantees that commit_resumed_state is false.  This way,
>> the target doesn't need two paths, one commit_resumed_state == true
>> and one for commit_resumed_state == false.  It can just assert that
> 
> I think I was missing a "for" between "one" and "commit_resumed_state ==
> true".

Fixed.

> 
>> commit_resumed_state is false and work with that assumption.  This
>> also helps catch places where we forgot to disable
>> commit_resumed_state before calling the method, which represents a
>> probable optimization opportunity.  I added assertions in the target
>> method wrappers (target_resume and friends) to have some confidence
>> that this contract between the core and the targets is respected.
>>
>> The scoped_disable_commit_resumed type is used to disable the commit
>> resumed state of all process targets on construction, and selectively
>> re-enable it on destruction (see below for criteria).  Note that it
>> only sets the process_stratum_target::commit_resumed_state flag.  A
>> subsequent call to maybe_call_commit_resumed_all_process_targets is
> 
> Update maybe_call_commit_resumed_all_process_targets to
> maybe_call_commit_resumed_all_targets.

Fixed.

> 
>> necessary to call the process_stratum_target::commit_resumed method on
>> all process targets that got their commit_resumed_state flag turned
>> back on.  This separation is because we don't want to call the
>> commit_resumed methods in scoped_disable_commit_resumed's destructor,
>> as they can throw.
> 
> The above should be updated to reflect that the commit_resumed is in the
> target_ops class, and we call it on all target stacks (not all process
> targets).
> 

Done.

>> @@ -2760,28 +2758,208 @@ schedlock_applies (struct thread_info *tp)
>>  					    execution_direction)));
>>  }
>>  
>> -/* Calls target_commit_resume on all targets.  */
>> +/* Set process_stratum_target::COMMIT_RESUMED_STATE in all target
>> +   stacks that have threads executing and don't have threads with
>> +   pending events.  */
>>  
>>  static void
>> -commit_resume_all_targets ()
>> +maybe_set_commit_resumed_all_targets ()
>> +{
>> +  for (inferior *inf : all_non_exited_inferiors ())
>> +    {
>> +      process_stratum_target *proc_target = inf->process_target ();
>> +
>> +      if (proc_target->commit_resumed_state)
>> +	{
>> +	  /* We already set this in a previous iteration, via another
>> +	     inferior sharing the process_stratum target.  */
>> +	  continue;
>> +	}
>> +
>> +      /* If the target has no resumed threads, it would be useless to
>> +	 ask it to commit the resumed threads.  */
>> +      if (!proc_target->threads_executing)
>> +	{
>> +	  infrun_debug_printf ("not re-enabling forward progress for target "
>> +			       "%s, no resumed threads",
>> +			       proc_target->shortname ());
> 
> "forward progress" is not a term used in GDB itself, so maybe change
> this debug message to use "commit resumed" instead?

Fixed, using the same language used in the other condition, also based on
"commit-resume".

> 
>> @@ -3075,7 +3253,7 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
>>        }
>>    }
>>  
>> -  commit_resume_all_targets ();
>> +  maybe_call_commit_resumed_all_targets ();
> 
> Should this maybe_call_commit_resumed_all_targets call be a
> 
>     disable_commit_resumed.reset_and_commit ()
> 
> at the end of the scope just above?  In fact, the scope is probably
> unnecessary now.

I put disable_commit_resumed.reset_and_commit () in the scope.
I left the scope as is, since it's already there.

> 
>> diff --git a/gdb/target.h b/gdb/target.h
>> index ee93c5cf395..450a6c7d46b 100644
>> --- a/gdb/target.h
>> +++ b/gdb/target.h
>> @@ -485,8 +485,13 @@ struct target_ops
>>  			 int TARGET_DEBUG_PRINTER (target_debug_print_step),
>>  			 enum gdb_signal)
>>        TARGET_DEFAULT_NORETURN (noprocess ());
>> -    virtual void commit_resume ()
>> +
>> +    /* Ensure that all resumed threads are committed to the target.
>> +
>> +       See the description of COMMIT_RESUMED_STATE for more details.  */
> 
> Perhaps say "process_stratum_target::commit_resumed_state", since it's
> in a different class / file.

Fixed.

> 
>> +    virtual void commit_resumed ()
>>        TARGET_DEFAULT_IGNORE ();
>> +
>>      /* See target_wait's description.  Note that implementations of
>>         this method must not assume that inferior_ptid on entry is
>>         pointing at the thread or inferior that ends up reporting an
>> @@ -1438,23 +1443,15 @@ extern void target_disconnect (const char *, int);
>>     target_commit_resume below.  */
>>  extern void target_resume (ptid_t ptid, int step, enum gdb_signal signal);
>>  
>> -/* Commit a series of resumption requests previously prepared with
>> -   target_resume calls.
>> -
>> -   GDB always calls target_commit_resume after calling target_resume
>> -   one or more times.  A target may thus use this method in
>> -   coordination with the target_resume method to batch target-side
>> -   resumption requests.  In that case, the target doesn't actually
>> -   resume in its target_resume implementation.  Instead, it prepares
>> -   the resumption in target_resume, and defers the actual resumption
>> -   to target_commit_resume.  E.g., the remote target uses this to
>> -   coalesce multiple resumption requests in a single vCont packet.  */
>> -extern void target_commit_resume ();
>> -
>> -/* Setup to defer target_commit_resume calls, and reactivate
>> -   target_commit_resume on destruction, if it was previously
>> -   active.  */
>> -extern scoped_restore_tmpl<int> make_scoped_defer_target_commit_resume ();
>> +/* Ensure that all resumed threads are committed to the target.
>> +
>> +   See the description of COMMIT_RESUMED_STATE for more details.  */
> 
> Here too.

Fixed here too.

> 
>> +
>> +static inline void
>> +target_commit_resumed ()
>> +{
>> +  current_top_target ()->commit_resumed ();
> 
> Following
> 
>     328d42d87e97 ("gdb: remove current_top_target function")
> 
> This will need to be implemented in target.c.
> 

Hmm, hadn't noticed that, but then again, I haven't managed to follow
the list as much as I'd like.  Can't say I agree that it's much of an
improvement.  I had added the function because the pattern is used in
many spots and a shorthand seemed apropos.  The "current" in the name
also suggests it relies on global state, so it's not really hidden.
It was also nice in letting us write target method wrappers in the header,
which we can't now, as per your comment.


> I suppose it wouldn't hurt to put a
> 
>     gdb_assert (current_inferior ()->process_target ()->commit_resumed_state);
> 
> in here.

Done.

I've merged the patch, along with the preparatory patches, to master.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 2/2] gdb: defer commit resume until all available events are consumed
  2021-03-25 16:00   ` Simon Marchi
@ 2021-03-26 16:16     ` Pedro Alves
  2021-03-26 16:18       ` Simon Marchi
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2021-03-26 16:16 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

On 25/03/21 16:00, Simon Marchi wrote:
> On 2021-03-20 8:02 p.m., Pedro Alves wrote:
>> From: Simon Marchi <simon.marchi@efficios.com>
>>
>> New in v4 (Pedro):
>>
>>  - Made target_has_events a regular delegating target method.
>>
>>  - Rename target_has_events -> target_has_pending_events
>>
>>  - No longer check target_has_events in do_target_wait (that caused
>>    regressions).
>>
>>  - No longer switch linux-nat to use an async event source.  Simply
>>    make target_has_pending_events consult whether the pipe is
>>    readable.  I ended up disliking that a SIGCHLD results in two round
>>    trips via the event loop.  Also results in less code and a "single
>>    source of truth".  And also some divergence from gdbserver.  We can
>>    always revisit, but this way it doesn't have to be a dependency, it
>>    can be done afterwards.
> 
> Since linux-nat doesn't benefit from commit_resumed, we could also just
> not implement has_pending_events in linux_nat_target, and let it default
> to "return false".

Hmm, indeed.  It was necessary before I removed the check for target_has_events
from do_target_wait, otherwise we'd never consult the target.  But without that call,
indeed we don't need it.  I removed it.

I also noticed that the infrun.h include in async-event.c isn't needed,
so I removed that.

>> Rationale
>> ---------
>>
>> Let's say you have multiple threads hitting a conditional breakpoint
>> at the same time, and all of these are going to evaluate to false.
>> All these threads will need to be resumed.
>>
>> Currently, GDB fetches one target event (one SIGTRAP representing the
>> breakpoint hit) and decides that the thread should be resumed.  It
>> calls resume and commit_resume immediately.  It then fetches the
> 
> commit_resume -> commit_resumed
> 

Argh, I thought I had adjusted this before pushing, but I didn't.  Too late now.

Sorry about that!

> Otherwise, this all LGTM.

Great, I merged it to master now.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 1/2] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-03-26 16:11     ` Pedro Alves
@ 2021-03-26 16:17       ` Simon Marchi
  0 siblings, 0 replies; 16+ messages in thread
From: Simon Marchi @ 2021-03-26 16:17 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

>> Following
>>
>>     328d42d87e97 ("gdb: remove current_top_target function")
>>
>> This will need to be implemented in target.c.
>>
> 
> Hmm, hadn't noticed that, but then again, I haven't managed to follow
> the list as much as I'd like.  Can't say I agree that it's much of an
> improvement.  I had added the function because the pattern is used in
> many spots and a shorthand seemed apropos.  The "current" in the name
> also suggests it relies on global state, so it's not really hidden.
> It was also nice in letting us write target method wrappers in the header,
> which we can't now, as per your comment.

We can discuss this on that patch's thread if you'd like.

Simon

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 2/2] gdb: defer commit resume until all available events are consumed
  2021-03-26 16:16     ` Pedro Alves
@ 2021-03-26 16:18       ` Simon Marchi
  0 siblings, 0 replies; 16+ messages in thread
From: Simon Marchi @ 2021-03-26 16:18 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

>>> Rationale
>>> ---------
>>>
>>> Let's say you have multiple threads hitting a conditional breakpoint
>>> at the same time, and all of these are going to evaluate to false.
>>> All these threads will need to be resumed.
>>>
>>> Currently, GDB fetches one target event (one SIGTRAP representing the
>>> breakpoint hit) and decides that the thread should be resumed.  It
>>> calls resume and commit_resume immediately.  It then fetches the
>>
>> commit_resume -> commit_resumed
>>
> 
> Argh, I thought I had adjusted this before pushing, but I didn't.  Too late now.
> 
> Sorry about that!

Bah, really not a big deal!

> 
>> Otherwise, this all LGTM.
> 
> Great, I merged it to master now.

Thanks!

Simon

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 1/2] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-03-21  0:02 ` [PATCH v4 1/2] gdb: generalize commit_resume, avoid commit-resuming when threads have " Pedro Alves
  2021-03-25 15:16   ` Simon Marchi
@ 2021-07-11 19:42   ` Jonah Graham
  2021-07-12  0:24     ` Jonah Graham
  2021-07-12  2:33     ` [PATCH master + 11] gdb: disable commit-resumed on -exec-interrupt --thread-group Simon Marchi
  1 sibling, 2 replies; 16+ messages in thread
From: Jonah Graham @ 2021-07-11 19:42 UTC (permalink / raw)
  To: Pedro Alves, Simon Marchi; +Cc: gdb-patches

Hi Pedro and Simon,

On Sat, 20 Mar 2021 at 20:02, Pedro Alves <pedro@palves.net> wrote:

> From: Simon Marchi <simon.marchi@efficios.com>
>
> New in v4 (Pedro):
>
>
Unfortunately this change seems to have caused a regression when using MI
to interrupt the target in non-stop mode. I have raised
https://sourceware.org/bugzilla/show_bug.cgi?id=28077, but the summary is
that in non-stop & async mode doing a -exec-interrupt passing
--thread-group causes an assertion:

~"/scratch/gdb/binutils-gdb/gdb/target.c:3757: internal-error: void
target_stop(ptid_t): Assertion `!proc_target->commit_resumed_state'
failed.\nA problem internal to GDB has been detected,\nfurther debugging
may prove unreliable."

Thanks,
Jonah

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 1/2] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-07-11 19:42   ` Jonah Graham
@ 2021-07-12  0:24     ` Jonah Graham
  2021-07-12  2:33     ` [PATCH master + 11] gdb: disable commit-resumed on -exec-interrupt --thread-group Simon Marchi
  1 sibling, 0 replies; 16+ messages in thread
From: Jonah Graham @ 2021-07-12  0:24 UTC (permalink / raw)
  To: Pedro Alves, Simon Marchi; +Cc: gdb-patches

~~~
Jonah Graham
Kichwa Coders
www.kichwacoders.com


On Sun, 11 Jul 2021 at 15:42, Jonah Graham <jonah@kichwacoders.com> wrote:

> Hi Pedro and Simon,
>
> On Sat, 20 Mar 2021 at 20:02, Pedro Alves <pedro@palves.net> wrote:
>
>> From: Simon Marchi <simon.marchi@efficios.com>
>>
>> New in v4 (Pedro):
>>
>>
> Unfortunately this change seems to have caused a regression when using MI
> to interrupt the target in non-stop mode. I have raised
> https://sourceware.org/bugzilla/show_bug.cgi?id=28077, but the summary is
> that in non-stop & async mode doing a -exec-interrupt passing
> --thread-group causes an assertion:
>
> ~"/scratch/gdb/binutils-gdb/gdb/target.c:3757: internal-error: void
> target_stop(ptid_t): Assertion `!proc_target->commit_resumed_state'
> failed.\nA problem internal to GDB has been detected,\nfurther debugging
> may prove unreliable."
>
> Thanks,
> Jonah
>
>
>

PS. If you have a patch that you want me to run the Eclipse CDT testsuite
on, please cc me on the patch.

Jonah

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH master + 11] gdb: disable commit-resumed on -exec-interrupt --thread-group
  2021-07-11 19:42   ` Jonah Graham
  2021-07-12  0:24     ` Jonah Graham
@ 2021-07-12  2:33     ` Simon Marchi
  2021-07-12  8:52       ` Pedro Alves
  1 sibling, 1 reply; 16+ messages in thread
From: Simon Marchi @ 2021-07-12  2:33 UTC (permalink / raw)
  To: gdb-patches, Jonah Graham

As reported in PR gdb/28077, we hit an internal error when using
-exec-interrupt with --thread-group:

    info threads
    &"info threads\n"
    ~"  Id   Target Id             Frame \n"
    ~"* 1    process 403312 \"loop\" (running)\n"
    ^done
    (gdb)
    -exec-interrupt --thread-group i1
    ~"/home/simark/src/binutils-gdb/gdb/target.c:3768: internal-error: void target_stop(ptid_t): Assertion `!proc_target->commit_resumed_state' failed.\nA problem internal to GDB has been detected,\nfurther debugging may prove unreliable.\nQuit this debugging session? (y or n) "

This is because this code path never disables commit-resumed (a
requirement for calling target_stop, as documented in
process_stratum_target::»commit_resumed_state) before calling
target_stop.

The other 3 code paths in mi_cmd_exec_interrupt use interrupt_target_1,
which does it.  But the --thread-group code path uses its own thing
which doesn't do it.  Fix this by adding a scoped_disable_commit_resumed
in this code path.

Calling -exec-interrupt with --thread-group is apparently not tested at
the moment (which is why this bug could creep in).  Add a new test for
that.  The test runs two inferiors and tries to interrupt them with
"-exec-interrupt --thread-group X".

This will need to be merged in the gdb-11-branch, so here are ChangeLog
entries:

gdb/ChangeLog:

	* mi/mi-main.c (mi_cmd_exec_interrupt): Use
	scoped_disable_commit_resumed in the --thread-group case.

gdb/testsuite/ChangeLog:

	* interrupt-thread-group.c: New.
	* interrupt-thread-group.exp: New.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28077
Change-Id: I615efefcbcaf2c15d47caf5e4b9d82854b2a2fcb
---
 gdb/mi/mi-main.c                              |   3 +
 gdb/testsuite/gdb.mi/interrupt-thread-group.c |  65 +++++++++
 .../gdb.mi/interrupt-thread-group.exp         | 135 ++++++++++++++++++
 3 files changed, 203 insertions(+)
 create mode 100644 gdb/testsuite/gdb.mi/interrupt-thread-group.c
 create mode 100644 gdb/testsuite/gdb.mi/interrupt-thread-group.exp

diff --git a/gdb/mi/mi-main.c b/gdb/mi/mi-main.c
index e293dddc08d8..44008d1c0a86 100644
--- a/gdb/mi/mi-main.c
+++ b/gdb/mi/mi-main.c
@@ -383,6 +383,9 @@ mi_cmd_exec_interrupt (const char *command, char **argv, int argc)
     {
       struct inferior *inf = find_inferior_id (current_context->thread_group);
 
+      scoped_disable_commit_resumed disable_commit_resumed
+	("interrupting all threads of thread group");
+
       iterate_over_threads (interrupt_thread_callback, &inf->pid);
     }
   else
diff --git a/gdb/testsuite/gdb.mi/interrupt-thread-group.c b/gdb/testsuite/gdb.mi/interrupt-thread-group.c
new file mode 100644
index 000000000000..738c6fc3e3ee
--- /dev/null
+++ b/gdb/testsuite/gdb.mi/interrupt-thread-group.c
@@ -0,0 +1,65 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <unistd.h>
+#include <pthread.h>
+#include <assert.h>
+
+#define NUM_THREADS 4
+
+static pthread_barrier_t barrier;
+
+static void *
+thread_function (void *arg)
+{
+  pthread_barrier_wait (&barrier);
+
+  for (int i = 0; i < 30; i++)
+    sleep (1);
+
+  return NULL;
+}
+
+static void
+all_threads_started (void)
+{}
+
+int
+main (void)
+{
+  pthread_t threads[NUM_THREADS];
+
+  pthread_barrier_init (&barrier, NULL, NUM_THREADS + 1);
+
+  for (int i = 0; i < NUM_THREADS; i++)
+    {
+      int res = pthread_create (&threads[i], NULL, thread_function, NULL);
+      assert (res == 0);
+    }
+
+  pthread_barrier_wait (&barrier);
+  all_threads_started ();
+
+  for (int i = 0; i < NUM_THREADS; i++)
+    {
+      int res = pthread_join (threads[i], NULL);
+      assert (res == 0);
+    }
+
+  return 0;
+}
+
diff --git a/gdb/testsuite/gdb.mi/interrupt-thread-group.exp b/gdb/testsuite/gdb.mi/interrupt-thread-group.exp
new file mode 100644
index 000000000000..b32fd1afc724
--- /dev/null
+++ b/gdb/testsuite/gdb.mi/interrupt-thread-group.exp
@@ -0,0 +1,135 @@
+# Copyright 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test "--exec-interrupt --thread-group".
+#
+# Run two inferiors, try interrupting them both with the command above.
+
+# This test uses non-stop, which requires displaced stepping.
+if { ![support_displaced_stepping] } {
+    unsupported "displaced stepping"
+    return -1
+}
+
+load_lib mi-support.exp
+set MIFLAGS "-i=mi"
+
+standard_testfile .c
+
+if {[gdb_compile_pthreads "$srcdir/$subdir/$srcfile" $binfile \
+	executable debug] != "" } {
+    return -1
+}
+
+save_vars { GDBFLAGS } {
+    append GDBFLAGS " -ex \"set non-stop on\" -ex \"set mi-async\""
+    mi_clean_restart $binfile
+}
+
+mi_detect_async
+
+# Create breakpoint by hand instead of using mi_runto, since we'll need it for
+# both inferiors.
+mi_create_breakpoint "all_threads_started" \
+    "set breakpoint on all_threads_started"
+
+# Run first inferior to all_threads_started (to ensure all threads are started)
+# and resume it.
+if { [mi_run_cmd] < 0 } {
+    return
+}
+
+mi_expect_stop "breakpoint-hit" "all_threads_started" ".*" ".*" ".*" {"" "disp=\"keep\""} \
+    "inferior i1 stops at all_threads_started"
+
+mi_send_resuming_command "exec-continue --thread-group i1" \
+    "continue inferior 1"
+
+# We can't run a second inferior on stub targets.  We can still test with one
+# inferior and ensure that the command has the desired effect.
+set use_second_inferior [expr {![use_gdb_stub]}]
+
+if { $use_second_inferior } {
+    # The inferior created by the -add-inferior MI command does not inherit the
+    # target connection of the first inferior.  If debugging through an
+    # extended-remote connection, that means we can't run that second inferior
+    # on the remote connection.  Use the add-inferior CLI command as a stop-gap.
+    if { [mi_is_target_remote] } {
+	mi_gdb_test "add-inferior" \
+	    "\\^done" \
+	    "add inferior 2"
+    } else {
+	mi_gdb_test "-add-inferior" \
+	    "\\^done,inferior=\"i2\"" \
+	    "add inferior 2"
+    }
+    mi_gdb_test "-file-exec-and-symbols --thread-group i2 $::binfile" \
+	"\\^done" \
+	"set executable of inferior 2"
+    # Run second inferior to all_threads_started (to ensure all threads are
+    # started) and resume it.
+    mi_gdb_test "-exec-run \
+	--thread-group i2" \
+	"\\^running.*" "run inferior 2"
+
+    mi_expect_stop "breakpoint-hit" "all_threads_started" ".*" ".*" ".*" {"" "disp=\"keep\""} \
+	"inferior i2 stops at all_threads_started"
+
+    mi_send_resuming_command "exec-continue --thread-group i2" \
+	"continue inferior 2"
+
+    mi_check_thread_states {
+	"running" "running" "running" "running" "running"
+	"running" "running" "running" "running" "running"
+    } "before interrupting"
+} else {
+    mi_check_thread_states {
+	"running" "running" "running" "running" "running"
+    } "before interrupting"
+}
+
+# Interrupt inferior 1, wait for events.
+mi_gdb_test "-exec-interrupt --thread-group i1" \
+    "\\^done" \
+    "interrupt inferior 1"
+
+for {set i 0} {$i < 5} {incr i} {
+    mi_expect_interrupt "inferior 1, interrupt $i"
+}
+
+if { $use_second_inferior } {
+    mi_check_thread_states {
+	"stopped" "stopped" "stopped" "stopped" "stopped"
+	"running" "running" "running" "running" "running"
+    } "after interrupting inferior 1"
+
+    # Interrupt inferior 2, wait for events.
+    mi_gdb_test "-exec-interrupt --thread-group i2" \
+	"\\^done" \
+	"interrupt inferior 2"
+
+    for {set i 0} {$i < 5} {incr i} {
+	mi_expect_interrupt "inferior 2, interrupt $i"
+    }
+
+    mi_check_thread_states {
+	"stopped" "stopped" "stopped" "stopped" "stopped"
+	"stopped" "stopped" "stopped" "stopped" "stopped"
+    } "after interrupting inferior 2"
+} else {
+    mi_check_thread_states {
+	"stopped" "stopped" "stopped" "stopped" "stopped"
+    } "after interrupting inferior 1"
+}
-- 
2.32.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH master + 11] gdb: disable commit-resumed on -exec-interrupt --thread-group
  2021-07-12  2:33     ` [PATCH master + 11] gdb: disable commit-resumed on -exec-interrupt --thread-group Simon Marchi
@ 2021-07-12  8:52       ` Pedro Alves
  2021-07-12 20:28         ` Simon Marchi
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2021-07-12  8:52 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches, Jonah Graham

On 2021-07-12 3:33 a.m., Simon Marchi via Gdb-patches wrote:

> gdb/testsuite/ChangeLog:
> 
> 	* interrupt-thread-group.c: New.
> 	* interrupt-thread-group.exp: New.

Missing gdb.mi/ :

 	* gdb.mi/interrupt-thread-group.c: New.
 	* gdb.mi/interrupt-thread-group.exp: New.

> +

> +# This test uses non-stop, which requires displaced stepping.
> +if { ![support_displaced_stepping] } {
> +    unsupported "displaced stepping"
> +    return -1
> +}

This is not true nowadays -- displaced stepping isn't a requirement for non-stop, it's
"merely" an optimization.  infrun will stop all threads to step over breakpoints inline
if displaced stepping is not supported.  Are you copying this from elsewhere?  Guess a bit
of spring cleaning is in order if so.

> +
> +mi_expect_stop "breakpoint-hit" "all_threads_started" ".*" ".*" ".*" {"" "disp=\"keep\""} \
> +    "inferior i1 stops at all_threads_started"
> +
> +mi_send_resuming_command "exec-continue --thread-group i1" \
> +    "continue inferior 1"
> +
> +# We can't run a second inferior on stub targets.  We can still test with one
> +# inferior and ensure that the command has the desired effect.
> +set use_second_inferior [expr {![use_gdb_stub]}]
> +
> +if { $use_second_inferior } {
> +    # The inferior created by the -add-inferior MI command does not inherit the
> +    # target connection of the first inferior.  If debugging through an
> +    # extended-remote connection, that means we can't run that second inferior
> +    # on the remote connection.  Use the add-inferior CLI command as a stop-gap.

Yeah, we should fix that at some point...

> +    if { [mi_is_target_remote] } {
> +	mi_gdb_test "add-inferior" \
> +	    "\\^done" \
> +	    "add inferior 2"
> +    } else {
> +	mi_gdb_test "-add-inferior" \
> +	    "\\^done,inferior=\"i2\"" \
> +	    "add inferior 2"
> +    }
> +    mi_gdb_test "-file-exec-and-symbols --thread-group i2 $::binfile" \
> +	"\\^done" \
> +	"set executable of inferior 2"
> +    # Run second inferior to all_threads_started (to ensure all threads are
> +    # started) and resume it.
> +    mi_gdb_test "-exec-run \
> +	--thread-group i2" \
> +	"\\^running.*" "run inferior 2"

Note this will fail on gdb builds that do not include a native target, which is
most "--target foo" builds.

Otherwise LGTM.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH master + 11] gdb: disable commit-resumed on -exec-interrupt --thread-group
  2021-07-12  8:52       ` Pedro Alves
@ 2021-07-12 20:28         ` Simon Marchi
  2021-07-13 12:24           ` Pedro Alves
  0 siblings, 1 reply; 16+ messages in thread
From: Simon Marchi @ 2021-07-12 20:28 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches, Jonah Graham



On 2021-07-12 4:52 a.m., Pedro Alves wrote:
> On 2021-07-12 3:33 a.m., Simon Marchi via Gdb-patches wrote:
> 
>> gdb/testsuite/ChangeLog:
>>
>> 	* interrupt-thread-group.c: New.
>> 	* interrupt-thread-group.exp: New.
> 
> Missing gdb.mi/ :
> 
>  	* gdb.mi/interrupt-thread-group.c: New.
>  	* gdb.mi/interrupt-thread-group.exp: New.

Fixed.

>> +# This test uses non-stop, which requires displaced stepping.
>> +if { ![support_displaced_stepping] } {
>> +    unsupported "displaced stepping"
>> +    return -1
>> +}
> 
> This is not true nowadays -- displaced stepping isn't a requirement for non-stop, it's
> "merely" an optimization.  infrun will stop all threads to step over breakpoints inline
> if displaced stepping is not supported.  Are you copying this from elsewhere?  Guess a bit
> of spring cleaning is in order if so.

Ok.  So we just put nothing, and if the target doesn't support non-stop,
GDB outputs "The target does not support running in non-stop mode." and
mi_run_cmd returns -1?

>> +    mi_gdb_test "-file-exec-and-symbols --thread-group i2 $::binfile" \
>> +	"\\^done" \
>> +	"set executable of inferior 2"
>> +    # Run second inferior to all_threads_started (to ensure all threads are
>> +    # started) and resume it.
>> +    mi_gdb_test "-exec-run \
>> +	--thread-group i2" \
>> +	"\\^running.*" "run inferior 2"
> 
> Note this will fail on gdb builds that do not include a native target, which is
> most "--target foo" builds.

I don't think so, because inferior 2 is using the remote target, which
can run.  Note that the formatting above was erroneous, the line is
broken at the wrong place, I've fixed it locally.

Simon

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH master + 11] gdb: disable commit-resumed on -exec-interrupt --thread-group
  2021-07-12 20:28         ` Simon Marchi
@ 2021-07-13 12:24           ` Pedro Alves
  2021-07-13 13:27             ` Simon Marchi
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2021-07-13 12:24 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches, Jonah Graham

On 2021-07-12 9:28 p.m., Simon Marchi wrote:
> On 2021-07-12 4:52 a.m., Pedro Alves wrote:

>>> +# This test uses non-stop, which requires displaced stepping.
>>> +if { ![support_displaced_stepping] } {
>>> +    unsupported "displaced stepping"
>>> +    return -1
>>> +}
>>
>> This is not true nowadays -- displaced stepping isn't a requirement for non-stop, it's
>> "merely" an optimization.  infrun will stop all threads to step over breakpoints inline
>> if displaced stepping is not supported.  Are you copying this from elsewhere?  Guess a bit
>> of spring cleaning is in order if so.
> 
> Ok.  So we just put nothing, and if the target doesn't support non-stop,
> GDB outputs "The target does not support running in non-stop mode." and
> mi_run_cmd returns -1?

That's how the CLI tests work, at least.

> 
>>> +    mi_gdb_test "-file-exec-and-symbols --thread-group i2 $::binfile" \
>>> +	"\\^done" \
>>> +	"set executable of inferior 2"
>>> +    # Run second inferior to all_threads_started (to ensure all threads are
>>> +    # started) and resume it.
>>> +    mi_gdb_test "-exec-run \
>>> +	--thread-group i2" \
>>> +	"\\^running.*" "run inferior 2"
>>
>> Note this will fail on gdb builds that do not include a native target, which is
>> most "--target foo" builds.
> 
> I don't think so, because inferior 2 is using the remote target, which
> can run.  Note that the formatting above was erroneous, the line is
> broken at the wrong place, I've fixed it locally.

OK.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH master + 11] gdb: disable commit-resumed on -exec-interrupt --thread-group
  2021-07-13 12:24           ` Pedro Alves
@ 2021-07-13 13:27             ` Simon Marchi
  0 siblings, 0 replies; 16+ messages in thread
From: Simon Marchi @ 2021-07-13 13:27 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches, Jonah Graham



On 2021-07-13 8:24 a.m., Pedro Alves wrote:
> On 2021-07-12 9:28 p.m., Simon Marchi wrote:
>> On 2021-07-12 4:52 a.m., Pedro Alves wrote:
> 
>>>> +# This test uses non-stop, which requires displaced stepping.
>>>> +if { ![support_displaced_stepping] } {
>>>> +    unsupported "displaced stepping"
>>>> +    return -1
>>>> +}
>>>
>>> This is not true nowadays -- displaced stepping isn't a requirement for non-stop, it's
>>> "merely" an optimization.  infrun will stop all threads to step over breakpoints inline
>>> if displaced stepping is not supported.  Are you copying this from elsewhere?  Guess a bit
>>> of spring cleaning is in order if so.
>>
>> Ok.  So we just put nothing, and if the target doesn't support non-stop,
>> GDB outputs "The target does not support running in non-stop mode." and
>> mi_run_cmd returns -1?
> 
> That's how the CLI tests work, at least.
> 
>>
>>>> +    mi_gdb_test "-file-exec-and-symbols --thread-group i2 $::binfile" \
>>>> +	"\\^done" \
>>>> +	"set executable of inferior 2"
>>>> +    # Run second inferior to all_threads_started (to ensure all threads are
>>>> +    # started) and resume it.
>>>> +    mi_gdb_test "-exec-run \
>>>> +	--thread-group i2" \
>>>> +	"\\^running.*" "run inferior 2"
>>>
>>> Note this will fail on gdb builds that do not include a native target, which is
>>> most "--target foo" builds.
>>
>> I don't think so, because inferior 2 is using the remote target, which
>> can run.  Note that the formatting above was erroneous, the line is
>> broken at the wrong place, I've fixed it locally.
> 
> OK.
> 

Thanks, pushed to both branches.

Simon

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-07-13 13:27 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-21  0:02 [PATCH v4 0/2] Reduce back and forth with target when there are pending statuses Pedro Alves
2021-03-21  0:02 ` [PATCH v4 1/2] gdb: generalize commit_resume, avoid commit-resuming when threads have " Pedro Alves
2021-03-25 15:16   ` Simon Marchi
2021-03-26 16:11     ` Pedro Alves
2021-03-26 16:17       ` Simon Marchi
2021-07-11 19:42   ` Jonah Graham
2021-07-12  0:24     ` Jonah Graham
2021-07-12  2:33     ` [PATCH master + 11] gdb: disable commit-resumed on -exec-interrupt --thread-group Simon Marchi
2021-07-12  8:52       ` Pedro Alves
2021-07-12 20:28         ` Simon Marchi
2021-07-13 12:24           ` Pedro Alves
2021-07-13 13:27             ` Simon Marchi
2021-03-21  0:02 ` [PATCH v4 2/2] gdb: defer commit resume until all available events are consumed Pedro Alves
2021-03-25 16:00   ` Simon Marchi
2021-03-26 16:16     ` Pedro Alves
2021-03-26 16:18       ` Simon Marchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).